
UM E-Theses Collection (澳門大學電子學位論文庫)

Title

Scalable and memory-efficient sparse Bayesian learning for classification problems

English Abstract

Sparse Bayesian learning (SBL) shows competitive accuracy, sparsity, and probabilistic prediction for classification. In SBL, the regularization priors are determined automatically, avoiding exhaustive hyperparameter selection by cross-validation. However, most work on SBL in the literature focuses on small problems and is not widely applicable in industry, owing to two drawbacks. The first and most critical is poor scalability to large problems, caused by the enormous computational cost of inverting a potentially huge covariance matrix when updating the regularization parameters in the SBL framework. The second is that SBL was originally proposed for regression and binary classification; for multiclass applications, extra ensemble methods are required to combine the binary classifiers, yielding a larger model with high training cost and without probabilistic outputs. This thesis addresses these two drawbacks. For the first issue, we develop new approximation rules for updating the regularization parameters: i) an approximate SBL algorithm, in which the regularization priors are approximated without relying on the covariance matrix; and ii) two memory-efficient quasi-Newton methods that reduce the computational complexity of optimizing the maximum a posteriori (MAP) objective in SBL. Our approaches reduce the complexity of SBL from O(M^3) to O(M), where M is the feature dimension, so that it scales easily to problems with large data size or feature dimension and can be parallelized for big data. For the second issue, we develop a multinomial SBL for multiclass problems that reduces model size and execution time relative to ensemble methods while providing probabilistic predictions. We evaluate the proposed methods on both linear and non-linear SBL models over a variety of problems, including large-scale datasets. Experiments show that the proposed models achieve competitive accuracy and sparsity compared with existing methods while i) requiring thousands of times less memory, ii) scaling easily to large problems, and iii) being sparser and more accurate on multiclass problems with probabilistic predictions.
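For context, the O(M^3) bottleneck described above can be seen in the classical SBL re-estimation rule (Tipping, 2001), where updating each regularization parameter requires the posterior covariance of the weights and hence an M x M matrix inversion. The sketch below illustrates that standard update for the regression case, not the thesis's approximation algorithms; all names (sbl_alpha_update, Phi, alpha, beta) are illustrative and assumed, not taken from the thesis.

```python
import numpy as np

def sbl_alpha_update(Phi, t, alpha, beta):
    """One classical SBL re-estimation step for the regularization priors.

    Phi   : (N, M) design matrix
    t     : (N,) targets
    alpha : (M,) current precision (regularization) parameters
    beta  : scalar noise precision
    """
    # Posterior covariance of the weights: inverting an M x M matrix
    # costs O(M^3) time and O(M^2) memory -- the scalability bottleneck.
    Sigma = np.linalg.inv(np.diag(alpha) + beta * Phi.T @ Phi)
    mu = beta * Sigma @ Phi.T @ t            # posterior mean (MAP weights)
    gamma = 1.0 - alpha * np.diag(Sigma)     # "well-determinedness" factors
    alpha_new = gamma / (mu ** 2 + 1e-12)    # re-estimated priors
    return alpha_new, mu, Sigma

if __name__ == "__main__":
    # Tiny synthetic example: a sparse linear model with 50 features.
    rng = np.random.default_rng(0)
    N, M = 200, 50
    Phi = rng.standard_normal((N, M))
    w_true = rng.standard_normal(M) * (rng.random(M) < 0.1)
    t = Phi @ w_true + 0.1 * rng.standard_normal(N)
    alpha = np.ones(M)
    for _ in range(10):
        alpha, mu, _ = sbl_alpha_update(Phi, t, alpha, beta=100.0)
```

Because every iteration of this update repeats the full inversion, the cost grows cubically with the feature dimension M; the covariance-free approximation rules and memory-efficient quasi-Newton methods proposed in the thesis are aimed at removing exactly this step.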

Issue date

2021

Author

Lou, Jia Hua

Faculty

Faculty of Science and Technology

Department

Department of Computer and Information Science

Degree

Ph.D.

Subject

Bayesian statistical decision theory

Supervisor

Vong, Chi Man

Files In This Item

Full-text (Intranet only)

Location
1/F Zone C
Library URL
991010069818006306