LEARNING SPARSE MIXTURE MODELS FOR DISCRIMINATIVE CLASSIFICATION
Abstract
Saul and Lee recently proposed a mixture model for discriminative classification of non-negative data, using non-negative matrix factorization for feature extraction. To improve generalization, this paper considers a sparse version of the model. The basic idea is to minimize the sum of the weights of the un-normalized mixture models for the posterior distributions, following the regularization method. Experiments on the CBCL face database and the USPS digit data set assess the validity of the proposed approach.
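As a rough illustration of the idea stated above, the sketch below evaluates a negative log-likelihood with an added penalty on the sum of non-negative mixture weights. It is a minimal sketch under stated assumptions, not the paper's method: the linear form of the un-normalized scores, the function names `posteriors` and `penalized_loss`, and the penalty coefficient `lam` are all hypothetical, and the sketch only evaluates the objective without implementing any optimization procedure, which the abstract does not describe.

```python
import math


def posteriors(weights, features):
    """Normalize un-normalized mixture scores into class posteriors.

    Assumption: weights[c][j] >= 0 is a hypothetical weight linking
    feature j to class c, and features[j] >= 0 is one non-negative
    feature of a single input.
    """
    # Un-normalized score of each class: a non-negative weighted sum.
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    total = sum(scores)
    return [s / total for s in scores]


def penalized_loss(weights, data, lam):
    """Negative log-likelihood plus lam times the sum of all weights.

    data is a list of (features, label) pairs; lam is an assumed
    regularization coefficient that rewards small (sparse) weights.
    """
    nll = -sum(math.log(posteriors(weights, x)[y]) for x, y in data)
    penalty = lam * sum(sum(row) for row in weights)
    return nll + penalty


# Toy usage: two classes, two features, one labeled example.
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
toy_data = [([3.0, 1.0], 0)]
toy_loss = penalized_loss(toy_weights, toy_data, lam=0.5)
```

Because every weight is non-negative, the sum of the weights equals their L1 norm, which is the usual reason such a penalty encourages sparse solutions.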
References
- CBCL Face Database #1, MIT Center for Biological and Computational Learning, http://www.ai.mit.edu/projects/cbcl
- Neural Comput. 14, 2791 (2002).
- R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, 2nd edn. (John Wiley & Sons, 2001).
- K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd edn. (Morgan Kaufmann, 1990).
- P. O. Hoyer, Non-negative sparse coding, Neural Networks for Signal Processing XII, Proc. IEEE Workshop on Neural Networks for Signal Processing (2002) pp. 557–565.
- T. S. Jaakkola and D. Haussler, Advances in Neural Information Processing Systems (1999) pp. 487–493.
- A. Klautau, N. Jevtic and A. Orlitsky, Discriminative Gaussian mixture models: a comparison with Kernel classifiers, Proc. Twentieth Int. Conf. Machine Learning (ICML-2003) (2003).
- D. Keysers, F. J. Och and H. Ney, Maximum entropy and Gaussian models for image object recognition, DAGM 2002, Pattern Recognition, 24th DAGM Symp., Lecture Notes in Computer Science 2449 (Springer-Verlag, 2002) pp. 498–506.
- D. D. Lee and L. K. Saul, Advances in Neural Information Processing Systems (2001) pp. 556–562.
- Nature 401, 788 (1999).
- S. Z. Li, X. W. Hou and H. J. Zhang, Learning spatially localized parts-based representation, Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition (2001).
- W. Liu, N. Zheng and X. Lu, Non-negative matrix factorization for visual coding, Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (2003) pp. III_293–III_296.
- D. J. C. Mackay, Bayesian methods for adaptive models, Ph.D. thesis, California Institute of Technology, USA (1991).
- Synthese 117, 75 (1999).
- Mach. Learn. 42(3), 287 (2001).
- S. Raudys, Statistical and Neural Classifiers: An Integrated Approach to Design (Springer, London, 2001).
- Y. D. Rubinstein and T. Hastie, Discriminative vs informative learning, Proc. Third Int. Conf. Knowledge Discovery and Data Mining (1997) pp. 49–53.
- W. S. Sarle, Neural Network FAQ, Part 3 of 7: Generalization, Periodic Posting to the Usenet Newsgroup Comp.ai.neural-nets, ftp://ftp.sas.com/pub/neural/FAQ.html (2001).
- L. K. Saul and D. D. Lee, Advances in Neural Information Processing Systems (2002).
- Speech Commun. 34, 287 (2001).
- USPS data set, available on http://www.kernel-machines.org/data.html
- V. N. Vapnik, The Nature of Statistical Learning Theory, 2nd edn. (Springer, NY, 2000).
- Neural Comput. 1, 425 (1994).
- IEEE Trans. Patt. Anal. Mach. Intell. 24, 34 (2002).
- T. Zhang, Advances in Neural Information Processing Systems (2001) pp. 703–709.


