APPLICATIONS OF SUPPORT VECTOR MACHINES TO CANCER CLASSIFICATION WITH MICROARRAY DATA
Abstract
Microarray gene expression data typically have very high dimensionality, e.g., more than ten thousand genes, but only a small number of samples, e.g., a few dozen patients. In this paper, we apply the support vector machine (SVM) to cancer classification with microarray data. Gene selection is performed with dimensionality reduction methods such as principal components analysis (PCA), a class-separability measure, the Fisher ratio, and the t-test. A voting scheme is then employed to perform multi-group classification with k(k - 1) binary SVMs. We obtain the same classification accuracy as other published results while using far fewer features.
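The two steps named in the abstract, t-test gene selection followed by pairwise voting, can be illustrated with a minimal sketch. It is not the paper's implementation. To keep it dependency-free, each binary SVM is replaced by a nearest-centroid stand-in, and the sketch trains one classifier per unordered pair of classes, which is an assumption and may differ from the pairing convention behind the abstract's count. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import mean, variance


def t_statistic(a, b):
    """Welch's two-sample t-statistic for two lists of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes(samples, labels, class_a, class_b):
    """Return gene indices sorted by decreasing |t| between two classes."""
    scores = []
    for g in range(len(samples[0])):
        a = [s[g] for s, y in zip(samples, labels) if y == class_a]
        b = [s[g] for s, y in zip(samples, labels) if y == class_b]
        scores.append(abs(t_statistic(a, b)))
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)


def train_binary(samples, labels, class_a, class_b, genes):
    """Stand-in binary classifier (NOT an SVM): nearest class centroid,
    computed only on the selected genes."""
    def centroid(c):
        rows = [s for s, y in zip(samples, labels) if y == c]
        return [mean(r[g] for r in rows) for g in genes]

    ca, cb = centroid(class_a), centroid(class_b)

    def predict(x):
        da = sum((x[g] - m) ** 2 for g, m in zip(genes, ca))
        db = sum((x[g] - m) ** 2 for g, m in zip(genes, cb))
        return class_a if da <= db else class_b

    return predict


def train_pairwise(samples, labels, n_genes=2):
    """Train one binary classifier per unordered pair of classes,
    each on its own top-ranked genes."""
    classifiers = []
    for a, b in combinations(sorted(set(labels)), 2):
        genes = rank_genes(samples, labels, a, b)[:n_genes]
        classifiers.append(train_binary(samples, labels, a, b, genes))
    return classifiers


def predict_by_vote(classifiers, x):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy data (made up): 3 classes, 4 genes, 3 samples per class.
samples = [
    [5.0, 1.0, 1.0, 0.5], [5.2, 1.1, 0.9, 0.4], [4.8, 0.9, 1.1, 0.6],  # class A
    [1.0, 5.0, 1.0, 0.5], [1.1, 5.1, 0.9, 0.6], [0.9, 4.9, 1.1, 0.4],  # class B
    [1.0, 1.0, 5.0, 0.5], [0.9, 1.1, 5.2, 0.4], [1.1, 0.9, 4.8, 0.6],  # class C
]
labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3

classifiers = train_pairwise(samples, labels, n_genes=2)
print(predict_by_vote(classifiers, [4.9, 1.0, 1.0, 0.5]))
```

Selecting genes separately for each class pair is one possible design; ranking genes once across all classes, for example with a Fisher ratio, would be an equally plausible reading of the abstract.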