College of Arts & Sciences
Statistical Learning & Bioinformatics
The Liu group develops statistical methodology for general classification problems. With large-scale genomic data such as those from DNA microarray experiments, one often faces the challenge of extracting meaningful information from a small number of samples, each measured on a large number of genes. Traditional statistical methods, including Fisher discriminant analysis and nearest-neighbor classification, are often inadequate in such cases. Moreover, the number of classes can be large, so binary machine-learning methods are not directly effective either. To address these problems, Liu and colleagues have extended two machine-learning techniques, psi-learning and support vector machines, to multicategory applications. Compared with support vector machines, psi-learning is more robust to outliers and can yield more accurate classification. To implement these techniques, the group has also developed computational tools based on difference convex (d.c.) programming to handle the non-convex minimization involved in psi-learning. These nonparametric methods have a wide range of applications, including medical imaging, tumor classification, and cancer diagnosis and prognosis.
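The robustness claim and the role of d.c. programming can be illustrated with a small sketch. It assumes a truncated-hinge form of the psi loss and writes it as a difference of two convex functions; the exact loss and the function names (`hinge_loss`, `psi1`, `psi2`, `psi_loss`) are assumptions made for illustration and are not taken from the group's software.

```python
# Illustrative sketch only: the loss form below is an assumed
# truncated-hinge psi loss, not necessarily the group's exact definition.

def hinge_loss(u):
    """SVM hinge loss on the functional margin u = y * f(x); unbounded as u -> -inf."""
    return max(0.0, 1.0 - u)

def psi1(u):
    """Convex piece: a scaled hinge."""
    return 2.0 * max(0.0, 1.0 - u)

def psi2(u):
    """Convex piece that is subtracted to cap the loss for u < 0."""
    return 2.0 * max(0.0, -u)

def psi_loss(u):
    """Assumed psi loss as a difference of convex functions: psi1 - psi2.
    It equals 0 for u >= 1, 2 * (1 - u) for 0 <= u < 1, and 2 for u < 0."""
    return psi1(u) - psi2(u)

if __name__ == "__main__":
    for u in (-10.0, -1.0, 0.5, 2.0):
        print(u, hinge_loss(u), psi_loss(u))
```

Because the assumed psi loss is bounded, a badly misclassified point (a large negative margin) contributes no more than any other misclassified point, whereas its hinge loss grows without bound. The price is non-convexity, and the decomposition into `psi1` minus `psi2` is the kind of structure that d.c. programming exploits.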
Another research interest is gene selection. Using cancer diagnosis as an example, the Liu group uses statistical methods to identify, among thousands of candidate genes, subsets that are associated with specific cancers. This work will help guide future laboratory experiments in medical and pharmaceutical research toward more precise diagnoses. The overall goal is a classification model that classifies cancers accurately using a relatively small number of genes. Various statistical methods for accurate estimation, classification, and model selection are being investigated.
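As a rough illustration of what gene selection involves, the sketch below ranks genes by the absolute value of a two-sample Welch t-statistic and keeps the top k. This simple filter is a stand-in chosen for brevity, not the group's method; the function names and the toy expression values are hypothetical.

```python
# Illustrative sketch only: a simple t-statistic filter, swapped in for
# illustration. Gene names and expression values are made up.
import math
from statistics import mean, variance

def welch_t(a, b):
    """Two-sample Welch t-statistic; returns 0.0 if the standard error is zero."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0

def top_genes(expr, labels, k):
    """Return the k genes whose expression differs most between two classes.

    expr:   dict mapping gene name -> list of expression values, one per sample
    labels: list of 0/1 class labels, one per sample
    """
    scores = {}
    for gene, values in expr.items():
        g0 = [v for v, y in zip(values, labels) if y == 0]
        g1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(welch_t(g0, g1))
    return sorted(scores, key=scores.get, reverse=True)[:k]

if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    expr = {
        "geneA": [1.0, 1.2, 0.9, 3.1, 2.9, 3.0],  # clearly separates the classes
        "geneB": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # no class difference
        "geneC": [0.5, 1.5, 1.0, 1.2, 0.4, 1.4],  # no class difference
    }
    print(top_genes(expr, labels, 1))
```

A filter like this scores each gene on its own. Methods that select genes jointly with the classifier can account for dependence between genes, which this sketch ignores.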
Li Y, Liu Y, and Zhu J (2007). Quantile regression in reproducing kernel Hilbert spaces. Journal of the American Statistical Association 102(477):255-268.
Liu Y, Ruan S, and Dean AM (2007). Construction and analysis of E(s²) efficient supersaturated designs. Journal of Statistical Planning and Inference 137(5):1516-1529.
Liu Y and Wu Y (2006). Optimizing psi-learning via mixed integer programming. Statistica Sinica 16(2):441-457.
Liu Y and Shen X (2006). Multicategory psi-learning. Journal of the American Statistical Association 101(474):500-509.
Liu Y, Shen X, and Doss H (2005). Multicategory psi-learning and support vector machine: computational tools. Journal of Computational and Graphical Statistics 14(1):219-236.