PARTIAL CORRELATION SCREENING FOR ESTIMATING LARGE PRECISION MATRICES, WITH APPLICATIONS TO CLASSIFICATION

Citation
Jiashun Jin et al., PARTIAL CORRELATION SCREENING FOR ESTIMATING LARGE PRECISION MATRICES, WITH APPLICATIONS TO CLASSIFICATION, Annals of Statistics, 44(5), 2016, pp. 2018-2057
Journal title
Annals of Statistics
ISSN journal
00905364
Volume
44
Issue
5
Year of publication
2016
Pages
2018 - 2057
Database
ACNP
SICI code
Abstract
Given n samples X₁, X₂, ..., Xₙ from N(0, Σ), we are interested in estimating the p × p precision matrix Ω = Σ⁻¹; we assume Ω is sparse in that each row has relatively few nonzeros. We propose Partial Correlation Screening (PCS) as a new row-by-row approach. To estimate the ith row of Ω, 1 ≤ i ≤ p, PCS uses a Screen step and a Clean step. In the Screen step, PCS recruits a (small) subset of indices using a stage-wise algorithm: in each stage, the algorithm updates the set of recruited indices by adding the index j that has the largest empirical partial correlation (in magnitude) with i, given the set of indices recruited so far. In the Clean step, PCS reinvestigates all recruited indices, removes false positives, and uses the resulting set of indices to reconstruct the ith row. PCS is computationally efficient and modest in memory use: to estimate a row of Ω, it needs only a few rows (determined sequentially) of the empirical covariance matrix. PCS can estimate a large Ω (e.g., p = 10K) in a few minutes. Higher Criticism Thresholding (HCT) is a recent classifier that enjoys optimality, but to exploit its full potential, we need a good estimate of Ω. Note that given an estimate of Ω, we can always combine it with HCT to build a classifier (e.g., HCT-PCS, HCT-glasso). We have applied HCT-PCS to two microarray data sets (p = 8K and 10K) for classification, where it not only significantly outperforms HCT-glasso, but is also competitive with the Support Vector Machine (SVM) and Random Forest (RF). These results suggest that PCS gives more useful estimates of Ω than the glasso; we study this carefully and gain some interesting insight. We show that in a broad context, PCS fully recovers the support of Ω and HCT-PCS is optimal in classification. Our theoretical study sheds interesting light on the behavior of stage-wise procedures.
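
The Screen and Clean steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state a stopping rule or a cleaning rule, so the thresholds `screen_thresh` and `clean_thresh`, the cap `max_size`, and the function names `partial_corr` and `pcs_row` are all assumptions made for illustration. For simplicity the sketch takes the full empirical covariance matrix `S_hat` as input and inverts a small submatrix for every candidate, whereas the abstract notes that PCS needs only a few rows of that matrix.

```python
import numpy as np

def partial_corr(S_hat, i, j, cond):
    """Empirical partial correlation of i and j given the indices in cond."""
    idx = [i, j] + list(cond)
    # Invert the covariance restricted to {i, j} and the conditioning set.
    P = np.linalg.inv(S_hat[np.ix_(idx, idx)])
    return -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])

def pcs_row(S_hat, i, screen_thresh=0.1, clean_thresh=0.1, max_size=10):
    """Sketch of a row-by-row estimate of the precision matrix (row i).

    screen_thresh, clean_thresh and max_size are hypothetical tuning
    parameters; the abstract does not specify them.
    """
    p = S_hat.shape[0]
    recruited = []

    # Screen step: stage-wise recruitment of the index with the largest
    # partial correlation (in magnitude) with i, given those recruited so far.
    while len(recruited) < max_size:
        candidates = [j for j in range(p) if j != i and j not in recruited]
        if not candidates:
            break
        rhos = [abs(partial_corr(S_hat, i, j, recruited)) for j in candidates]
        best = int(np.argmax(rhos))
        if rhos[best] < screen_thresh:  # assumed stopping rule
            break
        recruited.append(candidates[best])

    # Clean step: re-examine all recruited indices jointly and drop those
    # whose partial correlation with i is small (assumed cleaning rule).
    idx = [i] + recruited
    P = np.linalg.inv(S_hat[np.ix_(idx, idx)])
    kept = [j for k, j in enumerate(recruited, start=1)
            if abs(P[0, k]) / np.sqrt(P[0, 0] * P[k, k]) >= clean_thresh]

    # Reconstruct row i from the inverse of the small submatrix on {i} + kept.
    idx = [i] + kept
    P = np.linalg.inv(S_hat[np.ix_(idx, idx)])
    row = np.zeros(p)
    row[idx] = P[0]
    return row

# Usage: a tridiagonal precision matrix and its exact covariance matrix,
# standing in for an empirical covariance computed from data.
Omega = np.eye(6) + 0.4 * (np.eye(6, k=1) + np.eye(6, k=-1))
Sigma = np.linalg.inv(Omega)
row_2 = pcs_row(Sigma, i=2)
```

The reconstruction in the last step relies on a standard fact about Gaussian models: if the recruited set contains every index where row i of Ω is nonzero, then the first row of the inverse of the covariance submatrix on {i} plus that set equals the corresponding entries of row i of Ω.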