Determining and Depicting Relationships Among Components in High-Dimensional Variable Selection

Citation
Hall, Peter and Miller, Hugh, Determining and Depicting Relationships Among Components in High-Dimensional Variable Selection, Journal of Computational and Graphical Statistics, 20(4), 2011, pp. 988-1006
Journal ISSN
1061-8600
Volume
20
Issue
4
Year of publication
2011
Pages
988 - 1006
Database
ACNP
Abstract
Many modern treatments of high-dimensional datasets involve reducing the initial collection of features to a much smaller set, from which a predictive model may be built. However, strong relationships between the remaining variables can limit the parsimony or even the predictive performance of such a model. We propose a semi-automatic approach that uses generalized correlation to detect and quantify these relationships, and we explore ways to represent this information graphically. The method can detect both symmetric and asymmetric relationships, as well as nonlinear patterns. Its utility is demonstrated on a range of real and simulated datasets. Supplemental material for performing the real-data analyses in this article is available online.