A method for constructing a confidence bound for the actual error rate of a prediction rule in high dimensions

Citation
Dobbin, Kevin K., A method for constructing a confidence bound for the actual error rate of a prediction rule in high dimensions, Biostatistics (Oxford. Print), 10(2), 2009, pp. 282-296
ISSN journal
14654644
Volume
10
Issue
2
Year of publication
2009
Pages
282 - 296
Database
ACNP
Abstract
Constructing a confidence interval for the actual, conditional error rate of a prediction rule from multivariate data is problematic because this error rate is not a population parameter in the traditional sense; it is a functional of the training set. When the training set changes, so does this 'parameter'. A valid method for constructing confidence intervals for the actual error rate was previously developed by McLachlan. However, McLachlan's method cannot be applied in many cancer research settings because it requires the number of samples to be much larger than the number of dimensions (n >> p), and it assumes that no dimension-reducing feature selection step is performed. Here, an alternative to McLachlan's method is presented that can be applied when p >> n, with an additional adjustment in the presence of feature selection. Coverage probabilities of the new method are shown to be nominal or conservative over a wide range of scenarios. The new method is relatively simple to implement and not computationally burdensome.
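The point that the actual error rate is a functional of the training set can be illustrated with a small simulation. The sketch below is not the method of the paper and is not McLachlan's method. It assumes a toy setting chosen only for illustration: two Gaussian classes with identity covariance, equal priors, and a nearest-centroid linear rule. The function names and the values of p, n and delta are hypothetical. Under these assumptions the conditional error of each fitted rule has a closed form, so the same population yields a different "true" error rate for every training set drawn.

```python
import math
import random
from statistics import NormalDist


def fit_centroid_rule(n_per_class, mu0, mu1, rng):
    """Draw one training set and fit a nearest-centroid linear rule.

    Assumed toy model: class k ~ N(mu_k, I). The rule predicts class 1
    when w.x + b > 0, with w the difference of the sample centroids.
    """
    def sample(mu):
        return [[rng.gauss(m, 1.0) for m in mu] for _ in range(n_per_class)]

    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    m0, m1 = centroid(sample(mu0)), centroid(sample(mu1))
    w = [c1 - c0 for c0, c1 in zip(m0, m1)]
    b = -sum(wi * (c0 + c1) / 2 for wi, c0, c1 in zip(w, m0, m1))
    return w, b


def actual_error(w, b, mu0, mu1):
    """Exact error of the fitted rule, conditional on the training set.

    Under class k the score w.x + b is N(w.mu_k + b, ||w||^2), so each
    misclassification probability is a normal tail area.
    """
    phi = NormalDist().cdf
    norm = math.sqrt(sum(wi * wi for wi in w))
    s0 = (sum(wi * m for wi, m in zip(w, mu0)) + b) / norm
    s1 = (sum(wi * m for wi, m in zip(w, mu1)) + b) / norm
    # Equal priors: average the two class-conditional error probabilities.
    return 0.5 * (phi(s0) + phi(-s1))


# Illustrative settings with p >> n (all values are assumptions).
p, n_per_class, delta = 50, 10, 2.0
mu0 = [0.0] * p
mu1 = [delta] + [0.0] * (p - 1)

rng = random.Random(0)
errors = []
for _ in range(5):
    w, b = fit_centroid_rule(n_per_class, mu0, mu1, rng)
    errors.append(actual_error(w, b, mu0, mu1))

# Error of the optimal rule for this toy model, for comparison.
bayes_error = NormalDist().cdf(-delta / 2)

for i, e in enumerate(errors, start=1):
    print(f"training set {i}: actual error = {e:.3f}")
print(f"Bayes error for this model: {bayes_error:.3f}")
```

Each printed value is the exact error of one fitted rule, so the spread across training sets shows why a confidence bound must target a quantity that itself changes from one training set to the next.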