Nuit Blanche: Confidence Intervals and Hypothesis Testing for High-Dimensional Regression

Monday, April 14, 2014

Confidence Intervals and Hypothesis Testing for High-Dimensional Regression - implementation -

As we were discussing the release of an implementation of the Sparse PCA algorithm, Andrea mentioned to me the release of an R program for hypothesis testing with the Lasso:

Confidence Intervals and Hypothesis Testing for High-Dimensional Regression by Adel Javanmard, Andrea Montanari

Fitting high-dimensional statistical models often requires the use of non-linear parameter estimation procedures. As a consequence, it is generally impossible to obtain an exact characterization of the probability distribution of the parameter estimates. This in turn implies that it is extremely challenging to quantify the uncertainty associated with a certain parameter estimate. Concretely, no commonly accepted procedure exists for computing classical measures of uncertainty and statistical significance as confidence intervals or p-values for these models. We consider here high-dimensional linear regression problem, and propose an efficient algorithm for constructing confidence intervals and p-values. The resulting confidence intervals have nearly optimal size. When testing for the null hypothesis that a certain parameter is vanishing, our method has nearly optimal power. Our approach is based on constructing a ‘de-biased’ version of regularized M-estimators. The new construction improves over recent work in the field in that it does not assume a special structure on the design matrix. We test our method on synthetic data and a high-throughput genomic data set about riboflavin production rate, made publicly available by [BKM14].
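To make the 'de-biased' idea in the abstract a little more concrete, here is a minimal Python sketch of my own. It is not the authors' R code. It fits a Lasso, adds a correction term of the form M X^T (y - X theta) / n, and derives normal-approximation confidence intervals and two-sided p-values from it. Several pieces are assumptions made for illustration: the function names are hypothetical, the matrix M is a plain ridge-regularized inverse of the sample covariance (the paper obtains M from a convex program instead), the Lasso is solved by simple proximal gradient steps, and the noise level comes from a crude residual-based estimate.

```python
import math
import numpy as np


def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=500):
    # Minimize (1/2n) ||y - X theta||^2 + lam * ||theta||_1 by proximal gradient.
    n, p = X.shape
    # Step size 1/L, where L is the largest eigenvalue of X^T X / n.
    step = n / (np.linalg.norm(X, 2) ** 2)
    theta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - y) / n
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta


def debiased_lasso(X, y, lam, ridge=0.1, z=1.959964):
    # Returns (theta_u, lower, upper, pvalues); z defaults to the 97.5% normal quantile.
    n, p = X.shape
    theta = lasso_ista(X, y, lam)
    sigma_hat = X.T @ X / n  # sample covariance of the design

    # ASSUMPTION: ridge-regularized inverse as a stand-in for the paper's
    # optimization-based choice of M.
    M = np.linalg.inv(sigma_hat + ridge * np.eye(p))

    resid = y - X @ theta
    theta_u = theta + M @ X.T @ resid / n  # de-biasing correction

    # ASSUMPTION: crude noise estimate from the Lasso residuals.
    support = np.count_nonzero(theta)
    noise = np.linalg.norm(resid) / math.sqrt(max(n - support, 1))

    # Standard error of coordinate i: noise * sqrt([M Sigma_hat M^T]_ii / n)
    se = noise * np.sqrt(np.diag(M @ sigma_hat @ M.T) / n)

    # Two-sided p-value for H0: theta_i = 0 under a normal approximation.
    pvalues = np.array(
        [math.erfc(abs(t) / (s * math.sqrt(2.0))) for t, s in zip(theta_u, se)]
    )
    return theta_u, theta_u - z * se, theta_u + z * se, pvalues


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 100, 200
    theta0 = np.zeros(p)
    theta0[:5] = 2.0  # five active coefficients
    X = rng.standard_normal((n, p))
    y = X @ theta0 + rng.standard_normal(n)
    theta_u, lower, upper, pvalues = debiased_lasso(X, y, lam=0.2)
    print("smallest p-values at indices:", np.argsort(pvalues)[:5])
```

The ridge shortcut keeps the sketch short, but it does not carry the guarantees stated in the abstract. For actual use, the R implementation mentioned in the post is the thing to reach for.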

A web page, maintained by Hamid Javadi, Adel Javanmard, Andrea Montanari and Sven Schmit, featuring other papers and an implementation in R is here.
