Monday, July 11, 2016

Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula / The Mutual Information in Random Linear Estimation

Jean mentioned to me his two recent preprints, the second of which features an example of interest to compressive sensing, CDMA, error correction via sparse superposition codes [6], Boolean group testing, and more. From the first preprint:

"...From this point of view, the theorem proved in this paper is relevant in a broader context going beyond low-rank matrix estimation. Hundreds of papers have been published in statistics, machine learning or information theory using the non-rigorous statistical physics approach. We believe that our result helps setting a rigorous foundation of a broad line of work. While we focus on rank-one symmetric matrix estimation, our proof technique is readily extendable to more generic low-rank symmetric matrix or low-rank symmetric tensor estimation. We also believe that it can be extended to other problems of interest in machine learning and signal processing, such as generalized linear regression, features/dictionary learning, compressed sensing or multi-layer neural networks...."


Factorizing low-rank matrices has many applications in machine learning and statistics. For probabilistic models in the Bayes-optimal setting, a general expression for the mutual information has been proposed using heuristic statistical physics computations, and proven in a few specific cases. Here, we show how to rigorously prove the conjectured formula for the symmetric rank-one case. This allows us to express the minimal mean-square error and to characterize the detectability phase transitions in a large set of estimation problems ranging from community detection to sparse PCA. We also show that for a large set of parameters, an iterative algorithm called approximate message-passing is Bayes optimal. There exists, however, a gap between what currently known polynomial algorithms can do and what is expected information-theoretically. Additionally, the proof technique has an interest of its own and exploits three essential ingredients: the interpolation method introduced in statistical physics by Guerra, the analysis of the approximate message-passing algorithm, and the theory of spatial coupling and threshold saturation in coding. Our approach is generic and applicable to other open problems in statistical estimation where heuristic statistical physics predictions are available.
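To make the message-passing ingredient concrete, here is a minimal Python sketch of AMP on a toy symmetric rank-one (spiked Wigner) instance with a ±1 Rademacher prior. This is not the authors' code: the size n, the signal-to-noise ratio lam, the weakly informative initialization, and the tanh denoiser (the posterior mean for this prior under the usual state-evolution parameterization) are all illustrative choices.

import numpy as np

rng = np.random.default_rng(0)

# Toy symmetric rank-one instance (spiked Wigner model):
#   A = (lam / n) * x x^T + W / sqrt(n),  with x_i uniform in {-1, +1}.
n, lam = 2000, 1.5                       # illustrative size and SNR (assumed values)
x = rng.choice([-1.0, 1.0], size=n)      # planted Rademacher spike
G = rng.normal(size=(n, n))
W = (G + G.T) / np.sqrt(2.0)             # symmetric Gaussian noise matrix
A = (lam / n) * np.outer(x, x) + W / np.sqrt(n)

# AMP with the posterior-mean denoiser for a Rademacher prior, which
# reduces to tanh(lam * v) under this parameterization.
v = 0.1 * x + rng.normal(size=n)         # weakly informative start (breaks the +/- x symmetry)
m_prev = np.zeros(n)
for t in range(25):
    m = np.tanh(lam * v)                          # posterior-mean estimate of x
    b = (lam / n) * np.sum(1.0 - m**2)            # Onsager correction term
    v, m_prev = A @ m - b * m_prev, m             # AMP update with memory term
    print(f"iter {t:2d}  overlap |<m, x>|/n = {abs(m_prev @ x) / n:.3f}")

Above the transition (here lam > 1 for this prior), the printed overlap should climb toward the nontrivial fixed point of the state-evolution recursion; below it, it should stay near zero. This is exactly the kind of detectability phase transition that the single-letter formula characterizes.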

We consider the estimation of a signal from the knowledge of its noisy linear random Gaussian projections, a problem relevant to compressed sensing, sparse superposition codes, or code-division multiple access, to cite just a few. There have been a number of works considering the mutual information for this problem using the heuristic replica method from statistical physics. Here we put these considerations on a firm rigorous basis. First, we show, using a Guerra-type interpolation, that the replica formula yields an upper bound on the exact mutual information. Second, for many relevant practical cases, we present a converse lower bound via a method that uses spatial coupling, state evolution analysis, and the I-MMSE theorem. This yields, in particular, a single-letter formula for the mutual information and the minimum mean-square error for random Gaussian linear estimation of all discrete bounded signals.
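As a pointer for the last ingredient: the I-MMSE theorem of Guo, Shamai and Verdú relates mutual information and estimation error in Gaussian channels. In its scalar form, for a signal X observed through Y = sqrt(snr) X + Z with Z ~ N(0,1) independent of X,

\frac{\mathrm{d}}{\mathrm{d}\,\mathrm{snr}}\, I\big(X;\, \sqrt{\mathrm{snr}}\,X + Z\big) = \frac{1}{2}\,\mathrm{mmse}(\mathrm{snr}),
\qquad
\mathrm{mmse}(\mathrm{snr}) = \mathbb{E}\Big[\big(X - \mathbb{E}[X \mid \sqrt{\mathrm{snr}}\,X + Z]\big)^{2}\Big].

Roughly speaking, integrating this identity over the SNR is what converts the MMSE control obtained from state evolution on the spatially coupled system into the lower bound on the mutual information.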




Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there!
Liked this entry? Subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on LinkedIn.
