Monday, January 15, 2018

The stochastic interpolation method. The layered Structure of Tensor Estimation and Phase Transitions, Optimal Errors and Optimality of Message-Passing in GLMs

Jean just sent me the following:
Dear Igor,

We have of bunch of recent rigorous results that might be of interest for the community. In order to obtain them, I developed together with Nicolas Macris a new adaptive interpolation method well designed for treating high-dimensironl Bayesian inference problems.

_In this article we present our method with application to random linear estimation/compressive sensing as well as to symmetric low-rank matrix factorization and tensor factorization.

_In this one, presented at allerton this year, we present a nice application of the method to the non-symmetric tensor factorization problem that was resisting until now. Moreover we exploit the structure in « layers » of the model, which might be an idea of independent interest.

_Finally, our main recent result is the application of the method for proving the statistical physics conjecture for the single-letter « replica formula » for the mutual information of generalized linear models. There we also rigorously derive the inference and generalization errors of a large class of single layer neural networks such as the perceptron.

All the best

Thanks Jean, here are the papers you mentioned. I had mentioned one before and I like the layer approach of the second paper !

In recent years important progress has been achieved towards proving the validity of the replica predictions for the (asymptotic) mutual information (or "free energy") in Bayesian inference problems. The proof techniques that have emerged appear to be quite general, despite they have been worked out on a case-by-case basis. Unfortunately, a common point between all these schemes is their relatively high level of technicality. We present a new proof scheme that is quite straightforward with respect to the previous ones. We call it the stochastic interpolation method because it can be seen as an extension of the interpolation method developped by Guerra and Toninelli in the context of spin glasses, with a trial "parameter" which becomes a stochastic process. In order to illustrate our method we show how to prove the replica formula for three non-trivial inference problems. The first one is symmetric rank-one matrix estimation (or factorisation), which is the simplest problem considered here and the one for which the method is presented in full details. Then we generalize to symmetric tensor estimation and random linear estimation. In addition, we show how to prove a tight lower bound for the mutual information of non-symmetric tensor estimation. We believe that the present method has a much wider range of applicability and also sheds new insights on the reasons for the validity of replica formulas in Bayesian inference.

We consider rank-one non-symmetric tensor esti- mation and derive simple formulas for the mutual information. We start by the order 2 problem, namely matrix factorization. We treat it completely in a simpler fashion than previous proofs using a new type of interpolation method developed in [1]. We then show how to harness the structure in "layers" of tensor estimation in order to obtain a formula for the mutual information for the order 3 problem from the knowledge of the formula for the order 2 problem, still using the same kind of interpolation. Our proof technique straightforwardly general- izes and allows to rigorously obtain the mutual information at any order in a recursive way.

We consider generalized linear models (GLMs) where an unknown n-dimensional signal vector is observed through the application of a random matrix and a non-linear (possibly probabilistic) componentwise output function. We consider the models in the high-dimensional limit, where the observation consists of m points, and m/n→α where α stays finite in the limit m,n→∞. This situation is ubiquitous in applications ranging from supervised machine learning to signal processing. A substantial amount of theoretical work analyzed the model-case when the observation matrix has i.i.d. elements and the components of the ground-truth signal are taken independently from some known distribution. While statistical physics provided number of explicit conjectures for special cases of this model, results existing for non-linear output functions were so far non-rigorous. At the same time GLMs with non-linear output functions are used as a basic building block of powerful multilayer feedforward neural networks. Therefore rigorously establishing the formulas conjectured for the mutual information is a key open problem that we solve in this paper. We also provide an explicit asymptotic formula for the optimal generalization error, and confirm the prediction of phase transitions in GLMs. Analyzing the resulting formulas for several non-linear output functions, including the rectified linear unit or modulus functions, we obtain quantitative descriptions of information-theoretic limitations of high-dimensional inference. Our proof technique relies on a new version of the interpolation method with an adaptive interpolation path and is of independent interest. Furthermore we show that a polynomial-time algorithm referred to as generalized approximate message-passing reaches the optimal generalization error for a large set of parameters.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

No comments: