Nuit Blanche: Sunday Morning Insight: Phase transitions and sample complexity in Bayes-optimal matrix factorization

Saturday, February 15, 2014


We have mentioned it before: some of the advanced matrix factorizations now on the market exhibit sharp phase transitions. This reflects the fact that each of these problems is NP-hard, and that relaxing the assumptions on the unknowns can yield polynomial-complexity algorithms, as was the case in compressive sensing with the Donoho-Tanner phase transitions and attendant developments. The authors of the following paper try to derive a similar limit on what is achievable for generic matrix factorizations using Bayes-optimal inference. Lenka Zdeborová's presentation at the latest Paris Machine Learning meetup, on how to find a needle in a haystack, focused in particular on clustering, which is itself a particular matrix factorization.

Phase transitions and sample complexity in Bayes-optimal matrix factorization by Yoshiyuki Kabashima, Florent Krzakala, Marc Mézard, Ayaka Sakata, Lenka Zdeborová

We analyse the matrix factorization problem. Given a noisy measurement of a product of two matrices, the problem is to estimate back the original matrices. It arises in many applications such as dictionary learning, blind matrix calibration, sparse principal component analysis, blind source separation, low rank matrix completion, robust principal component analysis or factor analysis. We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm. In this setting, we compute the minimal mean-squared-error achievable in principle in any computational time, and the error that can be achieved by an efficient approximate message passing algorithm. The computation is based on the asymptotic state-evolution analysis of the algorithm. The performance that our analysis predicts, both in terms of the achieved mean-squared-error, and in terms of sample complexity, is extremely promising and motivating for a further development of the algorithm.
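To make the setting of the abstract concrete, here is a minimal sketch of how one might generate a synthetic instance of the problem: two matrices with independent random entries drawn from known distributions, and a noisy measurement of their product. This is not the authors' code and it does not implement their message passing algorithm. The names `sample_instance` and `mse`, the Gaussian prior on F, the Gauss-Bernoulli (sparse) prior on X, and the absence of any normalization of the product are all assumptions chosen for illustration.

```python
import random


def sample_instance(m, n, p, rho=0.2, noise_std=0.1, seed=0):
    """Draw (F, X, Y) with Y = F X + Gaussian noise.

    Illustrative assumptions (not taken from the paper):
    - F is m x n with i.i.d. standard Gaussian entries.
    - X is n x p with Gauss-Bernoulli entries: each entry is nonzero
      with probability rho, and standard Gaussian when nonzero.
    - The noise is i.i.d. Gaussian with standard deviation noise_std.
    """
    rng = random.Random(seed)
    F = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    X = [[rng.gauss(0.0, 1.0) if rng.random() < rho else 0.0
          for _ in range(p)] for _ in range(n)]
    Y = [[sum(F[i][k] * X[k][j] for k in range(n)) + rng.gauss(0.0, noise_std)
          for j in range(p)] for i in range(m)]
    return F, X, Y


def mse(A, B):
    """Mean squared error per entry between two matrices of equal shape."""
    total = sum((a - b) ** 2 for row_a, row_b in zip(A, B)
                for a, b in zip(row_a, row_b))
    return total / (len(A) * len(A[0]))


if __name__ == "__main__":
    F, X, Y = sample_instance(m=20, n=10, p=50)
    # An inference algorithm would see only Y (plus the priors) and
    # return estimates of F and X; mse() would then score them.
    print(len(Y), len(Y[0]))
```

In the Bayes-optimal setting described above, the inference algorithm is given `rho`, `noise_std` and the form of the priors, and only `Y` as data. Note that comparing an estimate to `X` entry by entry with `mse` ignores the symmetries of the factorization (for instance, a permutation of the columns of F together with the rows of X leaves the product unchanged), so a real evaluation would need to account for them.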
