Friday, May 30, 2008

CS: Exact Matrix Completion by Semidefinite Programming or Who wants to be a millionaire ?

You thought you'd go home tonight for the week-end thinking you had read just about everything exciting about Compressed Sensing. It's a shame because Emmanuel Candès and Benjamin Recht just released Exact matrix completion by semidefinite programming [also here] The abstract reads:
We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys

for some positive numerical constant C, then with very high probability, most n x n matrices of rank r can be perfectly recovered by solving a simple convex optimization program. This program finds the matrix with minimum nuclear norm that fits the data. The condition above assumes that the rank is not too large. However, if one replaces the 1.2 exponent with 1.25, then the result holds for all values of the rank. Similar results hold for arbitrary rectangular matrices as well. Our results are connected with the recent literature on compressed sensing, and show that objects other than signals and images can be perfectly reconstructed from very limited information.

In the paper, they mention as potential uses the Netflix competition, the triangulation from incomplete data and the filling of a covariance matrix known to be of "low rank because the variables only depend upon a comparably smaller number of factors" (.i.e it should apply to many instances of machine learning). This utilization of the nuclear norm is not new, (see CS is not just Compressed Sampling nor Compressed Sensing) however it is different, how different ?

Building on the concept of restricted isometry introduced in [12] in the context of sparse signal recovery, [27] establishes the first sufficient conditions for which the nuclear norm heuristic returns the minimum rank element in the constraint set. They prove that the heuristic succeeds with large probability whenever the number m of available measurements is greater than a constant times 2nr log n for n x n matrices. Although this is an interesting result, a serious impediment to this approach is that one needs to essentially measure random projections of the unknown data matrix|a situation which unfortunately does not commonly arise in practice. Further, the measurements in (1.15) give some information about all the entries of M whereas in our problem, information about most of the entries is simply not available. In particular, the results and techniques introduced in [27] do not begin to address the matrix completion problem of interest to us in this paper. As a consequence, our methods are completely different; for example, they do not rely on any notions of restricted isometry. Instead, as we discuss below, we prove the existence of a Lagrange multiplier for the optimization (1.5) that certifies the unique optimal solution is precisely the matrix that we wish to recover.
The reference [27] is number [2] in the reference section below. In light of the discussion mentioned before, it is a good thing that the argument does not rely on an RIP argument and while it may look a little bit removed from Compressed Sensing, I can think of several instances where it may matter a great deal in the compressed sensing framework especially in the context of imaging detectors but I'll dwell on it later. The numerical experiments make use of the SDPT3 software package [1]. The phase transition observed in Compressed Sensing ( see Jared Tanner's presentation entitled Phase Transition Phenomenon in Sparse Approximation) also shows up in this work as well (as expected from [2])

Offline, Emmanuel tells me he has not tried to reconstruct the actual matrix of the Netflix challenge, maybe if you read this, you may have shot at becoming a millionaire over the week-end. Good luck.

[1] SDPT3 - a MATLAB software package for semidefinite-quadratic-linear programming K. C. Toh, R. H. Tütüncü, and M. J. Todd.

[2] Benjamin Recht, Maryam Fazel, and Pablo Parrilo, Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization

Credit: NASA / JPL / University of Arizona, Could it be ice just underneath the Phoenix Mars lander ?

No comments: