Monday, April 13, 2015

Dictionary Learning with Few Samples and Matrix Concentration

Here are some theoretical developments for sparse coding, an item discovered experimentally twenty years ago

Dictionary Learning with Few Samples and Matrix Concentration by Kyle Luh, Van Vu

Let A be an n×n matrix, X be an n×p matrix and Y=AX. A challenging and important problem in data analysis, motivated by dictionary learning and other practical problems, is to recover both A and X, given Y. Under normal circumstances, it is clear that this problem is underdetermined. However, in the case when X is sparse and random, Spielman, Wang and Wright showed that one can recover both A and X efficiently from Y with high probability, given that p (the number of samples) is sufficiently large. Their method works for p≥Cn2log2n and they conjectured that p≥Cnlogn suffices. The bound nlogn is sharp for an obvious information theoretical reason.
In this paper, we show that p≥Cnlog4n suffices, matching the conjectural bound up to a polylogarithmic factor. The core of our proof is a theorem concerning l1 concentration of random matrices, which is of independent interest.
Our proof of the concentration result is based on two ideas. The first is an economical way to apply the union bound. The second is a refined version of Bernstein's concentration inequality for the sum of independent variables. Both have nothing to do with random matrices and are applicable in general settings.
Image Credit: NASA/JPL-Caltech/Space Science Institute
N00237991.jpg was taken on April 11, 2015 and received on Earth April 12, 2015. The camera was pointing toward DIONE, and the image was taken using the CL1 and CL2 filters.
Join the CompressiveSensing subreddit or the Google+ Community and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

No comments: