Nuit Blanche: Sparse Matrix Factorization: Simple rules for growing neural nets and Provable Bounds for Learning Some Deep Representations

Thursday, January 23, 2014

Sparse Matrix Factorization: Simple rules for growing neural nets and Provable Bounds for Learning Some Deep Representations

I cannot tell you how much I **love** the following two papers. Why ? because using Advanced Matrix Factorization one can foresee a much more direct understanding of the whole Panorama of Sensing while allowing these models to provide information Faster Than a Blink of an Eye. Furthermore, it paves the way to naturally see how phase transitions put a limit on these contructions in a rational way (Sunday Morning Insight: The Map Makers). Think of these papers as gateway papers. Wow.

Sparse Matrix Factorization by Behnam Neyshabur, Rina Panigrahy

We investigate the problem of factorizing a matrix into several sparse matrices and propose an algorithm for this under randomness and sparsity assumptions. This problem can be viewed as a simplification of the deep learning problem where finding a factorization corresponds to finding edges in different layers and values of hidden units. We prove that under certain assumptions for a sparse linear deep network with n nodes in each layer, our algorithm is able to recover the structure of the network and values of top layer hidden units for depths up to O~(n1/6). We further discuss the relation among sparse matrix factorization, deep learning, sparse recovery and dictionary learning.

The attendant slides are here. Of interest is the following slide for the hardware pêople

Highly relevant:
Provable Bounds for Learning Some Deep Representations by Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

We give algorithms with provable guarantees that learn a class of deep nets in the generative model view popularized by Hinton and others. Our generative model is an n node multilayer neural net that has degree at most n for some \gamma less than 1 and each edge has a random edge weight in [-1; 1]. Our algorithm learns almost all networks in this class with polynomial running time. The sample complexity is quadratic or cubic depending upon the details of the model. The algorithm uses layerwise learning. It is based upon a novel idea of observing correlations among features and using these to infer the underlying edge structure via a global graph recovery procedure. The analysis of the algorithm reveals interesting structure of neural networks with random edge weights.

see also Rong Ge's PhD thesis: Provable Algorithms for Machine Learning Problems and

Join the CompressiveSensing subreddit or the Google+ Community and post there !