Nuit Blanche: Provable Bounds for Learning Some Deep Representations

Friday, November 21, 2014

Provable Bounds for Learning Some Deep Representations

Following up on this entry, here is some headway connecting the deep architectures of some neural networks with random matrix theory and some of the findings in compressive sensing. From the paper:

In fact our result that a single layer of our generative model is a sparse denoising autoencoder can be seen as an analog of the fact that random matrices are good for compressed sensing/sparse reconstruction (see Donoho (Donoho,2006) for general matrices and Berinde et al. (Berinde et al., 2008) for sparse matrices). Of course, in compressed sensing the matrix of edge weights is known whereas here it has to be learnt, which is the main contribution of our work. Furthermore, we show that our algorithm for learning a single layer of weights can be extended to do layerwise learning of the entire network.
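To make the compressed sensing analogy concrete, here is a minimal sketch of sparse reconstruction from random measurements. It is not the paper's model: it uses a linear toy setting with a known random sign matrix, and the names (`random_sign_matrix`, `threshold_decode`), the sizes `n`, `m`, `k`, and the threshold `tau` are all assumptions chosen for illustration. The decoder is a single thresholded matrix product, loosely in the spirit of an autoencoder's encoding step.

```python
import random


def random_sign_matrix(m, n, rng):
    """m x n matrix with i.i.d. entries +/- 1/sqrt(m), so every column has unit norm."""
    scale = 1.0 / m ** 0.5
    return [[scale if rng.random() < 0.5 else -scale for _ in range(n)]
            for _ in range(m)]


def measure(A, x):
    """Compressed measurements y = A x (only the nonzero entries of x contribute)."""
    support = [j for j, v in enumerate(x) if v]
    return [sum(row[j] * x[j] for j in support) for row in A]


def threshold_decode(A, y, tau=0.5):
    """One-step decoder: x_hat[j] = 1 if <column j of A, y> > tau, else 0."""
    n = len(A[0])
    scores = [0.0] * n
    for row, y_i in zip(A, y):
        for j in range(n):
            scores[j] += row[j] * y_i
    return [1 if s > tau else 0 for s in scores]


rng = random.Random(0)
n, m, k = 1000, 400, 3          # signal length, number of measurements, sparsity (assumed)

A = random_sign_matrix(m, n, rng)
x = [0] * n
for j in rng.sample(range(n), k):
    x[j] = 1                    # k-sparse binary signal

y = measure(A, x)               # m < n: fewer measurements than unknowns
x_hat = threshold_decode(A, y)
print("recovered exactly:", x_hat == x)
```

Because distinct columns of a random sign matrix are nearly orthogonal, the score of a coordinate in the support stays close to 1 while the others stay close to 0, which is why a single threshold is enough in this toy case. In the paper's setting the matrix is unknown and must be learnt, which is the hard part.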

Provable Bounds for Learning Some Deep Representations by Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

We give algorithms with provable guarantees that learn a class of deep nets in the generative model view popularized by Hinton and others. Our generative model is an $n$ node multilayer network that has degree at most $n^{\gamma}$ for some $\gamma < 1$, and each edge has a random edge weight in $[-1, 1]$. Our algorithm learns almost all networks in this class with polynomial running time. The sample complexity is quadratic or cubic depending upon the details of the model. The algorithm uses layerwise learning. It is based upon a novel idea of observing correlations among features and using these to infer the underlying edge structure via a global graph recovery procedure. The analysis of the algorithm reveals interesting structure of neural nets with random edge weights.
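The idea of observing correlations among features can be illustrated with a toy single layer. The sketch below is not the paper's algorithm: it only checks that two observed nodes sharing a hidden parent through positive edges fire together more often than other pairs, which is the kind of signal a graph recovery step could exploit. The layer sizes, the firing probability `rho`, the thresholding rule, and the restriction to positive edges are all assumptions made for illustration.

```python
import random
from itertools import combinations

rng = random.Random(1)
n_hidden, n_obs, degree = 60, 60, 4     # toy sizes (assumed)
rho, n_samples = 0.03, 4000             # hidden firing probability, sample count (assumed)

# Random bipartite graph: each hidden node has `degree` observed neighbours,
# and each edge carries a weight drawn uniformly from [-1, 1].
edges = {
    h: [(v, rng.uniform(-1.0, 1.0)) for v in rng.sample(range(n_obs), degree)]
    for h in range(n_hidden)
}


def sample_observed():
    """Draw a sparse hidden vector and return the sorted indices of firing observed nodes."""
    total = [0.0] * n_obs
    for h in range(n_hidden):
        if rng.random() < rho:
            for v, w in edges[h]:
                total[v] += w
    return [v for v in range(n_obs) if total[v] > 0.0]


# Count how often each pair of observed nodes fires together.
cofire = {}
for _ in range(n_samples):
    for pair in combinations(sample_observed(), 2):
        cofire[pair] = cofire.get(pair, 0) + 1

# Ground truth: pairs that share a hidden parent through two positive edges.
related = set()
for h in range(n_hidden):
    positive = sorted(v for v, w in edges[h] if w > 0)
    related.update(combinations(positive, 2))


def mean_rate(pairs):
    """Average empirical co-firing frequency over a collection of pairs."""
    pairs = list(pairs)
    return sum(cofire.get(p, 0) for p in pairs) / (n_samples * len(pairs))


all_pairs = set(combinations(range(n_obs), 2))
related_rate = mean_rate(related)
unrelated_rate = mean_rate(all_pairs - related)
print("related pairs co-fire at rate", related_rate)
print("other pairs co-fire at rate", unrelated_rate)
```

In this toy setting the gap between the two rates is what lets pairwise statistics hint at the hidden edge structure. The paper's actual procedure and guarantees are more involved.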

By the way, what's up with the 0.4 in Lemma 8? Why not 42?

Relevant: