Wednesday, March 04, 2015

Sparse Coding: 20 years later

Today's entry is a witness to the lasting influence of a single finding made in 1995/1996 and featured in this paper:

Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images

Olshausen BA, Field DJ (1996).   Nature, 381: 607-609.  reprint (pdf)  |  abstract
The paper (and the one presented in 1995) has influenced many people having a "wow" moment and realizing that
  • the mammalian vision is doing sparse coding, 
  • elements of that coding looked like the wavelets found some years earlier 
and triggered much of the work in getting better imaging sensors but also compressive sensing and the idea that matrix factorization (like NMF found 4 years later), while NP-hard could be done easily (by the brain). Most of all, it continues on triggering questions on how we sense and how our brain does things like matrix factorization and related theoretical computer science questions such as algorithmic sample complexity. 

The legacy of this paper is so overwhelming that I personally wonder when (not if) Bruno and David are getting the Nobel prize or some similar prize for this finding ?

We have three papers today related to that one paper: 

First, a study of sparse coding in a special case (linearity). Mapping a matrix factorization solver to a linear neural network is a sign we live in the Great Convergence. We need similar running time and sample complexity bounds for AMP solvers

Simple, Efficient, and Neural Algorithms for Sparse Coding by Sanjeev Arora, Rong Ge, Tengyu Ma, Ankur Moitra

Sparse coding is a basic task in many fields including signal processing, neuroscience and machine learning where the goal is to learn a basis that enables a sparse representation of a given set of data, if one exists. Its standard formulation is as a non-convex optimization problem which is solved in practice by heuristics based on alternating minimization. Re- cent work has resulted in several algorithms for sparse coding with provable guarantees, but somewhat surprisingly these are outperformed by the simple alternating minimization heuristics. Here we give a general framework for understanding alternating minimization which we leverage to analyze existing heuristics and to design new ones also with provable guarantees. Some of these algorithms seem implementable on simple neural architectures, which was the original motivation of Olshausen and Field (1997a) in introducing sparse coding. We also give the first efficient algorithm for sparse coding that works almost up to the information theoretic limit for sparse recovery on incoherent dictionaries. All previous algorithms that approached or surpassed this limit run in time exponential in some natural parameter. Finally, our algorithms improve upon the sample complexity of existing approaches. We believe that our analysis framework will have applications in other settings where simple iterative algorithms are used.  
and an attempt at sparse coding and NMF using a simple yet nonlinear solver:

Olshausen and Field (OF) proposed that neural computations in the primary visual cortex (V1) can be partially modeled by sparse dictionary learning. By minimizing the regularized representation error they derived an online algorithm, which learns Gabor-filter receptive fields from a natural image ensemble in agreement with physiological experiments. Whereas the OF algorithm can be mapped onto the dynamics and synaptic plasticity in a single-layer neural network, the derived learning rule is nonlocal - the synaptic weight update depends on the activity of neurons other than just pre- and postsynaptic ones - and hence biologically implausible. Here, to overcome this problem, we derive sparse dictionary learning from a novel cost-function - a regularized error of the symmetric factorization of the input's similarity matrix. Our algorithm maps onto a neural network of the same architecture as OF but using only biologically plausible local learning rules. When trained on natural images our network learns Gabor-filter receptive fields and reproduces the correlation among synaptic weights hard-wired in the OF network. Therefore, online symmetric matrix factorization may serve as an algorithmic theory of neural computation. 

A Hebbian/Anti-Hebbian Network Derived from Online Non-Negative Matrix Factorization Can Cluster and Discover Sparse Features by Cengiz Pehlevan, Dmitri B. Chklovskii
Despite our extensive knowledge of biophysical properties of neurons, there is no commonly accepted algorithmic theory of neuronal function. Here we explore the hypothesis that single-layer neuronal networks perform online symmetric nonnegative matrix factorization (SNMF) of the similarity matrix of the streamed data. By starting with the SNMF cost function we derive an online algorithm, which can be implemented by a biologically plausible network with local learning rules. We demonstrate that such network performs soft clustering of the data as well as sparse feature discovery. The derived algorithm replicates many known aspects of sensory anatomy and biophysical properties of neurons including unipolar nature of neuronal activity and synaptic weights, local synaptic plasticity rules and the dependence of learning rate on cumulative neuronal activity. Thus, we make a step towards an algorithmic theory of neuronal function, which should facilitate large-scale neural circuit simulations and biologically inspired artificial intelligence.  

Disclaimer: I know neither Bruno nor David
Join the CompressiveSensing subreddit or the Google+ Community and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

No comments: