Friday, February 15, 2013

The Unbearable Lightness of Seeing

The title of this entry refers to the ever so energy-efficient human visual system and our ever so slow, energy-hungry attempts at imitating it. The connection has been on people's minds for a while, but it is now officially a trend, and this is just the beginning. Two papers recently appeared that relate our current understanding of the human visual process to algorithms developed in compressive sensing or in machine learning more generally. We've recently seen people trying to connect fMRI activity with algorithms such as the scattering transform. Here, one paper explains an optical illusion (the lightness illusion) from a compressive sensing perspective [1], while the other compares experimental data from macaques with recent supervised machine learning algorithms [2]. From the conclusions of [2]:
Where are we today?
Under our analysis, we believe that the field has made significant advances with recent algorithms. On the intermediate level variation task (Medium Variation) these advances are quite evident: the recent representational learning algorithm in [Le et al., 2012] surpasses the representation in V4 and, surprisingly, the supervised algorithm of [Krizhevsky et al., 2012] matches the representation in IT. These advances are also evident on the high level variation task (High Variation): the [Le et al., 2012] algorithm is narrowly better than V4 and the [Krizhevsky et al., 2012] algorithm beats IT by an ample margin. It will be informative to measure the elements of these models that lead to this performance and it will be interesting to see if purely unsupervised algorithms can achieve similar performance.
A vision for the future
The methodology we have proposed here can be extended to other sensory domains where representation is critical and neural representations are thought to be effective. For example, it should be possible to define similar task protocols for auditory stimuli and measure the neural responses in auditory cortex. Such measurements would not only have implications for discovering effective auditory representations, but may also provide the data necessary to validate representational learning algorithms that are effective in multiple contexts. Representational learning algorithms that prove effective across these domains may serve as hypotheses for a canonical cortical algorithm, a ‘holy grail’ for artificial intelligence research.

Hat Tip to Andrej Karpathy for [2]
[1] Lightness illusion: A new look from Compressive Sensing perspective by Xinke Tang, Yi Li. The abstract reads:
Lightness illusions, such as the seemingly opposing effects of brightness contrast and assimilation, are characterized by visually perceived intensity images that differ from physical reality. Traditional hypotheses from signal processing community primarily use filtering to explain these phenomena. However, these methods may fail due to the change in geometry (e.g., homography transform). In this paper, we attempt to explain lightness illusion from a novel Compressive Sensing perspective. The underlying mathematics is based on the new theory of compressive sensing, which provides an efficient method for sampling and reconstructing a signal that is sparse in Fourier domain. The sampling amounts to a random sampling of locally averaged values. Reconstruction amounts to solving an underdetermined linear equation system using L1 norm minimization. The Accelerated Proximal Gradient (APG) method is used to reconstruct the compressed signal. We demonstrate that the reconstruction error can be used for robustly explaining well known lightness illusions.
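
To make the reconstruction step in this abstract a little more concrete, here is a minimal sketch of L1-regularized recovery with an accelerated proximal gradient (FISTA-style) iteration. It is not the authors' code. The Gaussian measurement matrix, the signal that is sparse in the identity basis (rather than the Fourier domain), the problem sizes, the value of `lam` and the helper names `soft_threshold` and `apg_l1` are all assumptions chosen for illustration, and the paper's locally averaged random sampling is not reproduced.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrink every entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def apg_l1(A, y, lam, n_iter=500):
    """Approximately minimize 0.5*||A x - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z = x.copy()                       # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)       # gradient of the smooth term at z
        x_new = soft_threshold(z - grad / L, lam / L)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum step
        x, t = x_new, t_new
    return x

# Hypothetical test problem: 5 nonzeros out of 200, seen through 80 measurements.
rng = np.random.default_rng(0)
n, m, k = 200, 80, 5
x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.choice([-1.0, 1.0], size=k) * (1.0 + rng.random(k))
A = rng.standard_normal((m, n)) / np.sqrt(m)   # underdetermined system (m < n)
y = A @ x_true

x_hat = apg_l1(A, y, lam=0.01)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(f"relative reconstruction error: {rel_err:.3f}")
```

The step size `1/L` uses the squared spectral norm of `A`, the usual safe choice for this kind of iteration. In the paper it is the reconstruction error itself that is used to explain the illusions, so the last two lines are the part that would matter in that setting.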

[2] The Neural Representation Benchmark and its Evaluation on Brain and Machine by Charles F. Cadieu, Ha Hong, Dan Yamins, Nicolas Pinto, Najib J. Majaj, James J. DiCarlo. The abstract reads:
A key requirement for the development of effective learning representations is their evaluation and comparison to representations we know to be effective. In natural sensory domains, the community has viewed the brain as a source of inspiration and as an implicit benchmark for success. However, it has not been possible to directly test representational learning algorithms directly against the representations contained in neural systems. Here, we propose a new benchmark for visual representations on which we have directly tested the neural representation in multiple visual cortical areas in macaque (utilizing data from [Majaj et al., 2012]), and on which any computer vision algorithm that produces a feature space can be tested. The benchmark measures the effectiveness of the neural or machine representation by computing the classification loss on the ordered eigendecomposition of a kernel matrix [Montavon et al., 2011]. In our analysis we find that the neural representation in visual area IT is superior to visual area V4. In our analysis of representational learning algorithms, we find that three-layer models approach the representational performance of V4 and the algorithm in [Le et al., 2012] surpasses the performance of V4. Impressively, we find that a recent supervised algorithm [Krizhevsky et al., 2012] achieves performance comparable to that of IT for an intermediate level of image variation difficulty, and surpasses IT at a higher difficulty level. We believe this result represents a major milestone: it is the first learning algorithm we have found that exceeds our current estimate of IT representation performance. We hope that this benchmark will assist the community in matching the representational performance of visual cortex and will serve as an initial rallying point for further correspondence between representations derived in brains and machines.
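
The phrase "classification loss on the ordered eigendecomposition of a kernel matrix" can be read roughly as follows: build a kernel matrix from the features, sort its eigenvectors by decreasing eigenvalue, and check how well the labels are captured by the leading d eigenvectors as d grows. A good representation reaches a low loss with few eigenvectors. The sketch below is only my reading of that idea, not the benchmark's code. The Gaussian kernel with a fixed `sigma`, the in-sample 0-1 loss, the synthetic data and the function name `kernel_analysis_curve` are all assumptions.

```python
import numpy as np

def kernel_analysis_curve(features, labels, sigma, dims):
    """In-sample 0-1 loss when labels are fit on the d leading kernel eigenvectors."""
    sq = np.sum(features ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * features @ features.T, 0.0)
    K = np.exp(-d2 / (2.0 * sigma ** 2))       # Gaussian kernel matrix
    eigvals, eigvecs = np.linalg.eigh(K)       # eigenvalues in ascending order
    U = eigvecs[:, np.argsort(eigvals)[::-1]]  # reorder: largest eigenvalue first
    Y = np.eye(labels.max() + 1)[labels]       # one-hot label matrix
    errors = []
    for d in dims:
        Ud = U[:, :d]
        Y_hat = Ud @ (Ud.T @ Y)                # least-squares fit (orthonormal columns)
        errors.append(np.mean(np.argmax(Y_hat, axis=1) != labels))
    return np.array(errors)

# Hypothetical data: one representation separates the classes, the other is noise.
rng = np.random.default_rng(0)
n = 100
labels = np.repeat([0, 1], n // 2)
centers = np.array([[-3.0, 0.0], [3.0, 0.0]])
good_features = centers[labels] + 0.5 * rng.standard_normal((n, 2))
bad_features = rng.standard_normal((n, 2))     # carries no label information

dims = [2, 10, n]
curve_good = kernel_analysis_curve(good_features, labels, sigma=2.0, dims=dims)
curve_bad = kernel_analysis_curve(bad_features, labels, sigma=2.0, dims=dims)
print("good representation:", curve_good)
print("bad representation: ", curve_bad)
```

With all n eigenvectors the fit is exact for any representation, so what distinguishes representations is how quickly the curve drops. Comparing such curves for V4, IT and machine feature spaces is the kind of comparison the abstract describes.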
