Thursday, February 07, 2008

NIPS 2007 Tutorial videos: Visual Recognition in Primates and Machines and Sensory Coding and Hierarchical Representations


Some of the NIPS tutorials are out. Of particular interest are the one by Tomaso Poggio, who together with Thomas Serre has devised a feedforward model of the visual cortex, and the one by Michael Lewicki on Sensory Coding and Hierarchical Representations, especially when viewed in relation to this entry. In the first presentation I noted an interesting paper that begins the work of defining a norm based on the hierarchical structure underlying the Poggio-Serre model of the visual cortex. It is entitled Derived Distance: towards a mathematical theory of visual cortex, by Steve Smale, Tomaso Poggio, Andrea Caponnetto and Jake Bouvrie. And as we have begun to learn, never underestimate what happens when a Fields medalist writes about things that we little people can understand. I need to come back to this later.

Visual Recognition in Primates and Machines by Tomaso Poggio. The slides are here. The abstract of the presentation reads:
Understanding the processing of information in our cortex is a significant part of understanding how the brain works and of understanding intelligence itself, arguably one of the greatest problems in science today. In particular, our visual abilities are computationally amazing and we are still far from imitating them with computers. Thus, visual cortex may well be a good proxy for the rest of the cortex and indeed for intelligence itself. But despite enormous progress in the physiology and anatomy of the visual cortex, our understanding of the underlying computations remains fragmentary. I will briefly review the anatomy and the physiology of primate visual cortex and then describe a class of quantitative models of the ventral stream for object recognition, which, heavily constrained by physiology and biophysics, have been developed during the last two decades and which have been recently shown to be quite successful in explaining several physiological data across different visual areas. I will discuss their performance and architecture from the point of view of state-of-the-art computer vision systems. Surprisingly, such models also mimic the level of human performance in difficult rapid image categorization tasks in which human vision is forced to operate in a feedforward mode. I will then focus on the key limitations of such hierarchical feedforward models for object recognition, discuss why they are incomplete models of vision and suggest possible alternatives focusing on the computational role of attention and its likely substrate – cortical backprojections. Finally, I will outline a program of research to attack the broad challenge of understanding in terms of brain circuits the process of image inference and in particular recognition tasks beyond simple scene classification.

  • Flash Movie Session A

  • Flash Movie Session B
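
As a concrete illustration of the alternating template-matching and pooling stages that this class of models is built from, here is a minimal sketch, assuming NumPy and SciPy, of a single simple-cell (S1) and complex-cell (C1) layer: oriented Gabor filtering followed by local max pooling. The filter sizes, orientations, and pooling parameters below are arbitrary choices for the example, not the ones used in the actual Serre-Poggio model.

```python
# Minimal sketch of one S1 (template matching) + C1 (max pooling) stage,
# in the spirit of the hierarchical feedforward models described above.
# Illustration only: parameters and input are placeholders.
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import maximum_filter

def gabor(size=11, wavelength=6.0, sigma=3.0, theta=0.0, gamma=0.5):
    """Build a single oriented Gabor filter (a 'simple cell' template)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)
    return g - g.mean()

def s1_c1(image, thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4), pool=8):
    """S1: rectified Gabor responses at several orientations.
       C1: local max pooling, giving tolerance to small shifts."""
    c1_maps = []
    for theta in thetas:
        s1 = np.abs(convolve2d(image, gabor(theta=theta), mode='same'))
        c1 = maximum_filter(s1, size=pool)[::pool, ::pool]  # pool, then subsample
        c1_maps.append(c1)
    return np.stack(c1_maps)

if __name__ == "__main__":
    img = np.random.rand(64, 64)   # stand-in for a grayscale image patch
    features = s1_c1(img)
    print(features.shape)          # (orientations, 8, 8)
```

In the full models these S/C stages are repeated at increasing scales before a final classifier; the sketch only shows the basic building block.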



Sensory Coding and Hierarchical Representations by Michael Lewicki. The slides are here. The abstract of the presentation reads:

The sensory and perceptual capabilities of biological organisms are still well beyond what we have been able to emulate with machines, and the brain devotes far more neural resources to the problems of sensory coding and early perception than we give credit in our algorithms. What is it all doing? Although a great deal has been learned about anatomical structure and physiological properties, insights into the underlying information processing algorithms have been difficult to obtain. Recent work, however, has begun to elucidate some of the underlying computational principles and processes that biology uses to transform the raw sensory signal into a hierarchy of representations that subserve higher-level perceptual tasks. A central hypothesis in this work is that biological representations are optimal from the viewpoint of statistical information processing, and adapt to the statistics of the natural sensory environment. In this tutorial, I will review work on learning sensory codes that are optimal for the statistics of the natural sensory environment and show how these results provide theoretical explanations for a variety of physiological data in both the auditory and visual systems. This will include work that has extended these results to provide functional explanations for many non-linear aspects of early auditory and visual processing. I will focus on work on the auditory and visual systems but also emphasize the generality of these approaches and how they can be applied to any sensory domain. I will also discuss work that generalizes the basic theory and shows how neural representations optimally compensate for sensory distortion and noise in neural populations. Finally, I will review work that goes beyond sensory coding and investigates the computational problems involved in computing more abstract sensory properties and invariant features that can subserve higher-level tasks such as perceptual organization and analysis of complex, natural scenes.

  • Flash Movie Session A
  • Flash Movie Session B
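
To make the idea of codes adapted to the statistics of the natural sensory environment more concrete, here is a minimal sketch, assuming NumPy and scikit-learn, of learning a linear basis for signal patches with ICA, the family of methods used in this line of work to derive Gabor-like filters from natural images and gammatone-like filters from natural sounds. The data below is random Laplacian noise as a placeholder; substituting patches cut from natural images or sound recordings is what yields the structured filters the tutorial discusses.

```python
# Minimal sketch: learn a linear code adapted to the statistics of the input.
# Placeholder data only; real natural-image or natural-sound patches are needed
# to recover the structured (Gabor-like / gammatone-like) filters.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
patches = rng.laplace(size=(5000, 64))   # 5000 patches of 64 samples each (placeholder data)
patches -= patches.mean(axis=0)          # center each dimension

ica = FastICA(n_components=32, random_state=0, max_iter=500)
codes = ica.fit_transform(patches)       # coefficients per patch
basis = ica.components_                  # learned filters, one per row

print(codes.shape, basis.shape)          # (5000, 32), (32, 64)
```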