Saturday, March 24, 2007

Driving on a manifold: unveiling our strategy

Our entry in DARPA Urban Challenge will feature compressed sensing as a way to reduce the dimensionality of our vision sensor. We will then have to infer the connection between our GPS track (RNDF and MDF) and the reduced parameters obtained from random projections.

Thursday, March 22, 2007

Compressed Sensing, Primary Visual Cortex, Dimensionality Reduction, Manifolds and Autism

In a previous entry, I mentioned the potential connection between compressed sensing, the primary cortex and cognition deficit diseases like autism without much explanation. Here is an attempt at filling the holes.

When David Field and Bruno Olshausen showed that the primary cortex was getting inputs from our eyes as a set of sparse functions that looked like ridgelets and curvelets, it became obvious that one result was missing: If natural images are sparse and our eye system has sparse receptors, is there a way our brain finds a sparse decomposition of the world in a way that works in a linear fashion? The thinking goes that our brain is really capable of understanding scenes without an iteration process (an iteration process is nonlinear and has a high cost in terms of energy). When Emmanuel Candes and David Donoho showed that in fact, non-adaptive schemes using curvelets could decompose natural images it became obvious that a good parallel could be made between the physiology of the primary cortex and this new type of decomposition. But how do you do this decomposition ? While an m-term curvelet expansion of a scene can be thresholded and can rival with complex adaptive approximation schemes, it does not answer how the primary cortex eventually comes up with that m number.

The state of the art on our thinking about the primary cortex can be found here in this review by Graham and Field on sparse coding in the neocortex. It specifically addresses the bounds on the primary cortex induced by the metabolic constraints:

We conclude our discussion by returning to the issue of metabolic constraints. Could we argue that primary evolutionary pressure driving towards sparse coding is one related to the metabolic costs of neural firing? As noted earlier, both Attwell and Laughlin (2001) and Lennie (2003) argue that there are not enough resources to achieve anything but a low-activity system. Moreover, when we find sparse activity in frontal cortex (Abeles et al., 1990), it is more difficult to argue that the sparse activity must arise because it is mapping the sparse structure of the world. Even at early levels, if sparseness were metabolically desirable, there are a number of ways of achieving sparseness without matching the structure of the world. Any one of a wide variety of positively accelerating nonlinearities would do. Simply giving the neurons a very high threshold would achieve a sparse code, but the system would lose information. We argue that the form of sparse coding found in sensory systems is useful because such codes maintain the information in the environment, but do so more efficiently. We argue that the evolutionary pressure to move the system towards a sparse code comes from the representational power of sparse codes. However, we do accept that metabolic constraints are quite important. It has been demonstrated that at the earliest levels of the visual system, ganglion cells (Berry, Warland and Meister, 1997) and lateral geniculate nucleus cells (Reinagel and Reid, 2000) show sparse (non-Gaussian) responses to temporal noise. A linear code, no matter how efficiently it was designed would not show such sparse activity, so we must assume that the sparseness is at least in part due to the nonlinearities in the system and not due to the match between the receptive fields and the sparse structure of the input. As with the results show sparse responses in non-sensory areas, we must accept that metabolic may also be playing a significant role.

Well it's nice to acknowledge we have physical limitations, but to assume that a linear code cannot exist simply because we currently do not have a model for it, is probably assuming too much. So what do we know ? the primary cortex is a low energy system which basically removes from consideration any complex system that requires resources (like an iteration system). This situation favors a linear system but so far, we have not found a good model for that. There is something deeper still. Even if we knew much about the sparsity of a scene, understanding the brain is really about understanding how large amount of information is reduced when traveling from the eye into the brain. In other words, we need to reduce the amount of data that our megapixel sensor called the eye is bringing in, and we must do this very fast (30 times a second). To put things in perspective, let us take an example: let us imagine we are seeing a scene where somebody waves a hand. If we were to take a video of this scene, we would probably get a 40 MB avi file (uncompressed). That file could then be compressed to 1 MB using MPEG for instance. While the compression is impressive, it is not impressive enough. In effect, when our brain sees this video, it can remember the hand and how it moved. The movement is probably two dimensional and so the brain really remembers the two parameters needed to produce a hand that waves in the manner that is shown in the video. In other words, the brain is probably not storing 1 MB of information when it stores this hand waving activity, it is most probably storing how two parameters changed over time, which in many occasion is much less than 1 MB of data. The real question becomes: Is there a way to reduce that the 1MB information further ? We are not asking ourselves if the input or the receptor are sparse (it is a necessary condition), we are interested in answering the question on how sparser we can make this information by using the connection between these sparse elements. Can we reduce the dimensionality of the signal further and exploit it ?

Enters Dimensionality reduction: Ever since the discovery of dimensionality reduction schemes (LLE, Isomap, Laplacian-diffusion..) that are taking high dimensionality data and and are able to map them into low dimensionality manifolds, researchers have been trying to extend these techniques to wider sets of problems. In the cognition world for instance, Jonathan Pillow and Eero Simoncelli perform dimensionality reduction applied to neural models but it is not obvious how these techniques directly translate into a specific functionality of the primary cortex even if they take an example of a V1 cell. It is also not obvious how some of these techniques are robust to noise. But as stated earlier, there are different ways to go about dimensionality reduction. One of the most intriguing which has robustness built into it is Compressed Sensing. Compressed Sensing has the ability to produce a robust decomposition of a manifold. Mike Wakin looked into that during his dissertation and found that smooth manifolds can be readily compressed using Compressed Sensing thereby making it a very simple solution to dimensionality reduction (see R. G. Baraniuk and M. B. Wakin in Random Projections of Smooth Manifolds ) but as Donoho and Grimes had shown earlier, sharp objects such as arms, legs have edges and that makes the manifold non-differentiable. This is a problem because it means that one cannot easily extract parameters from a video if we have these sharp edges. In order take care of that problem, one can be inspired by Biology i.e. to smooth these images using Gabor wavelets as in the human vision system (Object Recognition with Features Inspired by Visual Cortex by Thomas Serre, Lior Wolf and Tomaso Poggio) and then use the random projection of smooth manifolds to eventually figure out the parameters of the movements ( for more information on how to do that see High-resolution navigation on non-differentiable image manifolds or the Multiscale Structure of Non-Differentiable Image Manifolds)

[in Object Recognition with Features Inspired by Visual Cortex by Thomas Serre, Lior Wolf, Tomaso Poggio one may note that one can only be struck by the pain the algorithm goes through into in order to be robust.

  • S1: Apply a battery of Gabor filters to the input image. The filters come in 4 orientations θ and 16 scales s (see Table 1). Obtain 16×4 = 64 maps (S1)sθ that are arranged in 8 bands (e.g., band 1 contains filter outputs of size 7 and 9, in all four orientations, band 2 contains filter outputs of size 11 and 13, etc).
  • C1: For each band, take the max over scales and positions: each band member is sub-sampled by taking the max over a grid with cells of size NΣ first and the max between the two scale members second, e.g., for band 1, a spatial max is taken over an 8 ×8 grid first and then across the two scales (size 7 and 9). Note that we do not take a max over different orientations, hence, each band (C1)Σcontains 4 maps.
  • During training only: Extract K patches Pi=1,...K of various sizes ni × ni and all four orientations (thus containing ni × ni × 4 elements) at random from the (C1)Σ maps from all training images.
  • S2: For each C1 image (C1)Σ, compute: Y = exp(−γ||X − Pi||2) for all image patches X (at all positions) and each patch P learned during training for each band independently. Obtain S2 maps (S2)Σi .
  • C2: Compute the max over all positions and scales for each S2 map type (S2)i (i.e., corresponding to a particular patch Pi) and obtain shift- and scale-invariant C2 features (C2)i , for i = 1 . . .K.


]
Besides Wakin, Donoho, Baraniuk and others collaborating with them, few have made that connection. Yves Meyer makes a passing reference to the use of compressed sensing in physiology (in Perception et compression des images fixes.) Gabriel Peyre however is a little more specific here.
Analogies in physiology:
This compressed sampling strategy could potentially lead to interesting models for various sensing operations performed biologically. Skarda and Freeman have proposed a non-linear chaotic dynamic to explain the analysis of sensory inputs. This chaotic state of the brain ensures robustness toward unknown events and unreliable measurements, without using too many computing resources. While the theory of compressed sensing is presented here as a random acquisition process, its extension to deterministic or dynamic settings is a fascinating area for future research in signal processing.
I am mentioning Gabriel Peyre's work because he works on bandelets. I had mentioned bandelets before in the context of an announcement made by the company Let it wave (headed by Stephane Mallat ) where they showed that faces could be compressed down to 500 bytes or the size of a bar code.



If bandelets provide a recognizable Faces nonlinearly with 500 bytes, one only needs 5 x 500 bytes = 2.5 KB random samples (within the meaning of Compressed sensing) of that Face to be able to reconstruct it. 2.5KB is better than 10 MP or 1 MB. The number 5 is the bound for compressed sensing (more specific asymptotic laws/results can be found in the summary of Terry Tao)

However, the strongest result so far is the one found by Mike Wakin on random projection on a manifold where he uses neighborhood criteria in the compressed sensing space to permit an even better reconstruction than just assuming sparsity. In the figure below, one can compare the 5 random projection results using the Manifold based recovery from the traditional algorithms used like Orthogonal Matching Pursuit and Basis Pursuit.



Naturally, an extension of this is target detection as featured in the Smashed Filter article from Rice. With this type of result/framework, I am betting we can go lower than 2.5 KB of universal samples to characterize a face. The connection between cognition and compressed sensing and autism is simple: Face processing seem deficient for people affected with autism and we don't know why. A model based on the elements I just mentioned might give an insight into this issue.

Tuesday, March 20, 2007

Traduction de Compressed Sensing en Francais

Apres en avoir parle avec Emmanuel Candes, il semble qu'une traduction francaise judicieuse de "Compressed Sensing" pourrait etre: Acquisition Comprimée

After talking to Emmanuel Candes, a good french translation of "Compressed Sensing" should probably be: Acquisition Comprimée.

It looks as though "Compressed Sensing" and "Compressive Sampling" have been used for the same technique. Now a new term has surfaced "Compressive Sensing".

Compressed Sensing video presentations

It is one thing to read papers and presentations, it is also interesting to watch video presentations of the compressed sensing (or compressive sampling) idea. Richard Baraniuk has two:
If you know of another one, please drop me a line.

[ Update October 2007: there is a flurry of them here in this entry: Compressed Sensing Videos: Where Is My Popcorn ? ]
[Update August 2008: I have gathered most videos on the subject of Compressive Sensing on this
CS Videos site ]

Saturday, March 17, 2007

l1_ls: Simple Matlab Solver for l1-regularized Least Squares Problems (compressed sensing)

Make that four codes available to perform reconstruction in the compressed sensing setting. Kwangmoo Koh, Seung-Jean Kim, and Stephen Boyd just made available l1_ls. According to the authors:
l1_ls: Simple Matlab Solver for l1-regularized Least Squares Problems


l1_ls is a Matlab implementation of the interior-point method for l1-regularized least squares described in the paper, A Method for Large-Scale l1-Regularized Least Squares Problems with Applications in Signal Processing and Statistics. l1_ls solves an optimization problem of the form

\[ \begin{array}{ll}\mbox{minimize} & \|Ax-y\|_2^2+\lambda\|x\|_1 \end{array} \] ,

where the variable is $x\in\mathbf{R}^{n}$ and the problem data are $A\in\mathbf{R}^{m\times n}$, $y\in\mathbf{R}^{m}$ and $\lambda\in\mathbf{R}_{+}^{n}$.

The solver l1_ls is developed for large problems. It can solve large sparse problems with a million variables with high accuracy in a few tens of minutes on a PC. It can also efficiently solve very large dense problems, that arise in sparse signal recovery with orthogonal transforms, by exploiting fast algorithms for these transforms.

Thursday, March 15, 2007

A beautiful mind can save your life

Terry Tao, the Fields medalist and the prodigy who developed compressed sensing with Emmanuel Candes and David Donoho, has an autistic brother:
The Taos had different challenges in raising their other two sons, although all three excelled in math. Trevor, two years younger than Terry, is autistic with top-level chess skills and the musical savant gift to play back on the piano a musical piece — even one played by an entire orchestra — after hearing it just once. He completed a Ph.D. in mathematics and now works for the Defense Science and Technology Organization in Australia.
The youngest, Nigel, told his father that he was “not another Terry,” and his parents let him learn at a less accelerated pace. Nigel, with degrees in economics, math and computer science, now works as a computer engineer for Google Australia.

This is interesting in many ways. A particular way to look at it, is that Terry has provided a technique (Compressed Sensing) that has the potential to revolutionize the way people model the primary cortex: i.e how the brain decomposes, processes and understands information. A trait of autism is the inability to treat correctly information and subsequently act on it. This type of story where somebody discovers something that eventually he is affected by afterwards, reminds me of a story Glenn Seaborg told us when he came to campus six months before he passed away. During his talk, he reminded us that some colleagues of his came to his lab once and asked if he could find a radioactive element that had a specific half life that was not too long and not too short. Seaborg obliged and discovered Iodine-131. According to this entry on wikipedia:
Livingood and Seaborg collaborated to create an important isotope of iodine, iodine-131 (I-131) which is still used to treat thyroid disease. (Many years later, it was credited with prolonging the life of Seaborg's mother.)

All you wanted to know about Compressed Sensing

Because too often the subject of compressed sensing is confused with signal compression performed in transform coding, I created a list of site from which Google can search.










Add to Google

Wednesday, March 14, 2007

Implementing Compressed Sensing in Applied Projects


We are contemplating using Compressed Sensing in three different projects:





  • The Hyper-GeoCam project: This is a payload that will be flown on the HASP platform in September. Last year, we flew a simple camera that eventually produced a 105 km panorama of New Mexico. We reapplied for the same program and have been given the OK for two payloads. The same GeoCam will be re-flown so that we can produce a breath taking panorama from 36 km altitude. The second payload is essentially supposed to be a hyperspectral imager on the cheap: i.e. a camera and some diffraction gratings allowing a fine decomposition of the reflected sun light from the ground. The project is called Hyper-GeoCam and I expect to implement a random lens imager such as the one produced at MIT. Tests will be performed on the SOLAR platform.
  • The DARPA Urban Challenge: We have a car selected in the track B: We do not have Lidars and need to find ways to navigate in an urban settings with little GPS availability. The autonomous car is supposed to be navigating in a mock town and follow the rules of the California traffic laws, that includes passing other cars.
  • Solving the Linear Boltzmann equation using compressed sensing techniques:The idea is that this equation has a known suite of eigenfunctions (called Case eigenfunctions) and because they are very difficult to use and expand from, it might be worth a try to look into the compressed sensing approach to see if it solves the problem more efficiently.

Tuesday, March 13, 2007

Compressive Sensing as a way to solve Integral Equations ?


While reading Compressive Radar Imaging by Richard Baraniuk and Philippe Steeghs, I realized that something very different is happening with Compressed Sensing. I mentioned earlier that in the past, the solving of integral equations implies the use of weak formulations in order to obtain a moment based description of that integral equation. Hence, the general consensus with using an L_2 norm approach (or a least square reduction of the approximation error) is to use trial functions that are the same of the basis functions (in neutron transport, the situation is in fact a little different but nobody knows really why. ) This fact leads most researchers into projecting kernels onto sets of functions for which they become sparse like the wavelets. In other words, there is an expectation that the projected kernel will be very sparse and that matrix-vector operations will be reduced so that the overall method becomes very competitive with moment based method using non-sparse kernels. However, when the kernel reflects a geometry, it is difficult to know in advance which of the component of that kernel will have to be retained and one is therefore led to the nonlinear operation of computing them all first and then getting rid of the ones that are too small. This is a little bit what compressed sensing is avoiding when sampling a signal.
As it turns out, the integral equation that is being solved by a CS-Radar is different, the trial functions are specifically tailored to be incoherent with the basis functions. The big question would be to figure out how to do several iterations instead of just one. In neutron transport, we call some of these methods "source iteration". If we are to push the parallel further, on the one hand the Green's function is generally a description of the medium in which the neutron lives whereas the source is really the neutron distribution, hence there is no reason to believe that both types of functions should come from the same family and bear the same decomposition.

The neutron example goes further. Emmanuel Candes made a presentation last year on compressed sensing and showcased an example using neutron radiography at NIST on a fuel cell. It turns out, this is an experiment I did a long time ago at the Nuclear Science Center. The main issue being that in order to see exactly what is going on inside the fuell cell (which has a lot of graphite and copper), neutrons are the only way to see through and detect water or water vapor at the same time.
The reason for this type of investigation is simple: If water gets stuck at one place in the groves of the fuel cell, it will stop the fuel cell from working. It is therefore important to know exactly the hydrodynamics of the water in the cell while the cell is working. What is interesting is that the neutron scatters when in presence of water, so that the color on the graph of his presentation represents roughly the density of water in the cell. The neutron beams are used as in CT tomography with the fuel cell being hit from different angles. The compressed sensing formalism allows for fewer number of angles. What is interesting though is that many neutrons are deflected or diffused and therefore the problem at hand cannot be rigorously be seen as a CT scan. It is good enough for observation, but it would not be appropriate for neutron flux measurements for instance. The scattering issue should be solved using an integral equation called the linear Boltzmann equation!

Printfriendly