Saturday, October 13, 2012

Sunday Morning Insight: The Linear Boltzmann Equation and Co-Sparsity

This past week, we saw several examples of compressive sensing that used dictionary learning as a means of providing better reconstructions [1,2,3]. This is interesting because it gets close to the realm of real applications. One consequence of these studies is the nagging feeling that somehow, the dictionary may not be large enough for these findings to extrapolate to other scenes (here, other patients) and so forth.

One of the issues that ought to be relevant to this discussion on the extrapolation of these results is synthesis versus analysis based reconstruction.

As you probably recall, the synthesis and the analysis based reconstructions yield the same results when the dictionary is a square, invertible matrix, but they provide different insights in the case of overcomplete dictionaries.
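
For reference, the two formulations can be written side by side. The symbols D (the dictionary), \alpha (the synthesis coefficients) and the choice of the \ell_1 norm are notational assumptions made for this sketch; A, x, b and \Omega follow the notation used further down in this post.

```latex
\begin{align*}
\text{Synthesis:}\quad & \hat{x} = D\hat{\alpha}, \qquad
  \hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad A D \alpha = b \\
\text{Analysis:}\quad & \hat{x} = \arg\min_{x} \|\Omega x\|_1 \quad \text{s.t.} \quad A x = b
\end{align*}
```

When D is square and invertible, taking \Omega = D^{-1} and \alpha = \Omega x maps one problem onto the other. When D is overcomplete, no such one-to-one change of variables exists, so the two problems generally have different solutions.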

A Detour: The Linear Boltzmann Equation 

Some people don't seem to clearly picture what the analysis model/reconstruction means. From my limited understanding, let me try to shine some light in that dark corner. But first, let me recount a story. Back in the day, I used to be interested in the linear transport equation, also known as the Linear Boltzmann Equation, as it is the law of the land for neutrons. Neutron populations are used in nuclear reactor cores to enable the criticality of said reactors. In order to continuously produce thermal energy through the splitting of uranium and plutonium atoms, you have to know the neutron population pretty well.

It turns out that the people using this equation are freaks: They want schemes that can bring 10E-10 accuracy because fluxes are of the order of 10E11 to 10E16 particles per second. A few of these lost neutrons (the product of the flux with the accuracy) are not good for your health. In short, since the 1940s a dedicated slew of people have been trying different approaches to solve that equation with a precision that is unheard of in other areas of physics and engineering. All this in the name of shielding us from harmful radiation. This is all good, but why are we having this discussion? Well, it turns out that the freaks have looked at the Linear Boltzmann Equation from **many** angles. One of these angles is to decompose the flux into eigenfunctions of the Linear Transport Operator (they are in fact distributions). There are some completeness results for 1-D, but 3-D is still largely unexplored and not really well understood. Anyway, what was noticed empirically is that fluxes always end up following a diffusion equation away from the boundaries (a few mean free paths away). Close to the boundaries, the full transport solution needs those eigendistribution expansions. While you may not care about neutrons, this same equation also applies to electrons and photons, modulo differing scattering kernels.

The crucial insight out of this is the following: Photon, electron and neutron distributions follow a diffusive behavior in constant media, except close to the boundaries where they follow a more complex transport behavior. If your medium is composed of, say, 15 layers of different media, then the field of photons/electrons/neutrons in that medium will follow a different diffusion equation inside each layer and will follow a more complex transport solution at and very near the boundaries of these layers.

Back to the analysis approach

When you are solving an analysis based reconstruction with the TV norm, you are in effect solving the following problem: 

What is the field function that has no gradients except for a few spikes, and that is also consistent with the following measurements of that field at different locations: Ax = b?
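
In symbols, and assuming a 1-D field sampled at n points, this question could be written as follows. The index i and the size n are notation introduced here for illustration only.

```latex
\min_{x \in \mathbb{R}^n} \; \sum_{i=1}^{n-1} \left| x_{i+1} - x_i \right|
\quad \text{s.t.} \quad A x = b
```

The sum is small when x is flat almost everywhere and jumps at only a few locations, which is the "no gradients except for a few spikes" requirement.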

This is all good and fun when the flux does not change over distance in non diffusive media (like photons in air for photography). However, this is just an approximation. Light fields as well as neutron fluxes do change over distance. As mentioned earlier, they follow a diffusion equation most of the time, except at the boundaries.

When you learn the analysis operator, instead of postulating the TV norm, you are really asking the following question: 

Given all these data, what operator is consistent with all these measurements?

If this is a diffusion operator that regulates the field, then the analysis approach provides you with the coefficients of that operator: In short, it provides you with the diffusion coefficients, effectively solving an inverse diffusion problem. 
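
As a toy illustration of this inverse diffusion reading, here is a minimal sketch. It is not an analysis operator learning algorithm such as Analysis K-SVD: it assumes the structure of the operator is already known (1-D, steady state, source-free, homogeneous diffusion with absorption, so that phi'' = phi / L^2) and only fits the single coefficient 1 / L^2 by least squares from sampled fields. The function name, the grid and the value of the diffusion length L are all hypothetical.

```python
import numpy as np

def estimate_inverse_length_sq(fields, h):
    """Least-squares fit of k in phi'' = k * phi, given fields sampled with step h."""
    laplacians = []
    values = []
    for phi in fields:
        # Centered second difference at the interior grid points.
        laplacians.append((phi[:-2] - 2.0 * phi[1:-1] + phi[2:]) / h**2)
        values.append(phi[1:-1])
    lap = np.concatenate(laplacians)
    val = np.concatenate(values)
    # Closed-form solution of min_k || lap - k * val ||_2.
    return float(lap @ val / (val @ val))

# Hypothetical training fields: all obey the same diffusion equation.
L_true = 2.0
x = np.linspace(0.0, 10.0, 201)
h = x[1] - x[0]
decaying = np.exp(-x / L_true)
growing = np.exp((x - 10.0) / L_true)
fields = [decaying, growing, decaying + 0.3 * growing]

k_est = estimate_inverse_length_sq(fields, h)
L_est = 1.0 / np.sqrt(k_est)
print(f"estimated 1/L^2 = {k_est:.5f}, estimated L = {L_est:.4f}")
```

The recovered coefficient is the "learned" part of the operator. A genuine analysis operator learning scheme would have to discover the stencil itself as well, not just one number.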

When you use that newfound dictionary and solve a compressive sensing problem, you are in effect asking a new question:

Given this newfound diffusion operator, what is the field that is consistent with it and is also consistent with the measurements Ax = b (probably performed at some, not all, boundaries)?

The result of the reconstruction using this new learned dictionary will provide a field that is mostly a diffusion solution with some spiky results that fulfill the attendant boundary condition that connects these fluxes across diverse interfaces. The "spikes" will in effect correspond to the non diffusive part of the Linear Boltzmann Equation near the boundary. 

Using the notation of Miki Elad's presentation entitled K-SVD Dictionary-Learning for Analysis Sparse Models, the \Omega operator is a diffusion operator with either a learned diffusion coefficient or a synthetic one. x is the field that is sought after, while the nonzeros of z represent the locations where the field does not adhere to the diffusion equation, i.e. where it fulfills the full transport equation. I wonder if there is much low-hanging fruit here. Additional co-sparsity results can be found here
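
To make the roles of \Omega, x and z concrete, here is a small sketch under simplifying assumptions: \Omega is taken to be the 1-D second-difference operator (a source-free, absorption-free diffusion operator with unit coefficient), and the field x is piecewise linear across three hypothetical layers. The names and numbers are illustrative and are not taken from the presentation.

```python
import numpy as np

def second_difference(n):
    """Assumed Omega: rows [1, -2, 1], one row per interior grid point."""
    omega = np.zeros((n - 2, n))
    rows = np.arange(n - 2)
    omega[rows, rows] = 1.0
    omega[rows, rows + 1] = -2.0
    omega[rows, rows + 2] = 1.0
    return omega

n = 60
# Hypothetical layered medium: the slope of the field changes at nodes 20 and 40.
increments = np.empty(n - 1)
increments[:20] = 1.0
increments[20:40] = -0.5
increments[40:] = 2.0
x = np.concatenate(([0.0], np.cumsum(increments)))

omega = second_difference(n)
z = omega @ x  # analysis representation z = Omega x

# Row j of omega is centered on grid node j + 1.
interface_nodes = np.flatnonzero(np.abs(z) > 1e-9) + 1
cosparsity = int(np.sum(np.abs(z) <= 1e-9))

print("nonzeros of z sit at nodes:", interface_nodes)
print("co-sparsity (number of zeros in z):", cosparsity)
```

Inside each layer the field satisfies the assumed diffusion equation and z vanishes; the only nonzeros of z sit at the two interfaces, which is where the simple operator stops describing the field.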

Tomorrow, there will be an interesting meeting in Rennes where some of these issues on co-sparsity will be covered. Some related thoughts can be found in A Comment on Learning Analysis Operators.
