Thursday, February 18, 2010

CS: Crossing the Donoho-Tanner Border

I was having a discussion the other day about how one could apply compressive sensing to a specific problem featuring the sparse linear black box discovery. The sparse linear black box discovery problem is when one wants to find all the coefficients of a matrix that is known to be sparse (but not rank deficient nor do we know the sparsity pattern) . One also hope that this can be done with as few measurements as possible. We all know that a trivial solution is to evaluate this matrix through a projection of the canonical basis in N steps, where N is the number of columns of the matrix. But in the case when the matrix is known to be sparse, the number of measurements can be simply decreased in k*log(N) with high probability. See the MMA17 post and the attendant algorithm on how to do this.

Anyway, everything seems fine but sometimes you hit a rock: The matrix being sought is not highly sparse or compressible and one wonders what to do next. The obvious solution would be to go back to performing the N measurements as this is a safe bet and will work all the time. One could also, as I jokingly suggested, increase the size of the problem so that something that was barely sparse in one instance becomes highly sparse in the new setting. However this suggestion is only half of the story as can be seen in the very important paper of Jared Tanner and David Donoho entitled Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing. In the paper, the authors show their discovery of a phase diagram that seems universal in that it fits many somewhat unrelated high dimensional problems. Perhaps the most important figure of interest to us is Figure 3 of that paper. It is included below. I just added five numbers to it in order to provide some context. Let us say that we have a not so sparse matrix to fill in. In the figure, this starting point is labeled 1 with coordinates delta at .52 and rho at .84. In other words, the sparsity of the matrix is a high 84% while the undersampling is 52% (that means that we would sample the linear black box with only 0.52N measurements). It's in the red, meaning that the use of l_1 solvers won't allow us to build a matrix with .52N measurements.

What happens when you increase the size of your problem i.e. increase N as I suggested? You go from 1 to 2...not a good path, you really are aiming to go blue so that an l_1 solver can really find that matrix using the algorithm featured on Monday last week. The only way you can go to 3 is by increasing both the size of your problem N and the amount of measurements n while hoping that the physics of the problem k will remain the same. That way n/N stays constant, while k/n decreases. You could also go to 4 with a path between 2 and 3, there, the increase in the number of measurements is not as linear as the number of measurements for 3. The element to worry about is whether the new number of measurements is as large as the older problem size N. Finally, let us note that the area highlighted by 5 is really the canonical construction (n=N), i.e. reconstruction works all the time and one doesn't care if the matrix is sparse.

Credit: NASA, The Cupola module was just installed yesterday on the International Space Station.

No comments: