Monday, November 12, 2007

Monday Morning Algorithm Part deux: Reweighted Lp for Non-convex Compressed Sensing Reconstruction

Today's algorithm implements the algorithm found in Iteratively Reweighted Algorithms for Compressive Sensing by Rick Chartrand and Wotao Yin as mentioned in an earlier post. The cons for the algorithm is that it solves a non-convex problem and therefore there is no provable way of showing you achieved an optimal solution. The pros are:
  • you don't need semi-definite programming
  • you do better than the reweighted L1 technique and
  • you do it faster
  • all operations rely on simple linear algebra which can then be optimized for the problem at hand.
The algorithm should be usable on Matlab and Octave. If you want to contribute a small algorithm next week or any week, please let me know. So here is the beast:

clear
% Algorithm implementing an approach suggested in
% "Iteratively Reweighted Algorithms for Compressive Sensing"
% by Rick Chartrand and Wotao Yin
%
% n is dimension of the ambient space (number of signal samples)
% m is the number of Compressed Sensing measurements
% p is the defining the order of the metric p=1 is l1 norm....
n= 100;
m = 40;
p = 0.9;
% Setting up the problem where one produces
% an input for the problem to be solved
phi = rand(m,n);
u_true = [1 10 120 90 950 4 100 1000 20 30 zeros(1,n-10)]';
u_noise = 10^(-4)* rand(n,1);
u_true = u_true + u_noise;
b = phi * u_true;
%
% solving for phi * u = b , where u is now the
%unknown to be found by the IRlp
%
epsilon = 1
% u_0 is the L_2 solution which would be exact if m = n,
% but here m is less than n
u_0 = phi\b;
u_old = u_0;
j=0
while epsilon > 10^(-8)
j = j + 1
w=(u_old.^(2) + epsilon).^(p/2-1);
v=1./w;
Q_n=diag(v,0);
tu=inv(phi*Q_n*phi');
u_new = Q_n * phi' * tu * b;
if lt(norm(u_new - u_old,2),epsilon^(1/2)/100)
epsilon = epsilon /10;
end
u_old = u_new;
end
% One plot showing the superposition of the least-square
% solution u_0 (symbol v), the
% reweighted Lp solution (symbol *) and the
% true solution (symbol o)
plot(u_0,'v')
title('Least-squares u_0 (v), reweighted Lp (*)...
, initial signal symbol (o)')
hold
plot(u_new,'*')
plot(u_true,'o')


As usual if you find a mistake or something interesting after playing with it, please let me know. Also, I you have a second and half, please answer these two questions, thanks in advance.








Credit Photo: NASA/JPL/Space Science Institute, Ring Tableau, Saturn's Moon: Mimas, Epimetheus, Daphnis

13 comments:

Mehrdad said...

A problem with the re-weighting l_2 is that it does not turn coefficients to zero "exactly" (with limited number of iterations). If you zoom on the coefficients that should be zero, it is obvious. But in the re-weighting l_1 we would not face this problem (at least in the noise-less case).
Another problem which I faced with you implementation using small p (for example .5) is that it does not converge in reasonable number of iteration.

Igor said...

Mehrdad,

So it does work on somebody else computer. Good.

I agree with both of your comments. For the small p, I have noticed that as well and I believe this is the reason why people go for p=0.9 first and then go down. If it doesn't converge, then one should think of using a lower p. The lower p brings you the ability to recover a signal that is not so sparse and it may be worth it to wait a little longer.

One should note that if it doesn't converge with the L1 or reweighted L1, one is left with little recourse. In the reweighted Lp case, at least p can be varied.

Cheers,

Igor.

Mehrdad said...

I agree with you that l_p based algorithms could be better decoding methods in CS. But I just wanted to point out a difference between l_2 and l_1 re-weighting algorithms.

Cheers
-M

Igor said...

Mehrdad,

Thank you for pointing it out.


Igor.

Jort said...

Well, it works for me too ;)

And actually, trying different sparse solvers on my (noisy) this routine gives far sparser (and better) solutions. I also have success using p=0. I use this algorithm now in combination with the residual calculation described in the face recognition paper mentioned earlier on this blog. Anyone else working on that?

Igor said...

Jort,

That makes three people, woohoo :-)

Since you have compared it with other reconstruction methods, what about speed: have you a seen a significant difference ? and how large is your sample dimension (order of magnitude) ?

Cheers,

Igor.

Jort said...

In terms of slowliness :)

1) reweighted Lp
2) bb_mono_cont (GPSR or so)
3) cvx
3) spgl1
4) l1magic (qc)
5) sparsify

This is not an exact list obviously since my goal was not speed testing. Furthermore, for smaller problem sizes the order changes (reweighted doesnt perform as bad anymore for example).

I did problem sizes up to 320x30000 . I had to make the w coefficient in your algorithm sparse so diag Q_n became sparse for the big problems. Other than that no out-of-memory problems.

Igor said...

Jort,

Thanks for the information. I like the slow part, that is an initial component of a disruptive technology ( see :
http://nuit-blanche.blogspot.com/2007/08/compressed-sensing-why-does-rice-play.html ). Since most of the time is spent on the inversion, I am betting that can be improved tremendously since it relies on general LAPACK improvements. Since most people design their hardware to make lapack faster, it seems to be a good bet for the future especially if it can deliver sparser solutions.

BTW, it is Rick Chartrand and Wotao Yin's algorithm, I have just made a naive implementation of it.

I am surprised at the p=0 result though :-)

Igor.

Jort said...

I never considered it that way..but how true, fact is no-one in speech recognition (my field) seems to working on this (or CS in general).

I second your analysis of the speed problems. I was considering rewriting the algorithm to a mex file but that will have to wait untill I have shown the usefullness of CS in classification...

I am aware of their paper on the algorithm: you seem to have followed their work to the letter. Their work also shows good results for p=0..why the surprise? Do I misintrepret something?

Igor said...

The surprise is really a more generic statement. The reason the basis pursuit L1 result were so novel was because L0 was considered to be combinatorial in general. So now, we use a least square (how uncool :-)) - lp technique with p=0 and it doesn't require the "combinatorial" amount of work advertized in the first place. So this is the surprise. The method seems to work for p=0 and for a larger class of signals than initially thought.

Cheers,

Igor.

Jort said...

True.

On a different note: Did you try the face recognition algorithm described in "Feature selection in face recognition: A sparse representation perspective."?

I am wondering about my implementation of their work. If you'd like, could I email you about this. Maybe you want to post my code as a monday-morning code if it checks out.

Igor said...

Jort,

No, I have not had time to try the face recognition algorithm.

Yes, I would be interested in checking it out and would be glad posting it on the next monday morning code.

Igor.

Anonymous said...

+1 for works!
Thanks

Printfriendly