Friday, October 10, 2008

CS: Unanswered questions in CS, Accelerated Dense Random Projections, Apprenticeship Learning Using LP, "Real" Slepian-Wolf Codes

A summary of a talk at the SLIM lab on compressive sensing can be found in a blog entry entitled: Unanswered questions in compressed sensing

Edo Liberty has submitted his Ph.D. thesis entitled: Accelerated Dense Random Projections. The slides of the talk are here. When we met at Texas A&M, I asked Justin Romberg about the relationship between the random projections used in CS and those appearing from the Johnson-Lindenstrauss Lemma. He proceeded to give an explanation similar to the one he wrote in one of his paper entitled: Compressive sensing by random convolution where in page 7 of that paper, we can read:
In [1], Ailon and Chazelle propose the idea of a randomized Fourier transform followed by a random projection as a "fast Johnson-Lindenstrauss transform" (FJLT). The transform is decomposed as QF\Sigma, where Q is a sparse matrix with non-zero entries whose locations and values are chosen at random locations. They show that this matrix QF\Sigma behaves like a random waveform matrix in that with extremely high probability, it will not change the norm of an arbitrary vector too much. However, this construction requires that the number of non-zero entries in each row of Q is commensurate with the number of rows m of Q. Although ideal for dimensionality reduction of small point sets, this type of subsampling does not translate well to compressive sampling, as it would require us to randomly combine on the order of m samples of Hx_0 from arbitrary locations to form a single measurement -- taking m measurements would require on the order of m^2 samples. We show here that more structured projections, consisting either of one randomly chosen sample per row or a random combination of consecutive samples, are adequate for CS. This is in spite of the fact that our construction results in much weaker concentration than the FJLT.
That cleared it up but I think I needed to see it in writing :).

There is also a larger version of Compressive Structured Light for Recovering Inhomogeneous Participating Media, by Jinwei Gu, Shree Nayar, Eitan Grinspun, Peter Belhumeur, and Ravi Ramamoorthi

As some of you know, I am interested in policy learning, it is happens that Umar Syed, Robert Schapire, and Michael Bowling have just released Apprenticeship Learning Using Linear Programming. (web page). The abstract reads:
In apprenticeship learning, the goal is to learn a policy in a Markov decision process that is at least as good as a policy demonstrated by an expert. The difficulty arises in that the MDP's true reward function is assumed to be unknown. We show how to frame apprenticeship learning as a linear programming problem, and show that using an off-the-shelf LP solver to solve this problem results in a substantial improvement in running time over existing methods -- up to two orders of magnitude faster in our experiments. Additionally, our approach produces stationary policies, while all existing methods for apprenticeship learning output policies that are "mixed", i.e. randomized combinations of stationary policies. The technique used is general enough to convert any mixed policy to a stationary policy.
The video of the ICML presentation on the same subject is here.

Found on arxiv, two papers:

"Real" Slepian-Wolf Codes by Bikash Kumar Dey, Sidharth Jaggi, Michael Langberg. The abstract reads:
We provide a novel achievability proof of the Slepian-Wolf theorem for i.i.d. sources over finite alphabets. We demonstrate that random codes that are linear over the real field achieve the classical Slepian-Wolf rate-region. For finite alphabets we show that typicality decoding is equivalent to solving an integer program. Minimum entropy decoding is also shown to achieve exponentially small probability of error. The techniques used may be of independent interest for code design for a wide class of information theory problems, and for the field of compressed sensing.

Large Scale Variational Inference and Experimental Design for Sparse Generalized Linear Models by Matthias W. Seeger and Hannes Nickisch. The abstract reads
We provide novel approximate Bayesian inference algorithms for sparse generalized linear models, that can be used with hundred thousands of variables, and run orders of magnitude faster than previous algorithms in domains where either apply. By analyzing our methods and establishing some novel convexity results, we settle a long-standing open question about variational Bayesian inference for continuous variable models: the Gaussian lower bound relaxation, which has been used previously for a range of models, is proved to be a convex optimization problem, if and only if the posterior mode is found by convex programming. Our algorithms reduce to the same computational primitives than commonly used sparse estimation methods do, but require Gaussian marginal variance estimation as well.
We are interested in Bayesian experimental design here (which is mainly driven by efficient approximate inference), a powerful framework for optimizing measurement architectures of complex signals, such as natural images. Designs optimized by our Bayesian framework strongly outperform choices advocated by compressed sensing theory, and with our novel algorithms, we can scale it up to full-size images. Immediate applications of our method lie in digital photography and medical imaging.

Photo Credit: EUMETSAT. Meteosat-8 Rapid Scan captures asteroid impact, On 7 October, asteroid 2008 TC3 hit Earth and exploded in the atmosphere over northern Sudan. Amazingly, the Meteosat-8 Rapid Scanning Service managed to capture the impact.


Anonymous said...

Regarding to your question of the relationship between the random projections used in CS and those appearing from the Johnson-Lindenstrauss Lemma, the paper in the following link might be a partial answer for that question:

Thong Do, JHU

Igor said...


Thanks for the pointer. I'll feature it next week.



Anonymous said...

I have a simple question about compressive sampling, and I don't know where to ask it. Perhaps someone in the audience here can enlighten me.

Suppose I have an image, and I want to sample a certain fraction of the pixels, and then use these sampled pixels to recover an estimate of the original image.

The simplest thing to do would be to sample uniformly, then just use bilinear interpolation to recover an estimate of the image.

Another option is to sample randomly, then use compressive sampling (say, the SPGL1 algorithm, and using the DCT for compression) in order to recover an estimate of the original image.

Question: Which method will give you better results? Have you tried it? Because I have tried it and interpolation is winning. Does this mean I'm doing something wrong?

Another question: What fraction of the pixels would you typically need to sample in order to get a recovery (using compressive sampling) that looks pretty good.

Sorry if this is not a good place for such a question. And thanks to anyone who can give me a quick answer!

Igor said...

Dear Anonymous,

In Compressive sampling, the sampling is not performed at random, rather all the pixels are being used. What is called a Compressed Sensing measurement is a linear combination of all these pixels weighted with pseudo-random numbers. The number of linear combinations is much less than the full amount of pixels. The reconstruction techniques used in compressive sensing take these CS measurement and reproduce the initial signal exactly. There is no linear interpolation.

If you have access to matlab you might want to look at the code in the following entry (it features images):

or the examples in the sparco package:

Let me know if it helped.