Tuesday, November 10, 2009

CS: Xampling, Compressive Wide-Band Spectrum Sensing, An unconstrained lq minimization, Random matrices are democratic

From the folks who brought us the Modulated Wideband Converter hardware, here is the design methodology underneath it called Xampling which in effect deals with Compressive Sensing for Analog signals: Xampling -- Part I: Practice by Moshe Mishali, Yonina Eldar and Asaf Elron . The abstract reads:

We introduce Xampling, a design methodology for sub-Nyquist sampling of continuous-time analog signals. The main principles underlying this framework are the ability to capture a broad signal model, low sampling rate, efficient analog and digital implementation and lowrate baseband processing. The main hypothesis of Xampling is that in order to break through the Nyquist barrier, one has to combine classic methods and results from sampling theory together with recent developments from the literature of compressed sensing. In this paper, we present the Xampling framework and examine several sub-Nyquist approaches in light of the four Xampling principles. It is shown that previous methods suffer from analog implementation issues, large computational loads in the digital domain, and have no baseband processing capabilities. An exception is the recently proposed modulated wideband converter (MWC) which satisfies the model, rate and implementation criteria, though lacking the baseband processing capability. Here, we extend the MWC by proposing a digital algorithm which extracts each band of the signal from the compressed measurements, thus enabling lowrate (baseband) processing. The converter with the proposed algorithm conforms with the Xampling desiderata. In addition, we describe two configurations of the converter for efficient spectrum sensing in wideband cognitive radio receivers. In the second part of this work we study theoretical aspects of rate and stability of sub-Nyquist systems, following the pragmatic theme of the Xampling methodology.

At the end of the paper, the author uses the example of the cognitive Radio which eventually led me to the following two papers and thesis: Distributed Compressive Wide-Band Spectrum Sensing by Ying Wang, Ashish Pandharipande, Yvan Lamelas Polo and Geert Leusy. The abstract reads:

—We consider a compressive wide-band spectrum sensing scheme for cognitive radio networks. Each cognitive radio (CR) sensing receiver transforms the received analog signal from the licensed system in to a digital signal using an analog-to-information converter. The autocorrelation of the compressed signal is then collected from each CR at a fusion center. A compressive sampling recovery algorithm that exploits joint sparsity is then employed to reconstruct an estimate of the signal spectrum and used to make a decision on signal occupancy. We compare the performance of this distributed compressive spectrum sensing scheme with a compressive spectrum sensing scheme at a single CR and show the performance gains obtained from spatial diversity.

and Compressive Wide-Band Spectrum Sensing by Ying Wang, Ashish Pandharipande, Yvan Lamelas Polo and Geert Leusy. The abstract reads:

We present a compressive wide-band spectrum sensing scheme for cognitive radios. The received analog signal at the cognitive radio sensing receiver is transformed in to a digital signal using an analog-to-information converter. The autocorrelation of this compressed signal is then used to reconstruct an estimate of the signal spectrum. We evaluate the performance of this scheme in terms of the mean squared error of the power spectrum density estimate and the probability of detecting signal occupancy.

And finally, to Yvan Lamelas Polo's thesis entitled: Compressive Wideband Spectrum Sensing for Cognitive Radio Applications with the following abstract:

It has been widely recognized that utilization of radio spectrum by licensed wireless systems, e.g., TV broadcasting, aeronautical telemetry, is quite low. In particular, at any given time and spatial region, there are frequency bands where there is no signal occupancy. There has been recent interest in improving spectrum utilization by permitting secondary usage using cognitive radios. Cognitive radios use spectrum sensing to determine frequency bands that are vacant of licensed signal transmissions and transmit on such portions to meet regulatory constraints of avoiding harmful interference to licensed systems. Future cognitive radios will be capable of scanning a wide band of frequencies, in the order of a few GHz, and employ adaptive waveforms for transmission depending on the estimated spectrum of licensed systems. In this thesis, we address the problem of estimating the spectrum of the wide-band signal received at the cognitive radio sensing receiver using compressive sampling coupled with a multiband spectrum detector to determine the spectrum occupancy of the licensed system. Since individual cognitive radios might not be able to reliably detect weak primary signals due to channel fading/shadowing, we also propose a distributed compressive scheme based on joint recovery of the license occupancy for application scenarios involving geographically distributed radios. In such a distributed approach, the spectrum occupancy is determined by the joint work of cognitive radios (exploiting spatial diversity), as opposed to being determined individually by each cognitive radio.


In a different direction, here is another intriguing paper on a reweighted algorithm dealing with l_q problems: An unconstrained lq minimization for sparse solution of under determined linear systems by Ming-Jun Lai and Jingyue Wang. The abstract reads:

We study an unconstrained version of the ℓq minimization for the sparse solution of under-determined linear systems for 0 \lt q \le 1. Although the minimization is nonconvex, we introduce a regularization and develop an iterative algorithm. We show that the iterative solutions converge to the sparse solution. Numerical experiments will be demonstrated to show that our approach works very well.
I note the following from their paper:
According to our theory, we can completely determine a sparse solution without any conditions, e.g., the restricted isometry property (RIP) on matrix A. This is an another advantage of the ℓq minimization for 0 \lt q \lt 1 over the classic ℓ1 minimization approach.



Their algorithm looks like a reweighted scheme and as one can read from the paper, it takes a long time to converge, much like the IRLS scheme. Finally, let us note the success of this algorithm with uniform random matrices.


And finally from the Rice Compressive Sensing Repository we have:

The recently introduced theory of compressive sensing (CS) enables the reconstruction of sparse or compressible signals from a small set of nonadaptive, linear measurements. If properly chosen, the number of measurements can be significantly smaller than the ambient dimension of the signal and yet preserve the significant signal information. Interestingly, it can be shown that random measurement schemes provide a near-optimal encoding in terms of the required number of measurements. In this report, we explore another relatively unexplored, though often alluded to, advantage of using random matrices to acquire CS measurements. Specifically, we show that random matrices are democractic, meaning that each measurement carries roughly the same amount of signal information. We demonstrate that by slightly increasing the number of measurements, the system is robust to the loss of a small number of arbitrary measurements. In addition, we draw connections to oversampling and demonstrate stability from the loss of significantly more measurements.


Monday, November 09, 2009

CS: Compressive Confocal Microscope, Compressive Matched Subspace Detection, Music Classification, Matrix Completion, Metric Learning


Before we get to the meat of today's entry, let me invite y'all to use the customized search engine on the right hand side of this blog to search for entries relevant to your area of interest. The search feature on the top left corner of this blog is buggy at best and does not do a good job of finding relevant entries when you are using only one word. Also, of related interest, I am seeking a Google Wave invite as I am interested to see if one could develop some compressive sensing hardware with some folks of the open hardware community. Irrespective to your ability to send me an Google Wave invite, you are welcome to join that experiment. [Hint to those of you working at Google, I just made a request for an invite]



It seems that I overlooked the work being done by the group of Gonzalo Arce at University of Delaware. I hope to remedy this today. From his group, we have a new compressive sensing hardware, a music classifier, and other relevant studies in between. The hardware first as exposed in : Compressive Confocal Microscopy by Peng Ye, Jose Paredes, Gonzalo Arce, Yichuan Hu , C. Chen, and Dennis Prather.The abstract reads:

In this paper, a new framework for confocal microscopy based on the novel theory of compressive sensing is proposed. Unlike wide field microscopy or conventional parallel beam confocal imaging systems that use charge-coupled devices (CCD) as acquisition devices in addition to complex mechanical scanning system, the proposed compressive confocal microscopy is a kind of parallel beam confocal imaging system which exploits the rich theory of compressive sensing by using a single pixel detector and a digital micromirror device (DMD) to capture linear projections of the in-focus image. With the proposed system, confocal imaging of high optical sectioning ability can be achieved at sub-Nyquist sampling rates. Theoretical analysis, simulations and experimental results are shown to demonstrate the characteristics and potential of the proposed approach.



and the attendant reconstruction paper: Compressive Confocal Microscopy: 3D reconstruction algorithms by Peng Ye, Jose Paredes, Yichuan Hu , C. Chen, Gonzalo Arce and Dennis Prather. The abstract reads:
In this paper, a new approach for Confocal Microscopy (CM) based on the framework of compressive sensing is developed. In the proposed approach, a point illumination and a random set of pinholes are used to eliminate out-of-focus information at the detector. Furthermore, a Digital Micromirror Device (DMD) is used to efficiently scan the 2D or 3D specimen but, unlike the conventional CM that uses CCD detectors, the measured data in the proposed compressive confocal microscopy (CCM) emerge from random sets of pinhole illuminated pixels in the specimen that are linearly combined (projected) and measured by a single photon detector. Compared to conventional CM or programmable array microscopy (PAM), the number of measurements needed for nearly perfect reconstruction in CCM is significantly reduced. Our experimental results are based on a testbed that uses a Texas Instruments DMD (an array of 1024£768; 13:68£13:68 ¹m2 mirrors) for computing the linear projections of illuminated pixels and a single photon detector is used to obtain the compressive sensing measurement. The position of each element in the DMD is defined by the compressed sensing measurement matrices. Three dimensional image reconstruction algorithms are developed that exploit the inter-slice spatial image correlation as well as the correlation between different 2D slices. A comprehensive performance comparison between several binary projection patterns is shown. Experimental and simulation results are provided to illustrate the features of the proposed systems.


We propose a system based on the combination of compressive sensing and non-linear processing that shows excellent robustness against noise. The key idea is the use of nonlinear mappings that act as analog joint source-channel encoders, processing the compressive sensing measurements proceeding from an analog source and producing continuous amplitude samples that are transmitted directly through the noisy channel. As we will show in our simulation results, the proposed framework is readily applicable in practical systems such as imaging, and clearly outperforms systems based on stand-alone compressive sensing.

and the next two papers focus in the adaptive detection of signals of interest while in the compressed measurement world: Compressive Matched Subspace Detection by Jose Paredes, Zhongmin Wang, Gonzalo Arce and Brian M. Sadler. The abstract reads:

In this paper, matched subspace detectors based on the framework of Compressive Sensing (CS) are developed. The proposed approach, called compressive matched subspace detectors, exploits the sparsity model of the signal-of-interest in the design of the random projection operator. By tailoring the CS measurement matrix (projection operator) to the subspace where the signal-of-interest is known to lie, the compressive matched subspace detectors can effectively capture the signal energy while the interference and noise effects are mitigated at sub-Nyquist rate. The proposed detection approach is particularly suitable for detection of wideband signals that emerge in modern communication systems that demand high-speed ADCs. The performance of the subspace compressive detectors are studied by analytically deriving closed-form expressions for the detection probability and through extensive simulations.
Compressed sensing (CS) provides an efficient way to acquire and reconstruct natural images from a reduced number of linear projection measurements at sub-Nyquist sampling rates. A key to the success of CS is the design of the measurement ensemble. This paper addresses the design of a novel variable density sampling strategy, where the “a priori” information about the statistical distributions that natural images exhibit in the wavelet domain is exploited. Compared to the current sampling schemes for compressed image sampling, the proposed variable density sampling has the following advantages: 1) The number of necessary measurements for image reconstruction is reduced; 2) The proposed sampling approach can be applied to several transform domains leading to simple implementations. In particular, the proposed method is applied to the compressed sampling in the 2D ordered discrete Hadamard transform (DHT) domain for spatial domain imaging. Furthermore, to evaluate the incoherence of different sampling schemes, a new metric that incorporates the “a priori” information is also introduced. Extensive simulations show the effectiveness of the proposed sampling methods.

Finally, a continuation of the concepts developed in the classification on faces, shows that music is not the same as faces: Music Genre Classification via Sparse Representation of Auditory Temporal Modulations by Yannis Panagakis, Constantine Kotropoulos, Gonzalo Arce. The abstract reads:
A robust music genre classification framework is proposed that combines the rich, psycho-physiologically grounded properties of slow temporal modulations of music recordings and the power of sparse representation-based classifiers. Linear subspace dimensionality reduction techniques are shown to play a crucial role within the framework under study. The proposed method yields a music genre classification accuracy of 91% and 93.56% on the GTZAN and ISMIR2004 Genre dataset, respectively. Both accuracies outperform any reported accuracy ever obtained by state of the art music genre classification algorithms in the aforementioned datasets.

Finally, in a totally unrelated field, we have: Matrix Completion from Power-Law Distributed Samples by Raghu Meka, Prateek Jain, and Inderjit Dhillon. The abstract reads:
The low-rank matrix completion problem is a fundamental problem with many important applications. Recently, [4],[13] and [5] obtained the first non-trivial theoretical results for the problem assuming that the observed entries are sampled uniformly at random. Unfortunately, most real-world datasets do not satisfy this assumption, but instead exhibit power-law distributed samples. In this paper, we propose a graph theoretic approach to matrix completion that solves the problem
for more realistic sampling models. Our method is simpler to analyze than previous methods with the analysis reducing to computing the threshold for complete cascades in random graphs, a problem of independent interest. By analyzing the graph theoretic problem, we show that our method achieves exact recovery when the observed entries are sampled from the Chung-Lu-Vu model, which can generate power-law distributed graphs. We also hypothesize that our algorithm solves the matrix completion problem from an optimal number of entries for the popular preferential attachment model and provide strong empirical evidence for the claim. Furthermore, our method is easy to implement and is substantially faster than existing methods. We demonstrate the effectiveness of our method on random instances where the low-rank matrix is sampled according to the prevalent random graph models for complex networks and present promising preliminary results on the Netflix challenge dataset.

and from the same group, the important: Metric and Kernel Learning using a Linear Transformation by Prateek Jain, Brian Kulis, Jason Davis and Inderjit Dhillon. The abstract reads:
Metric and kernel learning are important in several machine learning applications. However, most existing metric learning algorithms are limited to learning metrics over low-dimensional data, while existing kernel learning algorithms are often limited to the transductive setting and do not generalize to new data points. In this paper, we study metric learning as a problem of learning a linear transformation of the input data. We show that for high-dimensional data, a particular framework for learning a linear transformation of the data based on the LogDet divergence can be efficiently kernelized to learn a metric (or equivalently, a kernel function) over an arbitrarily high dimensional space. We further demonstrate that a wide class of convex loss functions for learning linear transformations can similarly be kernelized, thereby considerably expanding the potential applications of metric learning. We demonstrate our learning approach by applying it to large-scale real world problems in computer vision and text mining.


Friday, November 06, 2009

CS: New version of ISD and attendant video tutorials

You probably recall the concept of ISD, a reconstruction solver that uses all measurements available to compute a solution. Well, Wotao Yin just sent me the following:

I put a very clean and simple ISD code at http://www.caam.rice.edu/~optimization/L1/ISD/. It comes with two video tutorials.

Here is extract of what's new in the package:
Version 1.1 (November 3, 2009). [download]
  • This version is much simpler and clearer than ver 1.0 below.
  • Video tutorials: Demo [online] or [download]; Main code [online] or [download].
  • To adapter the code to your data and sparse/compressible signal and for best results, please (i) tune the thresholding methods and parameters, and (ii) consider replacing YALL1 by one designed for your data. The technical report [pdf] describes ideas of effective thresholding.
  • Despite the small problems given in the demos, the code is capable of solving large-scale, multi-dimensional problems. Ver 1.0 below includes large-scale tests. The allowed size of the problem is merely subject to the limit posted by the subproblem solver used.
  • The main code can be extended to deal with multi-dimensional signals in a straightforward way.

Enjoy !

Thursday, November 05, 2009

The Nuit Blanche Effect





When was the last time you knew for a fact that more than 120 people had read the abstract of your paper within a week of releasing it ?





Back in May 2008 at the Nonlinear L1 meeting at Texas A&M University, I suddenly realized the power of the blog when Adam Oberman mentioned to me over lunch with Anna Gilbert and Simon Foucart that he knew of the blog (first surprise, we are in College Station, Texas and somebody has heard of the blog :-)) through a friend who had seen a spike of downloads of his paper after being mentioned here (second surprise!). Ever since that moment I have been trying to quantify this effect which, it seems, has grown over time. If you have stories about this or any other stories connected to the blog (you can stay anonymous), I would be very happy to hear them. Let us note that this number is a conservative one as it does not include the people being directly mailed the blog entries (237 as we speak), nor the people coming directly to the website nor people using another feed than the one listed above (412).

One more note, the first technincal blog entry after that meeting pointed out the need to understand better the RIP (Restricted Isometry Property: What is it good for ?), a subject of continuing interest.

Wednesday, November 04, 2009

CS: UWB Through-Wall Imaging Based on Compressive Sensing


I picked this up from Lianlin Li's blog ( translation is here): another UWB radar that can detect moving things through walls: UWB Through-Wall Imaging Based on Compressive Sensing by Qiong Huang, Lele Qu, Bingheng Wu, and Guangyou Fang. The abstract reads:

To achieve high-resolution 2-D images, through-wall imaging (TWI) radar with ultra-wideband and long antenna arrays faces considerable technical challenges such as a prolonged data collection time, a huge amount of data, and a high hardware complexity. This paper presents a novel data acquisition scheme and an imaging algorithm for TWI radar based on compressive sensing (CS), which states that a signal having a sparse representation can be reconstructed from a small number of nonadaptive randomized projections by solving a tractable convex program. Instead of measuring all spatial-frequency data, a few samples, by employing an overcomplete dictionary, are sufficient to obtain reliable target space images even at high noise levels. Preliminary simulated and experimental results show that the proposed algorithm outperforms the conventional delay-and-sum beamforming method even though many fewer CS measurements are used.






I am adding this the Compressive Sensing Hardware page.

Tuesday, November 03, 2009

CS: A Matrix Completion Entry

After Friday's entry on the subject, here some more matrix completion cods and papers:Yi Ma just sent me the following:

You may want to check out our new website dedicated for low-rank matrix recovery and completion at:

http://perception.csl.illinois.edu/matrix-rank/home.html

Matrix recovery (also known as robust PCA) is arguably a more difficult problem that matrix completion and it reveals some nice tradeoff between rank and sparsity. On the website we gather all up to date work on theory, algorithms, and applications (soon). We also provide the code for the fastest algorithms know todate for both matrix recovery and completion.

I note that they list the other matrix completion approaches here.

and then we have: A Simpler Approach to Matrix Completion by Benjamin Recht. The abstract reads:
This paper provides the best bounds to date on the number of randomly sampled entries required to reconstruct an unknown low rank matrix. These results improve on prior work by Candes and Recht, Candes and Tao, and Keshavan, Montanari, and Oh. The reconstruction is accomplished by minimizing the nuclear norm, or sum of the singular values, of the hidden matrix subject to agreement with the provided entries. If the underlying matrix satisfies a certain incoherence condition, then the number of entries required is equal to a quadratic logarithmic factor times the number of parameters in the singular value decomposition. The proof of this assertion is short, self contained, and uses very elementary analysis. The novel techniques herein are based on recent work in quantum information theory.
and Compressed Quantum Process Tomography by Alireza Shabani, R. L. Kosut, Herschel Rabitz. The abstract reads:
The characterization of a decoherence process is among the central challenges in quantum physics. A major difficulty with current quantum process tomography methods is the enormous number of experiments needed to accomplish a tomography task. Here we present a highly efficient method for tomography of a quantum process that has a small number of significant elements. Our method is based on the compressed sensing techniques being used in information theory. In this new method, for a system with Hilbert space dimension n and a process matrix of dimension n^2 x n^2 with sparsity s, the required number of experimental configurations is O(s log n^4). This heralds a logarithmic advantage in contrast to other methods of quantum process tomography. More specifically, for q-qubits with n=2^q, the scaling of resources is O(sq) linear in the product of sparsity and number of qubits.

Monday, November 02, 2009

CS: Real vs. complex null space properties, Non-Parametric Bayesian Dictionary Learning, Coordinate Descent Optimization code




Today, we have two papers and two codes:

Real vs. complex null space properties for sparse vector recovery by Simon Foucart, Remi Gribonval. The abstract reads:
We identify and solve an overlooked problem about the characterization of underdetermined systems of linear equations for which sparse solutions have minimal `1-norm. This characterization is known as the null space property. When the system has real coefficients, sparse solutions can be considered either as real or complex vectors, leading to two seemingly distinct null space properties. We prove that the two properties actually coincide by establishing a link with a problem about convex polygons in the real plane. Incidentally, we also show the equivalence between stable null space properties which account for the stable reconstruction by `1-minimization of vectors that are not exactly sparse.

Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations by Mingyuan Zhou, Haojun Chen, John Paisley, Lu Ren, Guillermo Sapiro and Lawrence Carin. The abstract reads:
Non-parametric Bayesian techniques are considered for learning dictionaries for sparse image representations, with applications in denoising, inpainting and compressive sensing (CS). The beta process is employed as a prior for learning the dictionary, and this non-parametric method naturally infers an appropriate dictionary size. The Dirichlet process and a probit stick-breaking process are also considered to exploit structure within an image. The proposed method can learn a sparse dictionary in situ; training images may be exploited if available, but they are not required. Further, the noise variance need not be known, and can be nonstationary. Another virtue of the proposed method is that sequential inference can be readily employed, thereby allowing scaling to large images. Several example results are presented, using both Gibbs and variational Bayesian inference, with comparisons to other state-of-the-art approaches.
From the BCS webpage:

BPFA image denoising and inpainting: The package includes the inference update equations and Matlab codes for image denoising and inpainting via the non-parametric Bayesian dictionary learning approach.

Download: BPFA.zip (Last Updated: Oct. 30, 2009)


Back in March I mentioned the following paper Coordinate Descent Optimization for $\ell^1$ Minimization with Application to Compressed Sensing; a Greedy Algorithm by Yingying Li and Stanley Osher. The code is now available on Matlab Central.


Credit: OnOrbit, X-prize, Unreasonable rocket.

Friday, October 30, 2009

CS: Recovering low-rank matrices from few coefficients in any basis, Probability of Unique Integer Solution to a System of Linear Equations

Two items first. In yesterday's entry on the Sudoku paper ( Linear Systems, Sparse Solutions, and Sudoku by Prabhu Babu, Kristiaan Pelckmans, Petre Stoica, and Jian Li) it is interesting to find out that a re-weighted scheme seems to be doing better than some game specific greedy algorithm. Furthermore, at the end of the paper, we can read the following:

The presently known sufficient conditions on A for checking the equivalence of P0 and P1, like the restricted isometry property (RIP) [5] and Kashin, Garnaev, Gluskin (KGG) inequality [6], do not hold for Sudoku. So analyzing the equivalence of l0 and l1 norm solutions in the context of Sudoku requires novel conditions for sparse recovery.
I am not sure that RIP of [5] is the condition to check for here, (RIP-2) more likely the sparsity of the measurement matrix implies some type of RIP-1 condition. I also thought KGG was a necessary and sufficient condition albeit an NP-Hard at that. Other conditions can be found on the Compressive Sensing Big Picture page.

In the super-resolution paper mentioned earlier this week, entitled Super-resolution ghost imaging via compressive sampling reconstruction by Wenlin Gong and Shensheng Han, Wenlin confirms to me that only l1_magic was used not GPSR as the reference would tend to indicate. I am sure we are going to see some improvement in the near future. Other reconstruction solvers can be found in the Compressive Sensing Big Picture page too.


David Gross just sent me the following:
...I'm a physicist and new to the field. Originally, some colleagues and me got interested in compressed sensing and matrix completion in the context of quantum state tomography (meaning the experimental characterization of a quantum system). Our paper arXiv:0909.3304 was mentioned on your blog.

The methods we needed to develop in order to translate known results on matrix completion by Candes, Recht, Tao and others to quantum mechanics proved far more general than anticipated. We can now show that a low-rank (n x n)-matrix may be recovered from basically O(n r log^2 n) randomly selected expansion coefficients with respect to any matrix basis. The matrix completion problem is just a special case.

Most importantly, though, the complexity of the proofs was reduced substantially. The spectacular argument by Candes and Tao in arXiv:0903.1476 covered about 40 pages. The new paper implies these results, but is much simpler to digest.

The first version of the paper went onto the arxiv two weeks ago: arXiv:0910.1879. However, it significantly evolved since then. In some cases the bounds are now down to O(n r log n), which -- I'm happy to say -- is tight up to multiplicative factors.

There is an obvious challenge to be met. The O(n r log n)-bounds do not currently extend to the original matrix completion problem. They come tantalizingly close, though, with only one single lemma obstructing a more general result. The precise problem is pointed out at the end of Section III B.

Before submitting the paper, I plan to add a final note. The statements all continue to be true if instead of a matrix basis a "tight frame" (a.k.a. "overcomplete basis") is used.

I should also point out Benjamin Recht's recent and strongly related pre-print arXiv:0910.0651 to you. He independently translated some of the results in our earlier paper on quantum tomography to the matrix completion problem (apparently overlooking our announcement of exactly the same result in the physics paper).

Let us note that the last preprint of Benjamin Recht was recently updated. But first, here is the new revision entitled: Recovering low-rank matrices from few coefficients in any basis by David Gross

We establish novel techniques for analyzing the problem of low-rank matrix recovery. The methods are both considerably simpler, and more general than previous approaches. It is shown that an unknown n × n matrix of rank r can be efficiently reconstructed given knowledge of only O(nr \nu log2 n) randomly sampled expansion coefficients with respect to any given matrix basis. The number \nu quantifies the “degree of incoherence” between the unknown matrix and the basis. Existing work concentrated mostly on the problem of “matrix completion”, where one aims to recover a low-rank matrix from randomly selected matrix elements. Our result covers this situation as a special case. The proof consists of a series of relatively elementary steps, which stands in contrast to the highly involved methods previously employed to obtain comparable results. In cases where bounds had been known before, our estimates are slightly tighter. We discuss operator bases which are incoherent to all low-rank matrices simultaneously. For these bases, we show that O(nr \nu log n) randomly sampled expansion coefficients suffice to recover any low-rank matrix with high probability. The latter bound is tight up to multiplicative factors. This seems to be the first time the optimal order of the log-factor has been proved to be achievable for a matrix reconstruction problem where neither the basis nor the unknown matrix is randomized. This work is an expanded presentation of the recent pre-print [D. Gross, Y.-K. Liu, S. T. Flammia, S. Becker, and J. Eisert, arxiv:0909.3304] which was aimed at researches working in quantum information theory.
Thanks David !

I also found the following: Probability of Unique Integer Solution to a System of Linear Equations by Olvi Mangasarian and Benjamin Recht. The abstract reads:
We consider a system of m linear equations in n variables Ax = d and give necessary and sufficient conditions for the existence of a unique solution to the system that is integer: x element of {−1,1}^n. We achieve this by reformulating the problem as a linear program and deriving necessary and sufficient conditions for the integer solution to be the unique primal optimal solution. We show that as long as m is larger than n/2, then the linear programming heuristic succeeds for most instances, but if m is less than n/2, the heuristic fails on most instances. We also demonstrate that these predictions match the empirical performance of the linear programming formulation to very high accuracy.


Enjoy!

Credit: NASA / GSFC / ASU, Via The Planetary Society Blog: Apollo 17 lander and flag! An image of the Apollo 17 lander, Challenger, captured by the Lunar Reconnaissance Orbiter Camera on October 1, 2009, includes not only the lander and astronaut tracks but also a fuzzy dark pixel at the location of the American flag erected there by astronauts Jack Schmitt and Gene Cernan.

Thursday, October 29, 2009

CS: Linear Systems, Sparse Solutions, and Sudoku, Compressed sensing performance bounds under Poisson noise


Angshul Majumdar just pointed out to me the following paper entitled: Linear Systems, Sparse Solutions, and Sudoku by Prabhu Babu, Kristiaan Pelckmans, Petre Stoica, and Jian Li. The abstract reads:
In this paper, we show that Sudoku puzzles can be formulated and solved as a sparse linear system of equations. We begin by showing that the Sudoku ruleset can be expressed as an underdetermined linear system: ${mmb{Ax}}={mmb b}$, where ${mmb A}$ is of size $mtimes n$ and $n>m$. We then prove that the Sudoku solution is the sparsest solution of ${mmb{Ax}}={mmb b}$, which can be obtained by $l_{0}$ norm minimization, i.e. $minlimits_{mmb x}Vert{mmb x}Vert_{0}$ s.t. ${mmb{Ax}}={mmb b}$. Instead of this minimization problem, inspired by the sparse representation literature, we solve the much simpler linear programming problem of minimizing the $l_{1}$ norm of ${mmb x}$, i.e. $minlimits_{mmb x}Vert{mmb x}Vert_{1}$ s.t. ${mmb{Ax}}={mmb b}$, and show numerically that this approach solves representative Sudoku puzzles.
We have heard about Sudoku before in compressive sensing. The code implementing this can be found here.




This paper describes performance bounds for compressed sensing (CS) where the underlying sparse or compressible (sparsely approximable) signal is a vector of nonnegative intensities whose measurements are corrupted by Poisson noise. In this setting, standard CS techniques cannot be applied directly for several reasons. First, the usual signal-independent and/or bounded noise models do not apply to Poisson noise, which is non-additive and signal-dependent. Second, the CS matrices typically considered are not feasible in real optical systems because they do not adhere to important constraints, such as nonnegativity and photon flux preservation. Third, the typical $\ell_2$--$\ell_1$ minimization leads to overfitting in the high-intensity regions and oversmoothing in the low-intensity areas. In this paper, we describe how a feasible positivity- and flux-preserving sensing matrix can be constructed, and then analyze the performance of a CS reconstruction approach for Poisson data that minimizes an objective function consisting of a negative Poisson log likelihood term and a penalty term which measures signal sparsity. We show that, as the overall intensity of the underlying signal increases, an upper bound on the reconstruction error decays at an appropriate rate (depending on the compressibility of the signal), but that for a fixed signal intensity, the signal-dependent part of the error bound actually grows with the number of measurements or sensors. This surprising fact is both proved theoretically and justified based on physical intuition.

On his webpage, Roummel Marcia has the following announcement:
Opportunities: Positions for graduate and undergraduate students are available in the areas of optimization, linear algebra, and compressed sensing. These positions are in conjunction with the National Science Foundation grant, DMS-0811062: Second-order methods for large-scale optimization in compressed sensing. Send me an email if you are interested in any of these positions.

Wednesday, October 28, 2009

CS: Super-resolution ghost imaging via compressive sampling reconstruction


Bada bing, bada boom, when was the last time you had to increase the size of your pixel on a CCD dye in order to obtain a better resolution ? Looks like this is a result coming out of a CS reconstruction in another Ghost Imaging experiment. Enjoy !

Super-resolution ghost imaging via compressive sampling reconstruction by Wenlin Gong and Shensheng Han. The abstract reads:
For ghost imaging, pursuing high resolution images and short acquisition times required for reconstructing images are always two main goals. We report an image reconstruction algorithm called compressive sampling (CS) reconstruction to recover ghost images. By CS reconstruction, ghost imaging with both super-resolution and a good signal-to-noise ratio can be obtained via short acquisition times. Both effect influencing and approaches further improving the resolution of ghost images via CS reconstruction, relationship between ghost imaging and CS theory are also discussed.

Tuesday, October 27, 2009

CS: A job with Exxon.

As much as it looks like a nice opportunity, this one is not located in Paris so I will pass up on it. Here is a job opportunity that showed up on my radar screen which has been added to the Compressive Sensing Jobs page:

Employer: ExxonMobil
Location: Clinton, NJ United States
Last Updated: 10/26/2009
Job Code: 7382BR

AutoReqId 7382BR
Job or Campus Folder Research Scientists (2) -Wave Propagation and Numerical Methods


Job Description ExxonMobil's Corporate Strategic Research Laboratory is seeking applications from talented individuals in physics, applied mathematics, or engineering with a strong record of achievements in fields related to wave propagation in heterogeneous media, and its associated mathematical and numerical methods.


Research Scientist - Wave propagation and transport in heterogeneous and porous media.

Experience with the physical, mathematical and numerical analyses of one of: wave propagation in heterogeneous media, multi-scale transport phenomena, and/or fluid-structure interactions at the rock pore scale and their impact on the macro scale.

Research Scientist-Compressive sensing.

Experience with the mathematics of signal/image processing, denoising, sub-Nyquist signal reconstruction, and sparse representation of data.

Applicants should have a PhD in applied mathematics, physics, engineering, geophysics, or a related field, with a strong ability in their field of expertise. Proficiency with scientific programming languages and experience with large-scale, parallel, numerical simulations are definite advantages. The ability to communicate and interact with internal and external groups will be an important selection criterion. Candidates should have a strong publication record, excellent oral presentation and writing skills, and show a desire and ability to grow into new science areas.

The successful candidate will join a dynamic, multi-disciplinary group of world-class scientists who focus on performing breakthrough research and creating new approaches to solve our most challenging problems. Technical staff members in this position implement and report on independent research, participate in program development, as well as collaborate internationally with leading engineers and scientists from industry, universities and other technical institutions.

ExxonMobil's Corporate Strategic Research (CSR) laboratory is a powerhouse in energy research focusing on fundamental science that can lead to technologies having a direct impact on solving our biggest energy challenges. Our facilities are centrally located in scienic Annandale, New Jersey, approximately one hour from both New York City and Phildelphia.
ExxonMobil is an Equal Opportunity Employer
Job Location Clinton, NJ

Monday, October 26, 2009

CS: Making CS Mainstream, CS Data Recovery in Wireless Sensor Networks, Routing and Signal for CS in WSNs, Very High Resolution SAR, Postdoc


One of the ways to make compressive sensing more mainstream is to enable tinkerers to play with something that implements some sorts of controlled multiplexing. Case in point, what could be done with the newly released tiny DMD based projectors or these ? and what about this DMD kit with no driver ? then there is always the more expensive DLP kits from TI ? By the way, if you are wondering how much an SLM cost, such the one used in the work of by Ivo Vellekoop and Allard Mosk ( Phase control algorithms for focusing light through turbid media ) that is about 3,000 US$ and up.

Today, we have papers and jobs mostly from Europe, enjoy!

A Bayesian Analysis of Compressive Sensing Data Recovery in Wireless Sensor Networks by Riccardo Masiero, Giorgio Quer, Michele Rossi and Michele Zorzi. The abstract reads:
In this paper we address the task of accurately reconstructing a distributed signal through the collection of a small number of samples at a data gathering point using Compressive Sensing (CS) in conjunction with Principal Component Analysis (PCA). Our scheme compresses in a distributed way real world non-stationary signals, recovering them at the data collection point through the online estimation of their spatial/temporal correlation structures. The proposed technique is hereby characterized under the framework of Bayesian estimation, showing under which assumptions it is equivalent to optimal maximum a posteriori (MAP) recovery. As the main contribution of this paper, we proceed with the analysis of data collected by our indoor wireless sensor network (WSN) testbed, proving that these assumptions hold with good accuracy in the considered real world scenarios. This provides empirical evidence of the effectiveness of our approach and proves that CS is a legitimate tool for the recovery of real-world signals in WSNs.

On the Interplay Between Routing and Signal Representation for Compressive Sensing in Wireless Sensor Networks by Giorgio Quer, Riccardo Masiero, Daniele Munaretto, Michele Rossi, Joerg Widmer and Michele Zorzi. The abstract reads:
Compressive Sensing (CS) shows high promise for fully distributed compression in wireless sensor networks (WSNs). In theory, CS allows the approximation of the readings from a sensor field with excellent accuracy, while collecting only a small fraction of them at a data gathering point. However, the conditions under which CS performs well are not necessarily met in practice. CS requires a suitable transformation that makes the signal sparse in its domain. Also, the transformation of the data given by the routing protocol and network topology and the sparse representation of the signal have to be incoherent, which is not straightforward to achieve in real networks. In this work we address the data gathering problem in WSNs, where routing is used in conjunction with CS to transport random projections of the data.We analyze synthetic and real data sets and compare the results against those of random sampling. In doing so, we consider a number of popular transformations and we find that, with real data sets, none of them are able to sparsify the data while being at the same time incoherent with respect to the routing matrix. The obtained performance is thus not as good as expected and finding a suitable transformation with good sparsification and incoherence properties remains an open problem for data gathering in static WSNs.

At Fringe 2009, a meeting organized by ESA in Italy the following paper will be presented (only the abstract is available): Very High Resolution SAR Tomography via Compressive Sensing by Zhu Xiaoxiang and Bamler Richard. The abstract reads:
SAR tomography (TomoSAR) extends the synthetic aperture principle into the elevation direction for 3-D imaging. It uses stacks of several acquisitions from slightly different viewing angles (the elevation aperture) to reconstruct the reflectivity function along the elevation direction by means of spectral analysis for every azimuth-range pixel.

The new class of meter-resolution spaceborne SAR systems (TerraSAR-X and COSMO-Skymed) offers a tremendous improvement in tomographic reconstruction of urban areas and man-made infrastructure. The high resolution fits well to the inherent scale of buildings (floor height, distance of windows, etc.). In order to fully exploit the potential of this class of meter-resolution data there is demand for new and improved TomoSAR inversion algorithms.

Compressive sensing (CS) is a new and attractive technique for TomoSAR. It aims at minimizing the number of measurements to be taken from signals while still retaining the information necessary to approximate them well. It provides a good compromise between classical parametric and non-parametric spectral analysis methods for TomoSAR. Compared to parametric spectral analysis, CS is more robust to phase noise, has lower computational effort, and does not require model selection to provide the prior knowledge about the number of scatterers in a resolution cell. Compared to non-parametric spectral estimation CS overcomes the limitation of elevation resolution caused by the length of elevation aperture, i.e. CS has super-resolution properties.

In this paper the CS approach to TomoSAR is outlined. Its extension to differential (4-D) TomoSAR is introduced. Numerical simulations for realistic acquisition and noise scenarios will be presented to evaluate the potential and limits of the new technique. The first CS TomoSAR results with TerraSAR-X spotlight data (1 m resolution) over urban areas will be presented.

Finally, here a postdoc job offer that I will be put shortly on the Compressive Sensing Jobs page:
Postdoctoral Position in Wireless Communications at University of Vigo, Spain,
Description
The Signal Processing in Communications Group (GPSC, www.gts.tsc.uvigo.es/gpsc ), affiliated with the Department of Signal Theory and Communications at University of Vigo, Spain, invites applications for one postdoctoral positions in the field of wireless communications. The selected candidates will join GPSC to investigate fundamentals and algorithm design/evaluation for communication and sensor networks. Areas of particular interest include:
  • Sensor networks
  • Cognitive Radio
  • Compressed Sensing
GPSC is formed by 6 faculty members from the University of Vigo as well as several MSc and PhD students, and participates in several research projects funded by the European Commission and the Spanish Government. Among these, the recently launched COMONSENS project (www.comonsens.org) integrates investigators from 10 different top research institutions in Spain. GPSC members also actively collaborate with the Galician R&D Center in Advanced Telecommunications (GRADIANT, www.gradiant.org ) in diverse contracts with ICT companies. Thus, the selected candidates will enjoy unique opportunities to participate in exciting research projects with both industry and academia.

Desirable background
  • A Ph.D. degree in Electrical Engineering is required; PhD obtained before January 1, 2007
  • Knowledge and experience in sensor networks, cognitive radio and/or compressed sensing
  • Good verbal and written skills in English are required
  • Strong publications in international conferences and journals in the area of communications
  • Postdoctoral experience in a recognized group with expertise in the field is a plus
  • Experience in the organization, management and training of technical staff/students is a plus
  • Communication, computing and interpersonal skills are important
  • Capacity to work both independently and within a team
Contract conditions
The initial appointment will be for one year, with annual renewal dependent on funding and performance. Expected start date is January 2010. Successful applicants will be offered a 36,000 € yearly gross salary depending on qualifications, as well as health benefits.
Applications
Interested candidates may apply to Prof. Nuria González Prelcic (nuria@gts.tsc.uvigo.es).
Applications should include electronic copies of the following:
  • A detailed Curriculum Vitae. (*Please include your e-mail address and a recent picture.)
  • A cover letter addressing the specified job qualifications.
  • A letter of recommendation by a senior Professor/Researcher.
  • A copy of the publication deemed as best representative of the candidate’s creative research.
Priority consideration will be given to applications received by November 1, 2009. Applications will be accepted until position is filled.
Contact: Nuria González Prelcic
Departamento de Teoría de la Señal y Comunicaciones
ETSET. University of Vigo
36310 Vigo. Spain
Phone: +34 986 818656, e-mail: nuria@gts.tsc.uvigo.es


Credit: Palomar Observatory, This image of the moon was taken by the Palomar Observatory at the time of the LCROSS impact.

Saturday, October 24, 2009

CS: Compressed Sensing for Fusion Frames, Combinatorial Compressed Sensing, The Geometry of Generalized Binary Search, Ranging Imager,TCS and finance.

Here are a few papers of related interest for the week-end.


Compressed Sensing for Fusion Frames by Petros Boufounos, Gitta Kutyniok, and Holger Rauhut. The abstract reads:
Compressed Sensing (CS) is a new signal acquisition technique that allows sampling of sparse signals using significantly fewer measurements than previously thought possible. On the other hand, a fusion frame is a new signal representation method that uses collections of subspaces instead of vectors to represent signals. This work combines these exciting new fields to introduce a new sparsity model for fusion frames. Signals that are sparse under the new model can be compressively sampled and uniquely reconstructed in ways similar to sparse signals using standard CS. The combination provides a promising new set of mathematical tools and signal models useful in a variety of applications. With the new model, a sparse signal has energy in very few of the subspaces of the fusion frame, although it needs not be sparse within each of the subspaces it occupies. We define a mixed `1/`2 norm for fusion frames. A signal sparse in the subspaces of the fusion frame can thus be sampled using very few random projections and exactly reconstructed using a convex optimization that minimizes this mixed `1/`2 norm. The sampling conditions we derive are very similar to the coherence and RIP conditions used in standard CS theory.


Then there is a presentation by Mark Iwen on Combinatorial Compressed Sensing: Fast algorithms with Recovery Guarantees. Of noted interest is the application section at the very end where one considers learning a function with multiple parameters:


A situation that is also the subject of much interest in calibration or machine learning in general. Let us note that another way of performing some type of function learning is through the Experimental Probabilistic Hypersurface (more details can be found here). Not really related to compressive sensing but of interest nonetheless since it deals with learning functions:

The Geometry of Generalized Binary Search by Robert Nowak. The abstract reads:
This paper investigates the problem of determining a binary-valued function through a sequence of strategically selected queries. The focus is an algorithm called Generalized Binary Search (GBS). GBS is a well-known greedy algorithm for determining a binary-valued function through a sequence of strategically selected queries. At each step, a query is selected that most evenly splits the hypotheses under consideration into two disjoint subsets, a natural generalization of the idea underlying classic binary search and Shannon-Fano coding. GBS is used in many applications including channel coding, experimental design, fault testing, machine diagnostics, disease diagnosis, job scheduling, image processing, computer vision, and machine learning. This paper develops novel incoherence and geometric conditions under which GBS achieves the information-theoretically optimal query complexity; i.e., given a collection of N hypotheses, GBS terminates with the correct function in O(logN) queries. Furthermore, a noise tolerant version of GBS is developed that also achieves the optimal query complexity. These results are applied to learning multidimensional threshold functions, a problem arising routinely in image processing and machine learning.


And here are two architectures for cameras to determine ranges:
They are interesting in that they don't have your usual Point Spread Function (or transfer function).

Finally, I have always been impressed by some results in TCS (Theoretical Computer Science) and always wonders what type of impact they will have in the civil society, here are two in particular:


At some point, I am sure we'll also see some application of Compressive Sensing in the area of low latent trading or high or ultra-high frequency trading as it is called these days.

Friday, October 23, 2009

CS: Learning with Compressible Priors, Spatially-Localized Compressed Sensing and Routing in Multi-Hop Sensor Networks?, Learning low dim manifolds

If you recall, one of the reason Compressive Sensing is bound to touch on many subjects and fields of engineering ( see sparsity in everything series of posts) lies in the fact that most occurrences in Nature follow some type of power law. Volkan Cevher provides some insight on the subject of signal sparsity and the probability distribution from which they are sampled from in his upcoming paper at NIPS entitled: Learning with Compressible Priors. The abstract reads:

We describe a set of probability distributions, dubbed compressible priors, whose independent and identically distributed (iid) realizations result in p-compressible signals. A signal x element of R^N is called p-compressible with magnitude R if its sorted coefficients exhibit a power-law decay as |x|_(i) . R i^(-d), where the decay rate d is equal to 1/p. p-compressible signals live close to K-sparse signals (K \lt\lt N) in the l_r-norm (r \gt p) since their best K-sparse approximation error decreases with O (R K^(1/r-1/p) ) We show that the membership of generalized Pareto, Student’s t, log-normal, Frechet, and log-logistic distributions to the set of compressible priors depends only on the distribution parameters and is independent of N. In contrast, we demonstrate that the membership of the generalized Gaussian distribution (GGD) depends both on the signal dimension and the GGD parameters: the expected decay rate of N-sample iid realizations from the GGD with the shape parameter q is given by 1/[q log (N/q)]. As stylized examples, we show via experiments that the wavelet coefficients of natural images are 1.67-compressible whereas their pixel gradients are 0:95 log (N/0.95)-compressible, on the average. We also leverage the connections between compressible priors and sparse signals to develop new iterative re-weighted sparse signal recovery algorithms that outperform the standard l_1-norm minimization. Finally, we describe how to learn the hyperparameters of compressible priors in underdetermined regression problems.

Volkan Cevher also makes available RANDSC, a small code generating compressible signals from a specified distribution. If we could now make a connection between these distributions and the l_q ( q less than 1) minimization techniques used to recover signals, it would be great[Oops, let me take that back, Volkan points to section 5.2 entitled " Iterative l_s-decoding for iid scale mixtures of GGD", duh]

Also found via another blog: Spatially-Localized Compressed Sensing and Routing in Multi-Hop Sensor Networks? by Sungwon Lee, Sundeep Pattem, Maheswaran Sathiamoorthy, Bhaskar Krishnamachari, and Antonio Ortega. The abstract reads:

We propose energy-efficient compressed sensing for wireless sensor networks using spatially-localized sparse projections. To keep the transmission cost for each measurement low, we obtain measurements from clusters of adjacent sensors. With localized projection, we show that joint reconstruction provides signi cantly better reconstruction than independent reconstruction. We also propose a metric of energy overlap between clusters and basis functions that allows us to characterize the gains of joint reconstruction for di erent basis functions. Compared with state of the art compressed sensing techniques for sensor network, our experimental results demonstrate signi cant gains in reconstruction accuracy and transmission cost.
Finally, I am rebounding on yesterday's statement on Machine Learning and Compressive Sensing here is a view of some of the subject in ML and Manifolds featured in the recent talk by Yoav Freund entitled Learning low dimensional manifolds and presented at Google (by the way, what's up with Google engineers who don't know their mics are on ? uh ?)




What is interesting is his use of manifold for system calibration at 52 minutes into the video where he describes the 3 dimensional manifold living in a 23-dimension dataset. Yoav's project described in the video is the Automatic Cameraman.

Thursday, October 22, 2009

CS: NIPS Workshop on Manifolds, sparsity, and structured models, Active Learning, Random Coding, FRI,


Ever since that blog entry featuring work on manifold signal processing and CS , I have had some expectation of some type of integration of compressive sensing in machine learning topics beyond simply publications. It looks like 2009 is the year when this is happening in the form of workshops within an ML conference. First, Mike Wakin is co-organizing a NIPS workshop on Low-dimensional geometric models along with Richard Baraniuk, Piotr Indyk, Bruno Olshausen, Volkan Cevher, and Mark Davenport,. The call for contributions for that workshop follows:


CALL FOR CONTRIBUTIONS
NIPS Workshop on Manifolds, sparsity, and structured models: When can low-dimensional geometry really help?

Whistler, BC, Canada, December 12, 2009

http://dsp.rice.edu/nips-2009


Important Dates:
----------------
  • Submission of extended abstracts: October 30, 2009 (later submission might not be considered for review)
  • Notification of acceptance: November 5, 2009
  • Workshop date: December 12, 2009

Overview:
---------
Manifolds, sparsity, and other low-dimensional geometric models have long been studied and exploited in machine learning, signal processing and computer science. For instance, manifold models lie at the heart of a variety of nonlinear dimensionality reduction techniques. Similarly, sparsity has made an impact in the problems of compression, linear regression, subset selection, graphical model learning, and compressive sensing. Moreover, often motivated by evidence that various neural systems are performing sparse coding, sparse representations have been exploited as an efficient and robust method for encoding a variety of natural signals. In all of these cases the key idea is to exploit low-dimensional models to obtain compact representations of the data.

The goal of this workshop is to find commonalities and forge connections between these different fields and to examine how we can we exploit low-dimensional geometric models to help solve common problems in machine learning and beyond.

Submission instructions:
------------------------

We invite the submission of extended abstracts to be considered for a poster presentation at this workshop. Extended abstracts should be 1-2 pages, and the submission does not need to be blind. Extended abstracts should be sent to md@rice.edu in PDF or PS file format.

Accepted extended abstracts will be made available online at the workshop website.

Organizers:
-----------
* Richard Baraniuk, Volkan Cevher, Mark Davenport, Rice University.
* Piotr Indyk, MIT.
* Bruno Olshausen, UC Berkeley.
* Michael Wakin, Colorado School of Mines.

Second, Rui Castro is one of the organizer of a NIPS workshop on Adaptive Sensing, Active Learning and Experimental Design. From the NIPS workshop webpage:
  • Submission of extended abstracts: October 27, 2009
    (later submission might not be considered for review)
  • Notification of acceptance: November 5, 2009
  • Workshop date: December 11, 2009

Also found on the interwebs:

Channel protection: Random coding meets sparse channels, by M. Salman Asif, William Mantzel and Justin Romberg. The abstract reads:
Multipath interference is an ubiquitous phenomenon in modern communication systems. The conventional way to compensate for this effect is to equalize the channel by estimating its impulse response by transmitting a set of training symbols. The primary drawback to this type of approach is that it can be unreliable if the channel is changing rapidly. In this paper, we show that randomly encoding the signal can protect it against channel uncertainty when the channel is sparse. Before transmission, the signal is mapped into a slightly longer codeword using a random matrix. From the received signal, we are able to simultaneously estimate the channel and recover the transmitted signal. We discuss two schemes for the recovery. Both of them exploit the sparsity of the underlying channel. We show that if the channel impulse response is sufficiently sparse, the transmitted signal can be recovered reliably.

and Sparse Sampling of Structured Information and its Application to Compression, by Pier Luigi Dragotti. The abstract reads:
It has been shown recently that it is possible to sample classes of non-bandlimited signals which we call signals with Finite Rate of Innovation (FRI). Perfect reconstruction is possible based on a set of suitable measurements and this provides a sharp result on the sampling and reconstruction of sparse continuous-time signals. In this paper, we first review the basic theory and results on sampling signals with finite rate of innovation. We then discuss variations of the above framework to handle noise and model mismatch. Finally, we present some results on compression of piecewise smooth signals based on the FRI framework.





Image Credit: NASA/JPL/Space Science Institute, W00060562.jpg was taken on October 19, 2009 and received on Earth October 20, 2009. The camera was pointing toward SATURN at approximately 2,174,289 kilometers away, and the image was taken using the CB2 and CL2 filters.

Wednesday, October 21, 2009

CS: Workshop on Sparsity and Modern Mathematical Methods for High Dimensional Data, Learning Deep Architectures for AI

One of the use of focusing light in turbid tissues could easily be used in system like this one. In a different area, I have always wondered why when one delivers a certain radiation dose to a particular area of the body, additional sensors were not arranged around that would perform an imaging task at the same time. Food for thoughts. Anyway, there will be an Interdisciplinary Workshop on Sparsity and Modern Mathematical Methods for High Dimensional Data on April 6--10, 2010 in Brussels, Belgium.

While reading Learning Deep Architectures for AI by Yoshua Bengio. The abstract reads;

Theoretical results suggest that in order to learn the kind of complicated functions that can represent high level abstractions (e.g. in vision, language, and other AI-level tasks), one may need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This paper discusses the motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
I noted a nice discussion about compressive sensing in paragraph 7.1 entitled "Sparse Representations in Auto-Encoders and RBMs"


Credit: NASA / JPL / SSI / colorization by Gordan Ugarkovic, Tethys and Titan, Cassini captured a grayscale animation of Tethys crossing in front of Titan on October 17, 2009. In this version, Gordan Ugarkovic has colored in Titan based on its color as seen in previous Cassini photos.

Tuesday, October 20, 2009

CS: Lower Bounds for Sparse Recovery, Super-resolution, Sparse Multipath Channels, RIP for Structurally-Subsampled Unitary Matrices

The photo on the side shows the resulting plume as seen from the LCROSS cameras.

In a different direction, here is a thing I did not know about SSDs. As opposed to Hard Disk Drives, SSDs can allow parallel access to different jobs running on the CPU. SSDs also allow faster access to the data in memory (that part I knew). I wonder how this could help some of the reconstruction processes in Compressive Sensing ?

Jean-Luc Stark will give a talk at Saclay today, let us hope that the audience will get to hear if the initial tests of Compressive Sensing encoding of the PACS camera on Herschel are good.

Now let's go back to papers. I mentioned the abstract earlier, but now the paper is now available: Lower Bounds for Sparse Recovery by Khanh Do Ba, Piotr Indyk, Eric Price and David Woodruff. The abstract reads:
We consider the following k-sparse recovery problem: design an m * n matrix A, such that for any signal x, given Ax we can efficiently recover x^ satisfying ||x - x^||_1 \le C min_k-sparse x0 ||x - x'||_1. It is known that there exist matrices A with this property that have only O(k log(n/k)) rows. In this paper we show that this bound is tight. Our bound holds even for the more general randomized version of the problem, where A is a random variable, and the recovery algorithm is required to work for any fixed x with constant probability (over A).
I mentioned this approach before as it was featured in a video online, here is the associated paper: Super-Resolution with Sparse Mixing Estimators by Stephane Mallat and Guoshen Yu. The abstract reads:
We introduce a class of inverse problem estimators computed by mixing adaptively a family of linear estimators corresponding to different priors. Sparse mixing weights are calculated over blocks of coefficients in a frame providing a sparse signal representation. They minimize an l1 norm taking into account the signal regularity in each block. Adaptive directional image interpolations are computed over a wavelet frame with an O(N logN) algorithm.
According to the paper, the SME algorithm is at http://www.cmap.polytechnique.fr/~mallat/SME.html, but it is not there for the moment.

Also found on the interwebs:

In this paper, we revisit the problem of signal detection in multipath environments. Existing results implicitly assume a rich multipath environment. Our work is motivated by physical arguments and recent experimental results that suggest physical channels encountered in practice exhibit a sparse structure, especially at high signal space dimension (i.e., large time-bandwidth product). We first present a model for sparse channels that quantifies the independent degrees of freedom (DoF) in the channel as a function of the physical propagation environment and signal space dimension. The number of DoF represents the delay-Doppler diversity afforded by the channel and, thus, critically impacts detection performance. Our focus is on two types of non-coherent detectors: the energy detector (ED) and the optimal non-coherent detector (ONCD) that assumes full knowledge of channel statistics. Results show, for a uniform distribution of paths in delay and Doppler, the channel exhibits a rich structure at low signal space dimension and then gets progressively sparser as this dimension is increased. Consequently, the performance of the detectors is identical in the rich regime. As the signal space dimension is increased and the channel becomes sparser, the ED suffers significant degradation in performance relative to the ONCD. Finally, our results show the existence of an optimal signal space dimension - one that yields the best detection performance - as a function of the physical channel characteristics and the operating signal to noise ratio (SNR).
After the result on Toeplitz matrices, which eventually could be used for coded aperture here is a new paper for measurement matrices having other properties: A Restricted Isometry Property for Structurally-Subsampled Unitary Matrices by Waheed Bajwa, Akbar Sayeed and Robert Nowak, The abstract reads:
Subsampled (or partial) Fourier matrices were originally introduced in the compressive sensing literature by Candes et al. Later, in papers by Candes and Tao and Rudelson and Vershynin, it was shown that (random) subsampling of the rows of many other classes of unitary matrices also yield effective sensing matrices. The key requirement is that the rows of U, the unitary matrix, must be highly incoherent with the basis in which the signal is sparse. In this paper, we consider acquisition systems that—despite sensing sparse signals in an incoherent domain—cannot randomly subsample rows from U. We consider a general class of systems in which the sensing matrix corresponds to subsampling of the rows of matrices of the form  = RU (instead of U), where R is typically a low-rank matrix whose structure reflects the physical/technological constraints of the acquisition system. We use the term “structurally-subsampled unitary matrices” to describe such sensing matrices. We investigate the restricted isometry property of a particular class of structurally-subsampled unitary matrices that arise naturally in application areas such as multiple-antenna channel estimation and sub-nyquist sampling. In addition, we discuss an immediate application of this work in the area of wireless channel estimation, where the main results of this paper can be applied to the estimation of multiple-antenna orthogonal frequency division multiplexing channels that have sparse impulse responses.

Finally, here is a recent presentation by Toh Kim Chuan on An accelerated proximal gradient method for nuclear norm minimization.

LCROSS Zoomed in image of the impact plume. The extent of the plume at 15 sec is approximately 6-8 km in diameter. Credit: NASA Click for full resolution.

Monday, October 19, 2009

CS: KSVDS-Box and OMPS-Box, Simultaneous Sparse Approximation, Inferring Ranking using Constrained Sensing, Justice Pursuit

Another blogger who went to the OSA meeting on top of David Brady blogged about the meeting and his encounter with compressive sensing and ghost imaging :-). On top of the previous answers, Angshul Majumdar seems to think that a "very neat answer to his question" is the Sparse Signal Restoration course by Ivan Selesnick at: http://cnx.org/content/m32168/latest/

Rebounding on Lianlin Li's blog entry (English translation here) of this week-end that features the second reference of this entry namely Phase control algorithms for focusing light through turbid media by Ivo Vellekoop and Allard Mosk and discusses Sparse Bayesian Algorithm, I allso want to show two figures from that paper:

There is a discussion of three algorithms for focusing light through some random medium by using different set of configurations for the Spatial Light Modulator (SLM). The first set is equivalent to the usual raster mode, while the third one has a random permutation component to it which is not unlike what we have in Compressive Sensing. The convergence of these algorithms can be shown in the following figures:

One final note on this work is that it is more or less equivalent to how one used to solve coded aperture problems and this is why I think CS reconstruction algorithms might provide faster way of focusing light.

Ron Rubinstein has released the KSVDS-Box and OMPS-Box packages..." These packages implement the OMP and K-SVD algorithms for sparse dictionaries, as introduced in their paper Double Sparsity - Learning Sparse Dictionaries for Sparse Signal Approximation (see below). He also published the packages KSVD-Boxv13 and OMP-Box v10. These new versions reduce memory consumption, accelerate computation, and resolve a few minor bugs..."

  • OMP-Box v10 Implementation of the Batch-OMP and OMP-Cholesky algorithms for quick sparse-coding of large sets of signals.
  • OMPS-Box v1 Implementation of the Batch-OMP and OMP-Cholesky algorithms for sparse dictionaries.
  • KSVD-Box v13 Implementation of the K-SVD and Approximate K-SVD dictionary training algorithms, and the K-SVD Denoising algorithm. Requires OMP-Box v10.
  • KSVDS-Box v11 Implementation of the sparse K-SVD dictionary training algorithm and the sparse K-SVD Denoising algorithm. Requires OMPS-Box v1. The package is also available without demo volumes (less recommended) at KSVDS-Box v11-min.
The paper is : Double Sparsity: Learning Sparse Dictionaries for Sparse Signal Approximation by Ron Rubinstein, Michael Zibulevsky and Michael Elad. The abstract reads:
An efficient and flexible dictionary structure is proposed for sparse and redundant signal representation. The proposed sparse dictionary is based on a sparsity model of the dictionary atoms over a base dictionary, and takes the form D = \Phi A where \Phi is a fixed base dictionary and A is sparse. The sparse dictionary provides efficient forward and adjoint operators, has a compact representation, and can be effectively trained from given example data. In this, the sparse structure bridges the gap between implicit dictionaries, which have efficient implementations yet lack adaptability, and explicit dictionaries, which are fully adaptable but non-efficient and costly to deploy. In this paper we discuss the advantages of sparse dictionaries, and present an efficient algorithm for training them. We demonstrate the advantages of the proposed structure for 3-D image denoising.
Here are a set of papers I found on the interwebs: Simultaneous Sparse Approximation : insights and algorithms by Alain Rakotomamonjy. The abstract reads:
This paper addresses the problem of simultaneous sparse approximation of signals, given an overcomplete dictionary of elementary functions, with a joint sparsity profile induced by a ℓp − ℓq mixednorm. Our contributions are essentially two-fold i) making connections between such an approach and other methods available in the literature and ii) on providing algorithms for solving the problem with different values of p and q. At first, we introduce a simple algorithm for solving the multiple signals extension of the Basis Pursuit Denoising problem (p = 1 and q = 2). Then, we show that for general sparsity-inducing ℓp − ℓq mixed-norm penalty, this optimization problem is actually equivalent to an automatic relevance determination problem. From this insight, we derive an simple EM-like algorithm for problems with ℓ1 − ℓq2 penalty. For addressing approximation problem with non-convex penalty (p \lt 1), we propose an iterative reweighted Multiple-Basis Pursuit ; we trade the non-convexity of the problem against several resolutions of the convex multiple-basis pursuit problem. Relations between such a reweighted algorithm and the Multiple-Sparse Bayesian Learning are also highlighted. Experimental results show how our algorithms behave and how they compare to related approaches (such as CosAmp) for solving simultaneous sparse approximation problem.
and Inferring Ranking using Constrained Sensing by Srikanth Jagabathula and Devavrat Shah. The abstract reads:
We consider the problem of recovering a function over the space of permutations (or, the symmetric group) over n elements from a given partial set of its Fourier series coefficients. This problem naturally arises in several applications such as ranked elections, multi-object tracking, ranking systems and recommendation systems. This problem has been widely studied in the context of discrete-time functions in the recently popular compressive sensing literature; however, the existing approaches don’t naturally extend to our problem setup. Inspired by the work of Donoho and Stark [?] in the context of discrete-time functions, we focus on non-negative functions with a sparse support (support size \lt\lt domain size). Our recovery method is based on finding the sparsest solution (through ℓ0 optimization) that is consistent with the available information. As the main result, we derive sufficient conditions for the functions that can be recovered exactly from a partial set of Fourier coefficients through ℓ0 optimization. Under a natural random model for the generation of functions, we quantify the recoverability conditions by deriving bounds on the sparsity (support size) for which the function satisfies the sufficient conditions with a high probability as n → ∞. Since ℓ0 optimization is computationally hard, its convex relaxation, ℓ1 optimization, is typically considered; however, we show that ℓ1 optimization fails to recover a function generated using the random model with a high probability as n → ∞. In order to overcome this problem, we propose a novel iterative algorithm for the recovery of functions that satisfy the sufficient conditions. Finally, using an Information Theoretic framework, we study the necessity conditions for exact recovery to be possible.
The related shorter NIPS08 paper is here. From the paper:
...Assuming that the function is sparse, our approach to performing exact recovery is to find the function with the sparsest support that is consistent with the given partial information, henceforth referred to as ℓ0 optimization. This approach is often justified by the philosophy of Occam’s razor. However, as illustrated in 2.3.1, the sparsest solution is not always the correct solution. Thus, we derive sufficient conditions in terms of sparsity (support size) for functions that can be recovered through ℓ0 optimization. Furthermore, finding a function with the sparsest support through ℓ0 minimization is in general computationally hard. This problem is typically overcome by considering the convex relaxation of the ℓ0 optimization problem. However, as we show in Theorem 3.1, such a convex relaxation does not yield exact recovery in our case. Thus, we propose a simple iterative algorithm and prove that the algorithm performs exact recovery of functions that satisfy the sufficient conditions.
Our work can be considered as an extension of the compressive sensing literature to non-commutative groups. To the best of our knowledge, our work is the first to consider exact recovery of functions over non-commutative groups from a partial set of Fourier coefficients.

They are lots of interesting things, however further down, one can read the following:
The sparsest recovery approach of this paper is similar (in flavor) to the above stated work. However, the methods or approaches of the prior work do not apply. In a nutshell, in our setup too, we are interested in recovering a certain sparse vector x from data y = Ax. However, the corresponding matrix A is given rather than a design choice. Moreover, the matrix A is dependent on the structure of the space of permutations. An important development of the above stated work is the characterization of a class of sufficient conditions (on the structure of A) for recovery, collectively known as the Restricted Isoperimetric Property (RIP) (see [CRT06b, BGI+08]) of matrix A. However, such sufficient conditions trivially fail in our setup (see [JS08]). Therefore, a new approach is required.

Does null space property also fail in their set-up ? I guess that takes us back to the discussion we had last week.

And finally from the Rice Compressive Sensing repository, we have:

Exact signal recovery from sparsely corrupted measurements through the pursuit of justice by Jason Laska, Mark Davenport, and Richard Baraniuk. The abstract reads:
Compressive sensing provides a framework for recoveringsparse signals of length N from M \lt\lt N measurements. If the measurements contain noise bounded by \epsilon, then standard algorithms recover sparse signals with error at most C \epsilon However, these algorithms perform suboptimally when the measurement noise is also sparse. This can occur in practice due to shot noise, malfunctioning hardware, transmission errors, or narrowband interference. We demonstrate that a simple algorithm, which we dub Justice Pursuit (JP), can achieve exact recovery from measurements corrupted with sparse noise. The algorithm handles unbounded errors, has no input parameters, and is easily implemented via standard recovery techniques.

Friday, October 16, 2009

CS: Food for thoughts for the week-end.

Before all of you go home for the week-end, let me add one more thing to the entry of yesterday and then some. Even though I suck, I don't think I am the only one who is going to save that entry, print it and parse it over the week-end. Looks like Giuseppe Paleologo is going to do the same but I am sure we won't be the only ones. Giuseppe added the following in the comment section of yesterday's entry:

I have to say that it's quite amazing that a tweet I shot to Igor cascaded in two detailed replies from none other than Jared Tanner and Remi Gribonval. Thanks to Jared for the nice geometric intuition provided for the NSP. I still have to digest his comment, and his recent paper with Blanchard and Cartis.

Briefly, I should mention that the question of RIP vs NSP is interesting to me because in statistical applications where A is observational, RIP is not obviously satisfied. However, even when it is not satisfied, l1 minimization or lasso perform well, so possibly there is a different explanation for their success. In this respect, I am especially interested in stable recovery under the two assumptions.

I have been reading some papers and presentations by Remi, and have been thinking about his statement: "however the RIP seems necessary to guarantee robustness to noise" (which appears in the recent IEEE paper). In most papers this seems to be the case, because stability depends obviously on the metric properties of the encoder, which RIP captures. The (lone?) exception is a recent paper of Yin Zhang (http://www.caam.rice.edu/~zhang/reports/tr0811_revised.pdf), which characterizes stability not using RIP or the canonical NSP, but still in terms of projections on the null space. The quantity of interest in that analysis is the value of (||x||_1/||x||_2) for x \in N(A), which is different from NSP......
thanks Giuseppe !

By the way, don't miss Giuseppe's very interesting entry (and the comments) on being an Intern at Enron. As a person who was nearby Houston at that time, it really looked like some of the people who worked there, either did not have a clue about the real business of Enron or were fooling themselves in believing that some part of their business was actually making money. I do not know if I was the only one, but I clearly remember something was terribly amiss when Jeffrey Skilling had an interview on the Houston Chronicle saying he wanted to retire to spend more time with his family, that was August 15th 2001.

Returning to RIP vs NSP, I'll add on top of Giuseppe's comment that one of the most important step for an engineer to convince herself/himself that compressive sensing might be a good thing for her/him, revolves around trying to fit their known "measurement" device to the CS setting. An example of that was recently featured in TomoPIV meets Compressed Sensing by Stefania Petra and Christoph Schnörr or in the approach I am suggesting when we 'only' have integro-differential equations for which we have some knowledge about computing eigenfunctions fo said operators. Let us note also the fact that NSP can also include a more specific class of signals: positive signals.

I think, at some point, I am also going to have to write down what some of you have been saying to me about the spectral gap of sparse measurement matrices in the RIP-1 setting and its potential relationship to the eigenfunctions of that operator. In particular my interest is in finding out if there is a relationship between the non-normality of a measurement operator and its potential to be a good/bad expander and how this influence their goodness or badness for doing CS with them. All this is highly speculative.

Unrelated to this, two entries by David Brady got my attention recently:
Finally, let us recall that the focusing using random materials and different combinations of an SLM take about 30 minutes to perform according to Ivo Vellekoop's PhD thesis. Could any of the solvers used in CS reconstruction enable a faster focusing ? I bet they would.



While we are on the subject of random lens cameras, I wanted to talk about this a while back, but I have not had the time to do so, so I am leaving it as a food for thought for the week-end. The good folks at MIT (the Photonics Bandgap Fibers and Devices Group under the leadership of Yoel Fink) have devised a camera based on fiber optics. The interesting technology has pieces of metal inside the fiber optic that allows outside objects to shine directly into the light path of the fiber optic so that it is eventually channeled eventually at some common point down the fiber. In effect, this is a multiplexing operation and one could have this fiber optics pipes woven inside apparels to provide some "random" measurement of the surrounding. Since the cloth follows the movement of the body, one can only imagine how the work in signal manifolds would fit in. Some more information on the MIT technology can be found here and in the reference below.

As you all know, I am very much interested in developing a dirt cheap Compressive Sensing application for the purpose of making CS more "obvious" to the average tinkerer/maker. Hence, I wonder how expensive or inexpensive one could fabricate something equivalent to these fibers ? anybody know ? maybe randomly scratching normal fibers to let light come in, that doesn't sound too bad of an idea. The problem being that one wants to avoid breaking these fibers.

Finally, the stunning (in detail) first photo of this entry was taken from the international space station by one of the astronauts. Could we have a different lens mount and perform the equivalent of a random lens imaging experiment with them ? If you are interested we need to talk about an opportunity like this with Nanoracks (you won't see a mention of it on the web yet).


Reference: Abouraddy, A.F., Shapira, O., Bayindir, M., Arnold, J., Sorin, F., Saygin-Hinczewski, D., Joannopoulos, J.D., Fink, Y., "Large-scale optical-field measurements with geometric fibre constructs", Nature Materials 5, 532-536 (2006). [Full Text (pdf)]

Credit:
1 st/2nd Photo: NASA/Bill Ingalls, photo taken from the international space station, 300 km above the scene of interest (the landing of the Soyuz spacecraft in Kazakhstan). Photo taken with one of these instruments on board.
3rd Photo: NASA/JPL/University of Arizona, USGS Dune Database Entry (ESP_014426_2070), Photo taken by the Hirise camera of some of Mars' dunes. (via the Bad Astronomy blog).

Thursday, October 15, 2009

CS: RIP vs NSP, Clarifications by Jared Tanner and Remi Gribonval

Yesterday's post got some nice feedback. Two things:

  • Thank you to Jared Tanner and Remi Gribonval for being kind enough to take some time off to answer yesterday's question.
  • I suck as I have covered some of these issues/papers before but cannot seem to have a good grasp on this issue.
Here are these answers:

Jared Tanner was the first one to respond to yesterday's post:

Dear Igor,

I just read your Oct. 14th posting on Nuit Blanche where the following question was raised by Giuseppe Paleologo:

"do you know how much weaker is the nullspace property vs. restricted isometry? Are there studies on this?"

and your comment that "I am sure a specialist could clear that up for all of us by sending me a short E-mail." Here are a few take away "bullets", followed by a longer discussion.

a) The RIP is a generic tool that can be used for many algorithms, whenever sparsity is the simplicity being exploited.

b) The NSP is only applicable to sparse approximation questions where there is an "objective", such as l1 minimization.

c) For l1 minimization, the NSP is a weaker assumption than the RIP, see "Compressed Sensing: How sharp is the RIP?" by Blanchard, Cartis, and Tanner.

d) l1 minimization is the only setting where NSP and the RIP have been used to try to answer the same question, and is the only setting where it is fair to compare them. NSP is equivalent to l1 minimization recovery, and for this reason RIP implies the NSP, but not the other way around.

e) Both methods can be used to imply stability.

f) Many matrix ensembles are proven to have bounded RIP (Gaussian, Fourier, etc...). Many matrix ensembles are proven to have the NSP, see " Counting the faces of randomly-projected hypercubes and orthants, with applications" by Donoho and Tanner, to appear in Discrete and Computational Geometry.

Now for the longer discussion:

First of all, what is the NSP? Lets focus on the case of \min \|x\|_1 subject to Ax=b where there is an x_0 that is k-sparse with Ax_0=b and A is of size n by N. (Note the ordering k\lt n \lt N.) min l1 recovers x_0 if and only if thereis no vector \nu in the null space of A such that x_0+\nu is interior to (or intersects the boundary of) the l1 ball defined by \|x_0\|_1. (See image above depicting this, with the blue object being the l1 ball, the yellow circle being x_0 and the red line being the null space of A.) The NSP asks if this occurs for any x_0 that is k sparse, effectively moving x_0 to each of the 2^N (N \choose k) k-faces of the l1 ball. That the null space of A shifted to k-faces of the l1 ball never goes interior probably seems a very strict requirement, but it often holds. In fact, this notion isn't new, it is referred to as "neighborliness" in the convex polytope community. David L. Donoho analyzed this question in detail in 2005, precisely characterizing the values of (k,n,N) when this does and does not hold, see "Neighborly Polytopes and Sparse Solutions of Underdetermined Linear Equations."

How does this compare with RIP? First of all, RIP has nothing to do with l1 minimization, but is clearly a natural property to desire in sparse approximation. This is both the strength and weakness of the RIP. It has been used to prove convergence results for many algorithms, including many for which the null space property would give no information. However, because it is a general tool, the RIP based conditions are generally very restrictive. There is only one venue where it is appropriate to directly compare the null space property and the RIP, this is the recovery results for l1 minimization where both can be used. Not surprisingly, the null space results derived by Donoho are much weaker than associated RIP based results.

This raises a very related point, how does one read RIP based results? When is it true that RIP constants of order 2k are less then 0.45? (See Foucart and Lai, ACHA 2009, pages 395-407 for this l1 recovery condition.) How much more restrictive is it to require RIP constants of order 3k to be less than say 0.2? (See Subspace Pursuit by Dai and Milenkovic.) Some answer this question by saying that matrix class zzz has bounded RIP constants so this is satisfied when n \gt O(k\log(N/k)). Unfortunately, this type of answer does not allow one to distinguish between the above two conditions, and essentially says that all state of the art CS algorithms behave at the optimal order. Although true, this might not be helpful to a practitioner who wants to know what is the best algorithm for them, and how large n must be. To shed some light on this, Jeffrey B. Blanchard, Coralia Cartis, and myself derived improved bounds on the RIP constants for the Gaussian ensemble, and showed how these bounds can be used to make more transparent when the above mentioned conditions are satisfied; allowing one to take the values k,n,N and easily verify if the bound would be satisfied. (Note these bounds require large problem sizes to be effective.) Andrew Thompson joined us to repeat this process for many of the greedy algorithms, see "Phase transitions for greedy sparse approximation algorithms" at: http://www.maths.ed.ac.uk/~tanner/publist.html

Lastly, the NSP and RIP prove a "strong equivalence", that the matrix can be used for compressed sensing of any k sparse vector. There is a less restrictive version of the NSP, which proves what Donoho termed "weak equivalence". Weak equivalence ensures that the matrix can be used for compressed sensing for all but a vanishingly small fraction of k sparse vectors. Note, this is what one observes in practice when
selecting a k sparse vector at random and tests the an algorithms performance. (For a discussion of weak equivalence see "Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing" by Donoho and Tanner.) Efforts are underway to derive weak versions of the RIP, see "Construction of a Large Class of Deterministic Sensing Matrices".

Hope that answers a few questions....

Then, Remi Gribonval sent the following:

Dear Igor,

Regarding the RIP vs NSP discussion, I think our recent IEEE Trans. Inf. Theory paper with Mike Davies pretty much shows how sharper the NSP is wrt the RIP. The paper can be found here and our preprint was discussed on a previous post on Nuit Blanche. We also have a conference paper at SAMPTA09 which considers stable versions of the NSP, seemingly identical to the "truncated NSP" of Wotao Yin and Yilun Wang mentioned in a recent post. This paper is available here.

One way to summarize the results is the following
  • Any matrix satisfying the RIP with delta_2m \lt 0.414 (slightly better constants are known) must satisfy the NSP of order m, and L1 minimization therefore recovers all m-sparse vectors.
  • There exists matrices with RIP delta_2m arbitrarily close to 1/sqrt(2) which fail to satisfy the NSP of order m, and for which L1 will therefore fail to recover some m-sparse vectors
  • Yet, there are matrices with RIP arbitrarily close to 1 which satisfy the NSP of order m, and where L1 is therefore successful.
With this respect,
  • The RIP is a sufficient condition to guarantee that the NSP (or, even better, its stable version) is satisfied. The RIP (implicitly, with a small RIP constant) is not necessary to guarantee the recovery of sparse vectors (and the stable approximate recovery of compressible vectors).
  • however the RIP seems necessary to guarantee robustness to noise
I guess that similar results would hold for the RIP-1 property and other sufficient conditions which differ from the NSP which is a necessary and sufficient condition. Another line of thought is that of Jared Tanner and David Donoho, where they consider "weak recovery thresholds" for specific families of random matrices, that is to say they allow an exponentially small fraction of m-sparse vectors that may not be recovered. Then, the NSP no longer characterizes the threshold, but the RIP is even less accurate.

I hope that this contributes to clarifying these questions....

@article{Davies:2008ab,
Author = {Davies, M. E. and Gribonval, R{\'e}mi},
Journal = {IEEE Trans. Inform. Theory},
Month = may,
Number = {5},
Pages = {2203--2214},
Title = {Restricted Isometry Constants where $\ell^p$ sparse recovery can fail for $0 <>
Volume = {55},
Year = {2009}}

@inproceedings{Davies:2009aa,
Address = {Marseille, France},
Author = {Davies, M. E. and Gribonval, R{\'e}mi},
Booktitle = {Proc. SAMPTA'09 (Sampling Theory and Applications)},
Month = {may},
Title = {On Lp minimisation, instance optimality, and restricted isometry constants for sparse approximation},
Year = {2009}}

Thank you Jared and Remi !