Friday, November 18, 2011

Compressive Sensing Literature This Week

Last time, I forgot to link to Bob's amazing blog entry, here it is Phase transitions in higher dimension. It shows for the first time that Jeremy Vila and Phil Schniter's EMBGAMP solver beats out SL0 which was the best reconstruction algorithm in the noiseless case until now...

Danny visited Phil Schniter and eventually transformed one of his AMP algorithm for GraphLab (a competitor to Hadoop). Looks like we are beginning to get some traction from parallel computing. woohoo.

I just noticed a presentation by Marc Mezard on Statistical physics approach to compressed sensing. This is the presentation I saw a week ago at a seminar and should provide some better background on the important results found in that paper ( read this short summary for some context )..

This past week, I mentioned to Thong Do that one of his software package for structurally random matrices was not available anymore, here is his response:
Dear Igor,

Thank you very much for pointing this out. Our server has recently been crashed. We will put the SRM package back to our server soon. At the moment, people can use this alternative link on my personal website to download the software package:https://sites.google.com/site/thongdojhu/fast_cs_SRM.rar

By the way, we have recently released a new manuscript on Arxiv (http://arxiv.org/abs/1111.1947): Discriminative Local Sparse Representations for Robust Face Recognition. Here's its abstract:

"A key recent advance in face recognition models a test face image as a sparse linear combination of a set of training face images. The resulting sparse representations have been shown to possess robustness against a variety of distortions like random pixel corruption, occlusion and disguise. This approach however makes the restrictive (in many scenarios) assumption that test faces must be perfectly aligned (or registered) to the training data prior to classification. In this paper, we propose a simple yet robust local block-based sparsity model, using adaptively-constructed dictionaries from local features in the training data, to overcome this misalignment problem. Our approach is inspired by human perception: we analyze a series of local discriminative features and combine them to arrive at the final classification decision. We propose a probabilistic graphical model framework to explicitly mine the conditional dependencies between these distinct sparse local features. In particular, we learn discriminative graphs on sparse representations obtained from distinct local slices of a face. Conditional correlations between these sparse features are first discovered (in the training phase), and subsequently exploited to bring about significant improvements in recognition rates. Experimental results obtained on benchmark face databases demonstrate the effectiveness of the proposed algorithms in the presence of multiple registration errors (such as translation, rotation, and scaling) as well as under variations of pose and illumination."

Notice that this manuscript is the journal version of our earlier ICIP 2010 conference paper: Robust face recognition using locally adaptive sparse representation. The manuscript has been submitted to Trans. on Image Processing.
Best regards,
Thong.
Thanks Thong.

In other news, it is called acquisition comprimee in French and it is on wikipedia (the definition needs some changes as people seem to continue on being conflicted with the reconstruction aspect of compressive sensing and the signal acquisition part).

In the calibration world (I will soon open a page on the topic) we have two interesting papers / presentations:

Compressive Sensing under Matrix Uncertainties: An Approximate Message Passing Approach  by Jason T. Parker, Philip SchniterVolkan Cevher. I look forward to an implementation of MU-GAMP. Let us note an implementation of a parametric MU-GAMP for structured measurement matrices

For some it's a calibration step, for others it's crypto-attack, but they deal with the exact same subject: how do you figure out the measurement matrix from measurement and training data:

In the following paper, the author essentially performs the calibration problem on some structured measurement matrices: Toeplitz and random Fourier. On Discovering the Compressive Sensing Matrix From Few Signal/Measurement Pairs by Hyrum S. Anderson. The abstract reads:
Recently, it has been shown that a randomly generated compressive sensing (CS) measurement matrix serves as a long cryptographic key that provides a degree of secrecy: an eavesdropper that inspects only the compressed samples can learn almost nothing about the encoded sparse vector. In practical CS applications, however, the measurement matrix is often structured in order to simplify CS reconstruction, and this structure effectively reduces the length of the cryptographic key. This work addresses the scenario in which the eavesdropper observes a few signal/measurement pairs, and investigates how many pairs must be observed in order to exactly recover the CS measurement matrix. Two matrix classes are studied that are commonly employed: Toeplitz, and random Fourier. Experiments demonstrate recovery from few input/output pairs.

LEARNING RIDGE FUNCTIONS WITH RANDOMIZED SAMPLING IN HIGH DIMENSIONS by Hemant Tyagi and Volkan Cevher. The abstract reads:
We study the problem of learning ridge functions of the form f(x) = g(aTx), x 2 Rd, from random samples. Assuming g to be a twice continuously differentiable function, we leverage techniques from low rank matrix recovery literature to derive a uniform approximation guarantee for estimation of the ridge function f. Our new analysis removes the de facto compressibility assumption on the parameter a for learning in the existing literature. Interestingly, the price one has to pay in high dimensional settings is not much: the sampling complexity changes from O(log d) to O(d) or fromO(d2log d) to O(d4), depending on the smoothness properties of g at the origin.

Recipes on Hard Thresholding Methods by Anastasios Kyrillidis and Volkan Cevher. The abstract reads:
We present and analyze a new set of sparse recovery algorithms within the class of hard thresholding methods. We provide optimal strategies on how to set up these algorithms via basic “ingredients” for different conﬁgurations to achieve complexity vs. accuracy tradeoffs. Simulation results demonstrate notable performance improvements compared to state-of-the-art algorithms both in terms of data reconstruction and computational complexity.

EQUIVALENCE OF SYNTHESIS AND ATOMIC FORMULATIONS OF SPARSE RECOVERY by Mitra Fatemi, Shayan Dashmiz, Mohammad Hossein Shaﬁnia, and Volkan Cevher. The abstract reads:
Recovery of sparse signals from linear, dimensionality reducing measurements broadly fall under two well-known formulations, named the synthesis and the analysis a la Elad et al. Recently, ´ Chandrasekaran et al. introduced a new algorithmic sparse recovery framework based on the convex geometry of linear inverse problems, called the atomic norm formulation. In this paper, we prove that atomic norm formulation and synthesis formulation are equivalent for closed atomic sets. Hence, it is possible to use the synthesis formulation in order to obtain the so-called atomic decompositions of signals. In order to numerically observe this equivalence we derive exact linear matrix inequality representations, also known as the theta bodies, of the centrosymmertic polytopes formed from the columns of the simplex and their antipodes. We then illustrate that the atomic and synthesis recovery results agree on machine precision on randomly generated sparse recovery problems.

We study the sparsity of spectro-temporal representation of speech in reverberant acoustic conditions. This study motivates the use of structured sparsity models for efﬁcient speech recovery. We formulate the underdetermined convolutive speech separation in spectro-temporal domain as the sparse signal recovery where we leverage model-based recovery algorithms. To tackle the ambiguity of the real acoustics, we exploit the Image Model of the enclosures to estimate the room impulse response function through a structured sparsity constraint optimization. The experiments conducted on real data recordings demonstrate the effectiveness of the proposed approach for multi-party speech applications.
Here is technical report on the same subject.

In arxiv, here is what I found on top of the matrix factorization/compressive papers mentioned earlier this week:

This paper investigates analysis operator learning for the recently introduced cosparse signal model that is a natural analysis complement to the more traditional sparse signal model. Previous work on such analysis operator learning has relied on access to a set of clean training samples. Here we introduce a new learning framework which can use training data which is corrupted by noise and/or is only approximately cosparse. The new model assumes that a p-cosparse signal exists in an epsilon neighborhood of each data point. The operator is assumed to be uniformly normalized tight frame (UNTF) to exclude some trivial operators. In this setting, a bi-level optimization algorithm is introduced to learn a suitable analysis operator

Accuracy guaranties for $\ell_1$ recovery of block-sparse signals by Anatoli Juditsky, Fatma Kılınc Karzan, Arkadi Nemirovski, Boris Polyak. The absteact reads:
We discuss new methods for the recovery of signals with block-sparsestructure, based on $\ell_1$-minimization. Our emphasis is on verifiable conditions on the problem parameters (sensing matrix and the block structure) for accurate recovery and efficiently computable bounds for the recovery error. These bounds are then optimized with respect to the method parameters to construct the estimators with improved statistical properties. To justify the proposed approach we provide an oracle inequality which links the properties of the recovery algorithms and the best estimation performance. We also propose a new matching pursuit algorithm for block-sparse recovery.

Fingerprinting with Equiangular Tight Frames by Dustin G. Mixon, Christopher J. Quinn, Negar Kiyavash, Matthew Fickus. The abstract reads:
Digital fingerprinting is a framework for marking media files, such as images, music, or movies, with user-specific signatures to deter illegal distribution. Multiple users can collude to produce a forgery that can potentially overcome a fingerprinting system. This paper proposes an equiangular tight frame fingerprint design which is robust to such collusion attacks. We motivate this design by considering digital fingerprinting in terms of compressed sensing. The attack is modeled as linear averaging of multiple marked copies before adding a Gaussian noise vector. The content owner can then determine guilt by exploiting correlation between each user's fingerprint and the forged copy. The worst-case error probability of this detection scheme is analyzed and bounded. Simulation results demonstrate the average-case performance is similar to the performance of orthogonal and simplex fingerprint designs, while accommodating several times as many users.
wuth the attendant related presentation and video of the talk.

In this paper we propose a novel form of clipping mitigation in OFDM using compressive sensing that completely avoids tone reservation and hence rate loss for this purpose. The method builds on selecting the most reliable perturbations from the constellation lattice upon decoding at the receiver, and performs compressive sensing over these observations in order to completely recover the temporally sparse nonlinear distortion. As such, the method provides a unique practical solution to the problem of initial erroneous decoding decisions in iterative ML methods, offering both the ability to augment these techniques and to solely recover the distorted signal in one shot.

Weighted eigenfunction estimates with applications to compressed sensing by Nicolas Burq, Semyon Dyatlov, Rachel Ward, Maciej Zworski. The abstract reads:
Using tools from semiclassical analysis, we give weighted L^\infty estimates for eigenfunctions of strictly convex surfaces of revolution. These estimates give rise to new sampling techniques and provide improved bounds on the number of samples necessary for recovering sparse eigenfunction expansions on surfaces of revolution. On the sphere, our estimates imply that any function having an s-sparse expansion in the first N spherical harmonics can be efficiently recovered from its values at m > s N^(1/6) log^4(N) sampling points.

Improved Bound for the Nystrom's Method and its Application to Kernel Classification by Rong Jin, Tianbao Yang, Mehrdad Mahdavi. The abstract reads:
We develop three approaches for analyzing the approximation bound for the Nystrom method, one based on the matrix perturbation theory, one based on the concentration inequality of integral operator, and one based on the incoherence measure introduced in compressive sensing. The new analysis improves the approximation error of the Nystrom method from $O(m^{-1/4})$ to $O(m^{-1/2})$, and further to $O(m^{-p})$ if the eigenvalues of the kernel matrix follow a $p$-power law, which explains why the Nystrom method works very well for kernel matrix with skewed eigenvalues. We develop a kernel classification approach based on the Nystrom method and derive its generalized performance. We show that when the eigenvalues of kernel matrix follow a $p$-power law, we can reduce the number of support vectors to $O(N^{2/(p+1)})$ without seriously sacrificing its generalized performance.

In this paper, we improve iterative greedy search algorithms in which atoms are selected serially over iterations, i.e., one-by-one over iterations. For serial atom selection, we devise two new schemes to select an atom from a set of potential atoms in each iteration. The two new schemes lead to two new algorithms. For both the algorithms, in each iteration, the set of potential atoms is found using a standard matched filter. In case of the first scheme, we propose an orthogonal projection strategy that selects an atom from the set of potential atoms. Then, for the second scheme, we propose a look ahead strategy such that the selection of an atom in the current iteration has an effect on the future iterations. The use of look ahead strategy requires a higher computational resource. To achieve a trade-off between performance and complexity, we use the two new schemes in cascade and develop a third new algorithm. Through experimental evaluations, we compare the proposed algorithms with existing greedy search and convex relaxation algorithms.

It is well known that $\ell_1$ minimization can be used to recover sufficiently sparse unknown signals from compressed linear measurements. In fact, exact thresholds on the sparsity, as a function of the ratio between the system dimensions, so that with high probability almost all sparse signals can be recovered from i.i.d. Gaussian measurements, have been computed and are referred to as "weak thresholds" \cite{D}. In this paper, we introduce a reweighted $\ell_1$ recovery algorithm composed of two steps: a standard $\ell_1$ minimization step to identify a set of entries where the signal is likely to reside, and a weighted $\ell_1$ minimization step where entries outside this set are penalized. For signals where the non-sparse component entries are independent and identically drawn from certain classes of distributions, (including most well known continuous distributions), we prove a \emph{strict} improvement in the weak recovery threshold. Our analysis suggests that the level of improvement in the weak threshold depends on the behavior of the distribution at the origin. Numerical simulations verify the distribution dependence of the threshold improvement very well, and suggest that in the case of i.i.d. Gaussian nonzero entries, the improvement can be quite impressive---over 20% in the example we consider.

In this paper, we study the degrees of freedom (df) of penalized l1 minimization (also known as the Lasso) for linear regression models. We give a closed-form expression of the degrees of freedom of the Lasso response. Namely, we show that for any given Lasso regularization parameter \lambda and any observed data y belongs to a set of full measure, the cardinal of the support of a particular solution of the Lasso problem is an unbiased estimator of the degrees of freedom of the Lasso response. This is achieved without any assumption on the uniqueness of the Lasso solution. Thus, our result remains true for both the underdetermined and the overdetermined case studied originally in Zou et al.. We also prove that a key result in Zou et al. is not true by providing a simple counterexample. An effective estimator of the number of degrees of freedom may have several applications including an objectively guided choice of the regularization parameter in the Lasso through the SURE framework.

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.