Nuit Blanche: CS: Q&A, Noise reduction through CS, reconstruction from undersampled k-t space, L1-minimization methods for Hamilton

Thursday, May 15, 2008

CS: Q&A, Noise reduction through CS, reconstruction from undersampled k-t space, L1-minimization methods for Hamilton–Jacobi equations.

In a previous entry, I mentioned the work ofAlexandre d'Aspremont, Laurent El Ghaoui, Michael I. Jordan, Gert Lanckriet featured in A Direct Formulation for Sparse PCA Using Semidefinite Programming. In one of the slide (27) I put on the blog one can read:

"....Upper bounds for sparse PCA prove deterministically and with polynomial complexity that a finite dimensional matrix satisfies the restricted isometry condition...."

I had not paid real attention to that part of the slide but Jort Gemmeke did and asked the following question in the comment section:

I have trouble understanding the conclusion on the sparse PCA slide: what does it mean that "a finite dimensional matrix" satisfies the RIP property? Could you give an interpretation?"

Alexandre kindly responded offline with:

I mean that sparse PCA produces upper bounds on the restricted isometry constant, hence can potentially prove that a given finite dimensional matrix follows the RIP instead of relying on asymptotic results.

That certainly clears it up. Thanks Alexandre and Jort.

While on their sites, I found the following preprints/talks of relevance to Compressed Sensing:

Jort Gemmeke and Bert Cranen released Noise reduction through Compressed Sensing. The abstract reads:

We present an exemplar-based method for noise reduction using missing data imputation: A noise-corrupted word is sparsely represented in an over-complete basis of exemplar (clean) speech signals using only the uncorrupted time-frequency elements of the word. Prior to recognition the parts of the spectrogram dominated by noise are replaced by clean speech estimates obtained by projecting the sparse representation in the basis. Since at low SNRs individual frames may contain few, if any, uncorrupted coefficients, the method tries to exploit all reliable information that is available in a word-length time window. We study the effectiveness of this approach on the Interspeech 2008 Consonant Challenge (VCV) data as well as on AURORA-2 data. Using oracle masks, we obtain obtain accuracies of 36-44% on the VCV data. On AURORA-2 we obtain an accuracy of 91% at SNR -5 dB, compared to 61% using a conventional frame-based approach, clearly illustrating the great potential of the method.

This is outstanding, here is what I take away from this paper:

Experiments on AURORA-2 using an oracle mask, however, also clearly show the potential of the presented method: A recognition accuracy of 91% at SNR = -5 dB is obtained, an increase of 30% absolute over a state-of the art missing data speech recognizer using frame by frame imputation. This shows that even at very low SNRs enough information about the speech signal may be preserved to successfully perform imputation solely on the basis of reliable time frequency cells provided enough time-context is used.

State dependent oracle masks for improved dynamical features by Jort Gemmeke and Bert Cranen. The abstract reads:

Using the AURORA-2 digit recognition task, we show that recognition accuracies obtained with classical, SNR based oracle masks can be substantially improved by using a state dependent mask estimation technique.

Classification on incomplete data: imputation is optional by Jort Gemmeke. The abstract reads:

We present a non-parametric technique capable of performing classification directly on incomplete data, optionally performing imputation. The technique works by sparsely representing the available data in a basis of example data. Experiments on a spoken digit classification task show significant improvement over a baseline missing-data classifier.

Noise robust digit recognition using sparse representations by Jort Gemmeke and Bert Cranen. The abstract reads:

Despite the use of noise robustness techniques, automatic speech recognition (ASR) systems make many more recognition errors than humans, especially in very noisy circumstances. We argue that this inferior recognition performance is largely due to the fact that in ASR speech is typically processed on a frame by-frame basis preventing the redundancy in the speech signal to be optimally exploited. We present a novel non-parametric classification method that can handle missing data while simultaneously exploiting the dependencies between the reliable features in an entire word. We compare the new method with a state-of-the-art HMM-based speech decoder in which missing data are imputed on a frame-by-frame basis. Both methods are tested on a single digit recognition task (based on AURORA-2 data) using an oracle and an estimated harmonicity mask. We show that at an SNR of -5 dB using the reliable features of an entire word allows an accuracy of 91% (using mel-log-energy features in combination with an oracle mask), while a conventional frame-based approach achieves only 61%. Results obtained with the harmonicity mask suggest that this specific mask estimation technique is simply unable to deliver sufficient reliable features for acceptable recognition rates at these low SNRs.

Accelerating dynamic spiral MRI by algebraic reconstruction from undersampled k-t space by
T. Shin, J.F. Nielsen, Krishna S. Nayak. The abstract reads:

The temporal resolution of dynamic magnetic resonance imaging (MRI) can be increased by sampling a fraction of k-space in an interleaved fashion, which introduces spatial and temporal aliasing. We describe algebraically and graphically the aliasing process caused by dynamic undersampled spiral imaging within 3-D xyf space (the Fourier transform of k_x, k_y, t space) and formulate the unaliasing problem as a set of independent linear inversions. Since each linear system is numerically underdetermined, the use of prior knowledge in the form of bounded support regions is proposed. To overcome the excessive memory requirements for handling large matrices, a fast implementation of the conjugate gradient (CG) method is used. Numerical simulation and in vivo experiments using spiral twofold undersampling demonstrate reduced motion artifacts and the improved depiction of fine cardiac structures. The achieved reduction of motion artifacts and motion blur is comparable to simple filtering, which is computationally more efficient, while the proposed algebraic framework offers greater flexibility to incorporate additional algebraic acceleration techniques and to handle arbitrary sampling schemes.

A new approximation technique based on L1-minimization is introduced. It is proven that the approximate solution converges to the viscosity solution in the case of one-dimensional stationary Hamilton–Jacobi equation with convex Hamiltonian.

Credit: NASA/JPL/Space Science Institute, Daphnis and Prometheus, photo taken by Cassini five weeks ago.