## Tuesday, September 14, 2010

### CS: The long post of the week : Ambulatory ECG, The brain and much mathematics...

This is how I wanted to start my post  today," Today we have mostly a post with a mathematical bent and two of them deal with ridgelets, mmmuhh, I also note that the inclusion of Compressive Sensing as a building block to our comprehension of the brain is building up ...Enjoy!" But then I checked the Twitter, and found out that Pierre Vandergheynst tweeted that he had posted his slide from a presentation in Marseille. So let us start with that and then you can all go for the mathematics:

Pierre Vandergheynst's slides are entitled Compressed Sensing,  Some experimental promises ?. Our french readers should be expecting to see images of slide 22 in La Recherche coming out in October. Aside from the the very nice MRI work mentioned before, I liked the use of the Donoho-Tanner phase transition to evaluate a hardware system.

Some new element of the presentation also included a very interesting trade study on a low-power ECG ambulatory system.

So it looks like this is not optimal in many respect. Yes, you reduce the power level but not with the dramatic effect initially expected. As I have said before, if you design your hardware to be providing exactly the same information as the one you are supposed to displace, then you are really asking to be irrelevant.....fast. The CS encoding not only has a clear advantage over no compression as it is robust to packet losses but more importantly, if Compressive Sensing is bringing one thing to the table, it is really that information is reduced and therefore that smaller classifiers can run some diagnostics on the jailbroken iPhone 3GS, I doubt that a similar classifier can do the job on the uncompressed data without using much needed power resources.

Coming back to the new papers and preprints, let us start with a more hardware oriented paper before diving into more mathematical matters:

Spatially regularized compressed sensing of diffusion MRI data by Oleg Michailovich, Yogesh Rathi. The abstract reads:
The present paper introduces a method for substantial reduction of the number of diffusion encoding gradients required for reliable reconstruction of HARDI signals. The method exploits the theory of compressed sensing (CS), which establishes conditions on which a signal of interest can be recovered from its under-sampled measurements, provided that the signal admits a sparse representation in the domain of a linear transform. In the case at hand, the latter is defined to be spherical ridgelet transformation, which excels in sparsifying HARDI signals. What makes the resulting reconstruction procedure even more accurate is a combination of the sparsity constraints in the diffusion domain with additional constraints imposed on the estimated diffusion field in the spatial domain. Accordingly, the present paper describes a novel way to combine the diffusion- and spatial-domain constraints to achieve a maximal reduction in the number of diffusion measurements, while sacrificing little in terms of reconstruction accuracy. Finally, details are provided on a particularly efficient numerical scheme which can be used to solve the aforementioned reconstruction problem by means of standard and readily available estimation tools. The paper is concluded with experimental results which support the practical value of the proposed reconstruction methodology.

At NIPS the accepted papers list came up but I could not find the actual presentations or papers so I went ahead and looked up the authors and found the following two: Adaptive compressed sensing – a new class of self-organizing coding models for neuroscience by William K. Coulter, Christopher J. Hillar, Guy Isely, Friedrich T. Sommer. The abstract reads:
Sparse coding networks, which utilize unsupervised learning to maximize coding efficiency, have successfully reproduced response properties found in primary visual cortex [1]. However, conventional sparse coding models require that the coding circuit can fully sample the sensory data in a one-to-one fashion, a requirement not supported by experimental data from the thalamo-cortical projection. To relieve these strict wiring requirements, we propose a sparse coding network constructed by introducing synaptic learning in the framework of compressed sensing. We demonstrate that the new model evolves biologically realistic spatially smooth receptive fields despite the fact that the feedforward connectivity subsamples the input and thus the learning has to rely on an impoverished and distorted account of the original visual data. Further, we demonstrate that the model could form a general scheme of cortical communication: it can form meaningful representations in a secondary sensory area, which receives input from the primary sensory area through a “compressing” cortico-cortical projection. Finally, we prove that our model belongs to a new class of sparse coding algorithms in which recurrent
connections are essential in forming the spatial receptive fields.

and Statistical Mechanics of Compressed Sensing by Surya Ganguli, Haim Sompolinsky. The abstract reads:
Compressed sensing (CS) is an important recent advance that shows how to reconstruct sparse high dimensional signals from surprisingly small numbers of random measurements. The nonlinear nature of the reconstruction process poses a challenge to understanding the performance of CS. We employ techniques from the statistical physics of disordered systems to compute the typical behavior of CS as a function of the signal sparsity and measurement density.We find surprising and useful regularities in the nature of errors made by CS, a new phase transition which reveals the possibility of CS for nonnegative signals without optimization, and a new null model for sparse regression.

Also found on Arxiv:
Average best m-term approximation by Jan Vybıral. The abstract reads:
We introduce the concept of average best m-term approximation widths with respect to a probability measure on the unit ball of ℓn p . We estimate these quantities for the embedding id : ℓn p → ℓn q with 0 \lt p \le q \le ∞ for the normalized cone and surface measure. Furthermore, we consider certain tensor product weights and show that a typical vector with respect to such a measure exhibits a strong compressible (i.e. nearly sparse) structure.
Detection boundary in sparse regression by Yuri Ingster. , Alexander Tsybakov, and Nicolas Verzelen.. The abstract reads:
We study the problem of detection of a p-dimensional sparse vector of parameters in the linear regression model with Gaussian noise. We establish the detection boundary, i.e., the necessary and sufficient conditions for the possibility of successful detection as both the sample size n and the dimension p tend to the infinity. Testing procedures that achieve this boundary are also exhibited. Our results encompass the high-dimensional setting (p \gt\gt n). The main message is that, under some conditions, the detection boundary phenomenon that has been proved for the Gaussian sequence model, extends to high-dimensional linear regression. Finally, we establish the detection boundaries when the variance of the noise is unknown. Interestingly, the detection boundaries sometimes depend on the knowledge of the variance in a high-dimensional setting.
Consider a real diagonal deterministic matrix $X_n$ of size $n$ with spectral measure converging to a compactly supported probability measure. We perturb this matrix by adding a random finite rank matrix of a certain form, with delocalized eigenvectors. We show that the joint law of the extreme eigenvalues of the perturbed model satisfies a large deviation principle, in the scale n with a good rate function given by a variational formula. We tackle both cases when the extreme eigenvalues of X_n converge to the edges of the support of the limiting measure and when we allow some eigenvalues of X_n, that we call outliers, to converge out of the bulk. We can also generalise our results to the case when X_n is random, with law proportional to e^{- Trace V(X)}d X, for V growing fast enough at infinity and any perturbation of finite rank with orthonormal eigenvectors.

Consider a deterministic self-adjoint matrix X_n with spectral measure converging to a compactly supported probability measure, the largest and smallest eigenvalues converging to the edges of the limiting measure. We perturb this matrix by adding a random finite rank matrix with delocalized eigenvectors and study the extreme eigenvalues of the deformed model. We show that the eigenvalues converging out of the bulk exhibit Gaussian fluctuations, whereas under additional hypotheses, the eigenvalues sticking to the edges are very close to the eigenvalues of the non-perturbed model and fluctuate in the same scale. We can also generalise these results to the case when X_n is random and get similar behaviour when we deform some classical models such as Wigner or Wishart matrices with rather general entries or the so-called matrix models.

Learning Functions of Few Arbitrary Linear Parameters in High Dimensions by Massimo Fornasier, Karin Schnass, Jan Vybíral. The abstract reads:
Let us assume that $f$ is a continuous function defined on the unit ball of $\mathbb R^d$, of the form $f(x) = g (A x)$, where $A$ is a $k \times d$ matrix and $g$ is a function of $k$ variables for $k \ll d$. We are given a budget $m \in \mathbb N$ of possible point evaluations $f(x_i)$, $i=1,...,m$, of $f$, which we are allowed to query in order to construct a uniform approximating function. Under certain smoothness and variation assumptions on the function $g$, and an {\it arbitrary} choice of the matrix $A$, we present in this paper 1. a sampling choice of the points $\{x_i\}$ drawn at random for each function approximation; 2. an algorithm for computing the approximating function, whose complexity is at most polynomial in the dimension $d$ and in the number $m$ of points. Due to the arbitrariness of $A$, the choice of the sampling points will be according to suitable random distributions and our results hold with overwhelming probability. Our approach uses tools taken from the {\it compressed sensing} framework, recent Chernoff bounds for sums of positive-semidefinite matrices, and classical stability bounds for invariant subspaces of singular value decompositions.
A variant of the Johnson-Lindenstrauss lemma for circulant matrices by Jan Vybíral. The abstract reads:
We continue our study of the Johnson-Lindenstrauss lemma and its connection to circulant matrices started in \cite{HV}. We reduce the bound on $k$ from $k=O(\epsilon^{-2}\log^3n)$ proven there to $k=O(\epsilon^{-2}\log^2n)$. Our technique differs essentially from the one used in \cite{HV}. We employ the discrete Fourier transform and singular value decomposition to deal with the dependency caused by the circulant structure.
An attendant presentation on the subject is here.

Finally we have: A Hierarchical Bayesian Framework for Constructing Sparsity-inducing Priors by Anthony Lee, Francois Caron, Arnaud Doucet, Chris Holmes. The abstract reads:

Variable selection techniques have become increasingly popular amongst statisticians due to an increased number of regression and classification applications involving high-dimensional data where we expect some predictors to be unimportant. In this context, Bayesian variable selection techniques involving Markov chain Monte Carlo exploration of the posterior distribution over models can be prohibitively computationally expensive and so there has been attention paid to quasi-Bayesian approaches such as maximum a posteriori (MAP) estimation using priors that induce sparsity in such estimates. We focus on this latter approach, expanding on the hierarchies proposed to date to provide a Bayesian interpretation and generalization of state-of-the-art penalized optimization approaches and providing simultaneously a natural way to include prior information about parameters within this framework. We give examples of how to use this hierarchy to compute MAP estimates for linear and logistic regression as well as sparse precision-matrix estimates in Gaussian graphical models. In addition, an adaptive group lasso method is derived using the framework

A new version of Compressed Genotyping is available.

Image Credit: NASA/JPL/Space Science Institute
N00163047.jpg was taken on September 10, 2010 and received on Earth September 11, 2010. The camera was pointing toward JANUS, and the image was taken using the CL1 and CL2 filters.