## Tuesday, July 31, 2012

### Compressive Sensing and Advanced Matrix Factorization This Month

This month I spotted quite a few papers, conference and presentations. Here they are are. There is no particular order except for featuring the application / hardware papers first.

First we have a conference:

Then some presentations

And the more applied papers starting with this Press release on Novel Compressed Sensing Reconstruction Developed for Photoacoustic Tomography in Vivo featuring this publication:

Compressed sensing (CS) can recover sparse signals from under-sampled measurements. In this work, we have developed an advanced CS framework for photoacoustic computed tomography (PACT). During the reconstruction, a small part of the nonzero signals’ locations in the transformed sparse domain is used as partially known support (PKS). PACT reconstructions have been performed with under-sampled in vivo image data of human hands and a rat. Compared to PACT with basic CS, PACT with CS-PKS can recover signals using fewer ultrasonic transducer elements and can improve convergence speed, which may ultimately enable high-speed, low-cost PACT for various biomedical applications.

Cell tracking using perfluorocarbon labels and fluorine-19 ((19) F) MRI is a noninvasive approach to visualize and quantify cell populations in vivo. In this study, we investigated three-dimensional compressed sensing methods to accelerate (19) F MRI data acquisition for cell tracking and evaluate the impact of acceleration on (19) F signal quantification. We show that a greater than 8-fold reduction in imaging time was feasible without pronounced image degradation and with minimal impact on the image signal-to-noise ratio and (19) F quantification accuracy. In (19) F phantom studies, we show that apparent feature topology is maintained with compressed sensing reconstruction, and false positive signals do not appear in areas devoid of fluorine. We apply the three-dimensional compressed sensing (19) F MRI methods to quantify the macrophage burden in a localized wounding-inflammation mouse model in vivo; at 8-fold image acceleration, the (19) F signal distribution was accurately reproduced, with no loss in signal-to-noise ratio. Our results demonstrate that three-dimensional compressed sensing methods have potential for advancing in vivo (19) F cell tracking for a wide range of preclinical and translational applications

We propose a novel method called compressed sensing with linear-in-wavenumber sampling (k-linear CS) to retrieve an image for spectral-domain optical coherence tomography (SD-OCT). An array of points that is evenly spaced in wavenumber domain is sampled from an original interferogram by a preset k-linear mask. Then the compressed sensing based on l1 norm minimization is applied on these points to reconstruct an A-scan data. To get an OCT image, this method uses less than 20% of the total data as required in the typical process and gets rid of the spectral calibration with numerical interpolation in traditional CS-OCT. Therefore k-linear CS is favorable for high speed imaging. It is demonstrated that the k-linear CS has the same axial resolution performance with ∼30  dB higher signal-to-noise ratio (SNR) as compared with the numerical interpolation. Imaging of bio-tissue by SD-OCT with k-linear CS is also demonstrated.

PURPOSE:
To apply compressed sensing (CS) to in vivo multispectral imaging (MSI), which uses additional encoding to avoid magnetic resonance imaging (MRI) artifacts near metal, and demonstrate the feasibility of CS-MSI in postoperative spinal imaging.
MATERIALS AND METHODS:
Thirteen subjects referred for spinal MRI were examined using T2-weighted MSI. A CS undersampling factor was first determined using a structural similarity index as a metric for image quality. Next, these fully sampled datasets were retrospectively undersampled using a variable-density random sampling scheme and reconstructed using an iterative soft-thresholding method. The fully and undersampled images were compared using a 5-point scale. Prospectively undersampled CS-MSI data were also acquired from two subjects to ensure that the prospective random sampling did not affect the image quality.
RESULTS:
A two-fold outer reduction factor was deemed feasible for the spinal datasets. CS-MSI images were shown to be equivalent or better than the original MSI images in all categories: nerve visualization: P = 0.00018; image artifact: P = 0.00031; image quality: P = 0.0030. No alteration of image quality and T2 contrast was observed from prospectively undersampled CS-MSI.
CONCLUSION:
This study shows that the inherently sparse nature of MSI data allows modest undersampling followed by CS reconstruction with no loss of diagnostic quality.

The computational ghost imaging with a phase spatial light modulator (SLM) for wave field coding is considered. A transmission-mask amplitude object is reconstructed from multiple intensity observations. Compressive techniques are used in order to gain a successful image reconstruction with a number of observations (measurement experiments), which is smaller than the image size. Maximum likelihood style algorithms are developed, respectively, for Poissonian and approximate Gaussian modeling of random observations. A sparse and overcomplete modeling of the object enables the advanced high accuracy and sharp imaging. Numerical experiments demonstrate that an approximative Gaussian distribution with an invariant variance results in the algorithm that is efficient for Poissonian observations.

Compressed sensing reconstruction of undersampled 3D NOESY spectra: application to large membrane proteins by Bostock MJ, Holland DJ, Nietlispach D. The abstract reads:
Central to structural studies of biomolecules are multidimensional experiments. These are lengthy to record due to the requirement to sample the full Nyquist grid. Time savings can be achieved through undersampling the indirectly-detected dimensions combined with non-Fourier Transform (FT) processing, provided the experimental signal-to-noise ratio is sufficient. Alternatively, resolution and signal-to-noise can be improved within a given experiment time. However, non-FT based reconstruction of undersampled spectra that encompass a wide signal dynamic range is strongly impeded by the non-linear behaviour of many methods, which further compromises the detection of weak peaks. Here we show, through an application to a larger α-helical membrane protein under crowded spectral conditions, the potential use of compressed sensing (CS) l (1)-norm minimization to reconstruct undersampled 3D NOESY spectra. Substantial signal overlap and low sensitivity make this a demanding application, which strongly benefits from the improvements in signal-to-noise and resolution per unit time achieved through the undersampling approach. The quality of the reconstructions is assessed under varying conditions. We show that the CS approach is robust to noise and, despite significant spectral overlap, is able to reconstruct high quality spectra from data sets recorded in far less than half the amount of time required for regular sampling.

Ultrafast magnetic resonance imaging, employing spiral reciprocal space sampling and compressed sensing image reconstruction, is used to acquire velocity maps of the liquid phase in gas-liquid multiphase flows. Velocity maps were acquired at a rate of 188 frames per second. The method enables quantitative characterization of the wake dynamics of single bubbles and bubble swarms. To illustrate this, we use the new technique to demonstrate the role of bubble wake vorticity in driving bubble secondary motions, and in governing the structure of turbulence in multiphase flows.

A Common Network Architecture Efﬁciently Implements a Variety of Sparsity-based Inference Problems by
Adam S. Charles, Pierre Garrigues, Christopher J. Rozell. The abstract reads:

The sparse coding hypothesis has generated signiﬁcant interest in the computational and theoretical neuroscience communities, but there remain open questions about the exact quantitative form of the sparsity penalty and the implementation of such a coding rule in neurally plausible architectures. The main contribution of this work is to show that a wide variety of sparsity-based probabilistic inference problems proposed in the signal processing and statistics literatures can be implemented exactly in the common network architecture known as the Locally Competitive Algorithm (LCA). Among the cost functions we examine are approximate p norms (0 p 2), modiﬁed p-norms, block-1 norms, and re-weighted algorithms. Of particular interest is that we show signiﬁcantly increased performance in re-weighted 1 algorithms by inferring all parameters jointly in a dynamical system rather than using an iterative approach native to digital computational architectures.

A Compressed Sensing Parameter Extraction Platform for Radar Pulse Signal Acquisition by Juhwan Yoo, Christopher Turnes, Eric Nakamura, Chi Le, Stephen Becker, Emilio Sovero, Michael Wakin, Michael Grant, Justin Romberg, Azita Emami-Neyestanak, and Emmanuel Candes. The abstract reads:
In this paper we present a complete (hardware/software) sub-Nyquist rate ( 13) wideband signal acquisition chain capable of acquiring radar pulse parameters in an instantaneous bandwidth spanning 100 MHz–2:5 GHz with the equivalent of 8 ENOB digitizing performance. The approach is based on the alternative sensing-paradigm of CompressedSensing (CS). The hardware platform features a fully-integrated CS receiver architecture named the random-modulation preintegrator (RMPI) fabricated in Northrop Grumman’s 450 nm InP HBT bipolar technology. The software back-end consists of a novel CS parameter recovery algorithm which extracts information about the signal without performing full timedomain signal reconstruction. This approach signiﬁcantly reduces the computational overhead involved in retrieving desired information which demonstrates an avenue toward employing CS techniques in power-constrained real-time applications. The developed techniques are validated on CS samples physically measured by the fabricated RMPI and measurement results are presented. The parameter estimation algorithms are described in detail and a complete description of the physical hardware is given.

From the Arxiv pipeline:

# PETRELS: Parallel Estimation and Tracking of Subspace by Recursive Least Squares from Partial Observations

Many real world data sets exhibit an embedding of low-dimensional structure in a high-dimensional manifold. Examples include images, videos and internet traffic data. It is of great significance to reduce the storage requirements and computational complexity when the data dimension is high. Therefore we consider the problem of reconstructing a data stream from a small subset of its entries, where the data is assumed to lie in a low-dimensional linear subspace, possibly corrupted by noise. We further consider tracking the change of the underlying subspace, which can be applied to applications such as video denoising, network monitoring and anomaly detection. Our problem can be viewed as a sequential low-rank matrix completion problem in which the subspace is learned in an on-line fashion. The proposed algorithm, dubbed Parallel Estimation and Tracking by REcursive Least Squares (PETRELS), first identifies the underlying low-dimensional subspace via a recursive procedure for each row of the subspace matrix in parallel with discounting for previous observations, and then reconstructs the missing entries via least-squares estimation if required. Numerical examples are provided for direction-of-arrival estimation and matrix completion, comparing PETRELS with state of the art batch algorithms.

# Compressed Sensing off the Grid

We consider the problem of estimating the frequency components of a mixture of $s$ complex sinusoids from a random subset of $n$ regularly spaced samples. Unlike previous work in compressed sensing, the frequencies are not assumed to lie on a grid, but can assume any values in the normalized frequency domain [0,1]. We propose an atomic norm minimization approach to exactly recover the unobserved samples. We reformulate this atomic norm minimization as an exact semidefinite program. Even with this continuous dictionary, we show that most sampling sets of size O(s\log s \log n) are sufficient to guarantee the exact frequency estimation with high probability, provided the frequencies are well separated. Extensive numerical experiments are performed to illustrate the effectiveness of the proposed method.

# Optimal Sampling Points in Reproducing Kernel Hilbert Spaces

The recent developments of basis pursuit and compressed sensing seek to extract information from as few samples as possible. In such applications, since the number of samples is restricted, one should deploy the sampling points wisely. We are motivated to study the optimal distribution of finite sampling points. Formulation under the framework of optimal reconstruction yields a minimization problem. In the discrete case, we estimate the distance between the optimal subspace resulting from a general Karhunen-Loeve transform and the kernel space to obtain another algorithm that is computationally favorable. Numerical experiments are then presented to illustrate the performance of the algorithms for the searching of optimal sampling points.

# Nearfield Acoustic Holography using sparsity and compressive sampling principles

Gilles Chardon (LOA), Laurent Daudet (LOA), Antoine Peillot (IJLRA), François Ollivier (IJLRA), Nancy Bertin (INRIA - IRISA), Rémi Gribonval (INRIA - IRISA)
Regularization of the inverse problem is a complex issue when using Near-field Acoustic Holography (NAH) techniques to identify the vibrating sources. This paper shows that, for convex homogeneous plates with arbitrary boundary conditions, new regularization schemes can be developed, based on the sparsity of the normal velocity of the plate in a well-designed basis, i.e. the possibility to approximate it as a weighted sum of few elementary basis functions. In particular, these new techniques can handle discontinuities of the velocity field at the boundaries, which can be problematic with standard techniques. This comes at the cost of a higher computational complexity to solve the associated optimization problem, though it remains easily tractable with out-of-the-box software. Furthermore, this sparsity framework allows us to take advantage of the concept of Compressive Sampling: under some conditions on the sampling process (here, the design of a random array, which can be numerically and experimentally validated), it is possible to reconstruct the sparse signals with significantly less measurements (i.e., microphones) than classically required. After introducing the different concepts, this paper presents numerical and experimental results of NAH with two plate geometries, and compares the advantages and limitations of these sparsity-based techniques over standard Tikhonov regularization.

# Bounds of restricted isometry constants in extreme asymptotics: formulae for Gaussian matrices

Restricted Isometry Constants (RICs) provide a measure of how far from an isometry a matrix can be when acting on sparse vectors. This, and related quantities, provide a mechanism by which standard eigen-analysis can be applied to topics relying on sparsity. RIC bounds have been presented for a variety of random matrices and matrix dimension and sparsity ranges. We provide explicitly formulae for RIC bounds, of n by N Gaussian matrices with sparsity k, in three settings: a) n/N fixed and k/n approaching zero, b) k/n fixed and n/N approaching zero, and c) n/N approaching zero with k/n decaying inverse logrithmically in N/n; in these three settings the RICs a) decay to zero, b) become unbounded (or approach inherent bounds), and c) approach a non-zero constant. Implications of these results for RIC based analysis of compressed sensing algorithms are presented.

# Nonmonotone Barzilai-Borwein Gradient Algorithm for $\ell_1$-Regularized Nonsmooth Minimization in Compressive Sensing

This paper is devoted to minimizing the sum of a smooth function and a nonsmooth $\ell_1$-regularized term. This problem as a special cases includes the $\ell_1$-regularized convex minimization problem in signal processing, compressive sensing, machine learning, data mining, etc. However, the non-differentiability of the $\ell_1$-norm causes more challenging especially in large problems encountered in many practical applications. This paper proposes, analyzes, and tests a Barzilai-Borwein gradient algorithm. At each iteration, the generated search direction enjoys descent property and can be easily derived by minimizing a local approximal quadratic model and simultaneously taking the favorable structure of the $\ell_1$-norm. Moreover, a nonmonotone line search technique is incorporated to find a suitable stepsize along this direction. The algorithm is easily performed, where the values of the objective function and the gradient of the smooth term are required at per-iteration. Under some conditions, the proposed algorithm is shown to be globally convergent. The limited experiments by using some nonconvex unconstrained problems from CUTEr library with additive $\ell_1$-regularization illustrate that the proposed algorithm performs quite well. Extensive experiments for $\ell_1$-regularized least squares problems in compressive sensing verify that our algorithm compares favorably with several state-of-the-art algorithms which are specifically designed in recent years.

# Approximate Message Passing with Consistent Parameter Estimation and Applications to Sparse Learning

We consider the estimation of an i.i.d. vector $\xbf \in \R^n$ from measurements $\ybf \in \R^m$ obtained by a general cascade model consisting of a known linear transform followed by a probabilistic componentwise (possibly nonlinear) measurement channel. A novel method, called adaptive generalized approximate message passing (Adaptive GAMP), that enables joint learning of the statistics of the prior and measurement channel along with estimation of the unknown vector $\xbf$ is presented. The proposed algorithm is a generalization of a recently-developed technique by Vila and Schniter that uses expectation-maximization (EM) iterations where the posteriors in the E-steps are computed via approximate message passing. The proposed methodology can be applied to a large class of learning problems including the learning of sparse priors in compressed sensing or identification of linear-nonlinear cascade models in dynamical systems and neural spiking processes. We prove that for large i.i.d. Gaussian transform matrices the asymptotic componentwise behavior of the adaptive GAMP algorithm is predicted by a simple set of scalar state evolution equations. This analysis shows that the adaptive GAMP method can yield asymptotically consistent parameter estimates, which implies that the algorithm achieves a reconstruction quality equivalent to the oracle algorithm that knows the correct parameter values. The adaptive GAMP methodology thus provides a systematic, general and computationally efficient method applicable to a large range of complex linear-nonlinear models with provable guarantees.

# Structure-Based Bayesian Sparse Reconstruction

Sparse signal reconstruction algorithms have attracted research attention due to their wide applications in various fields. In this paper, we present a simple Bayesian approach that utilizes the sparsity constraint and a priori statistical information (Gaussian or otherwise) to obtain near optimal estimates. In addition, we make use of the rich structure of the sensing matrix encountered in many signal processing applications to develop a fast sparse recovery algorithm. The computational complexity of the proposed algorithm is relatively low compared with the widely used convex relaxation methods as well as greedy matching pursuit techniques, especially at a low sparsity rate.

# Compressed sensing for multidimensional electronic spectroscopy experiments

Compressed sensing is a processing method that significantly reduces the number of measurements needed to accurately resolve signals in many fields of science and engineering. We develop a two-dimensional (2D) variant of compressed sensing for multidimensional electronic spectroscopy and apply it to experimental data. For the model system of atomic rubidium vapor, we find that compressed sensing provides significantly better resolution of 2D spectra than a conventional discrete Fourier transform from the same experimental data. We believe that by combining powerful resolution with ease of use, compressed sensing can be a powerful tool for the analysis and interpretation of ultrafast spectroscopy data.

# Color Constancy based on Image Similarity via Bilayer Sparse Coding

Computational color constancy is a very important topic in computer vision and has attracted many researchers' attention. Recently, lots of research has shown the effects of high level visual content information for illumination estimation. However, all of these existing methods are essentially combinational strategies in which image's content analysis is only used to guide the combination or selection from a variety of individual illumination estimation methods. In this paper, we propose a novel bilayer sparse coding model for illumination estimation that considers image similarity in terms of both low level color distribution and high level image scene content simultaneously. For the purpose, the image's scene content information is integrated with its color distribution to obtain optimal illumination estimation model. The experimental results on two real-world image sets show that our algorithm is superior to other prevailing illumination estimation methods, even better than combinational methods.

# Vanishingly Sparse Matrices and Expander Graphs, with application to compressed sensing

We consider a probabilistic construction of sparse random matrices where each column has a fixed number of nonzeros whose row indices are drawn uniformly at random. These matrices have a one-to-one correspondence with the adjacency matrices of fixed left degree expander graphs. We present formulae for the expected cardinality of the set of neighbours for these graphs, and present tail bounds on the probability that this cardinality will be less than the expected value. Deducible from these bounds are similar bounds for the expansion of the graph which is of interest in many applications. These bounds are derived through a more detailed analysis of collisions in unions of sets using a dyadic splitting technique. The exponential tail bounds yield the best known bounds for the l_1 norm of large rectangular matrices when restricted to act on vectors with few nonzeros; this quantity is referred to as the l_1 norm restricted isometry constants (RIC_1) of the matrix. These bounds allow for quantitative theorems on existence of expander graphs and hence the sparse random matrices we consider and also quantitative compressed sensing sampling theorems when using sparse non mean-zero measurement matrices.

# Compressed sensing with sparse, structured matrices

In the context of the compressed sensing problem we propose a new ensemble of sparse random matrices which allow (i) to acquire/compress a {\rho}-sparse signal of length N in a time linear in N and (ii) to perfectly recover the original signal, compressed at a rate {\alpha}, by using a message passing algorithm (Expectation Maximization Belief Propagation) that runs in a time linear in N. In the large N limit, the scheme proposed here closely approach the theoretical bound {\rho} = {\alpha}, and so it is both optimal and efficient (linear time complexity). More generally, we show that several ensembles of dense random matrices can be converted into ensembles of sparse random matrices, having the same thresholds, but much lower computational complexity.

# Sparse Recovery with Graph Constraints

Motivated by the need to monitor large-scale networks, this paper addresses the problem of sparse recovery with graph constraints. Unlike conventional sparse recovery where a measurement can contain any subset of the unknown variables(nodes), we take an additive measurement over nodes only if they induce a connected subgraph. We provide explicit measurement constructions for several special graphs, and the number of measurements by our construction is less than that needed by existing random constructions. Moreover, our construction for a line network is provably optimal in the sense that it requires the minimum number of measurements. A measurement construction algorithm for general graphs is also proposed and evaluated. For any given graph $G$ with $n$ nodes, we derive bounds of the minimum number of measurements needed to recover any $k$-sparse vector over $G$ ($M^G_{k,n}$). Using the Erd\H{o}s-R\'enyi random graph as an example, we characterize the dependence of $M^G_{k,n}$ on the graph structure. Our study suggests that $M^G_{k,n}$ may serve as a graph connectivity metric.

# Oblique Pursuits for Compressed Sensing

Compressed sensing is a new data acquisition paradigm enabling universal, simple, and reduced-cost acquisition, by exploiting a sparse signal model. Most notably, recovery of the signal by computationally efficient algorithms is guaranteed for certain randomized acquisition systems. However, there is a discrepancy between the theoretical guarantees and practical applications. In applications, including Fourier imaging in various modalities, the measurements are acquired by inner products with vectors selected randomly (sampled) from a frame. Currently available guarantees are derived using a so-called restricted isometry property (RIP), which has only been shown to hold under ideal assumptions. For example, the sampling from the frame needs to be independent and identically distributed with the uniform distribution, and the frame must be tight. In practice though, one or more of the ideal assumptions is typically violated and none of the existing guarantees applies.
Motivated by this discrepancy, we propose two related changes in the existing framework: (i) a generalized RIP called the restricted biorthogonality property (RBOP); and (ii) correspondingly modified versions of existing greedy pursuit algorithms, which we call oblique pursuits. Oblique pursuits are guaranteed using the RBOP without requiring ideal assumptions; hence, the guarantees apply to practical acquisition schemes. Numerical results show that oblique pursuits also perform competitively with, or sometimes better than their conventional counterparts.

# Greedy-Like Algorithms for the Cosparse Analysis Model

Raja GiryesSangnam Nam (INRIA - IRISA), Michael EladRémi Gribonval (INRIA - IRISA), Mike E. Davies
The cosparse analysis model has been introduced recently as an interesting alternative to the standard sparse synthesis approach. A prominent question brought up by this new construction is the analysis pursuit problem -- the need to find a signal belonging to this model, given a set of corrupted measurements of it. Several pursuit methods have already been proposed based on $\ell_1$ relaxation and a greedy approach. In this work we pursue this question further, and propose a new family of pursuit algorithms for the cosparse analysis model, mimicking the greedy-like methods -- compressive sampling matching pursuit (CoSaMP), subspace pursuit (SP), iterative hard thresholding (IHT) and hard thresholding pursuit (HTP). Assuming the availability of a near optimal projection scheme that finds the nearest cosparse subspace to any vector, we provide performance guarantees for these algorithms. Our theoretical study relies on a restricted isometry property adapted to the context of the cosparse analysis model. We explore empirically the performance of these algorithms by adopting a plain thresholding projection, demonstrating their good performance.

# Computation of 2-D spectra assisted by compressed sampling

The computation of scientific data can be very time consuming even if they are ultimately determined by a small number of parameters. The principle of compressed sampling suggests that we can achieve a considerable decrease in the computation time by avoiding the need to sample the full data set. We demonstrate the usefulness of this approach at the hand of 2-D spectra in the context of ultra-fast non-linear spectroscopy of biological systems where numerical calculations are highly challenging due to the considerable computational effort involved in obtaining individual data points.

# SHO-FA: Robust compressive sensing with order-optimal complexity, measurements, and bits

Suppose $x$ is any exactly $k$-sparse vector in $R^n$. We present a class of "sparse" matrices $A$, and a corresponding algorithm that we call SHO-FA (for Short and Fast) that, with high probability over $A$, can reconstruct $x$ from $Ax$. The SHO-FA algorithm is related to the Invertible Bloom Lookup Tables (IBLTs) recently introduced by Goodrich et al., with two important distinctions -- SHO-FA relies on linear measurements, and is robust to noise. The SHO-FA algorithm is the first to simultaneously have the following properties: (a) it requires only $O(k)$ measurements, (b) the bit-precision of each measurement and each arithmetic operation is $O(log(n)+P)$ (here $2^{-P}$ corresponds to the desired relative error in the reconstruction of $x$), (c) the computational complexity of decoding is $O(k)$ arithmetic operations, and (d) if the reconstruction goal is simply to recover a single component of $x$ instead of all of $x$, with high probability over $A$ this can be done in constant time. For a wide range of parameters these properties are information-theoretically order-optimal. In addition, our SHO-FA algorithm is robust to random noise, and (random) approximate sparsity for a large range of $k$. In particular, suppose the measured vector equals $A(x+z)+e$. Under reasonable statistical assumptions on $z$ and $e$ our decoding algorithm reconstructs $x$ with an estimation error of $O(||z||_1 +(log(k))^2 ||e||_1)$. The SHO-FA algorithm works with high probability over $A$, $z$, and $e$, and still requires only $O(k)$ steps and $O(k)$ measurements over $O(log(n))$-bit numbers. This is in contrast to most existing algorithms which focus on the "worst-case" $z$ model, where it is known $\Omega(k log(n/k))$ measurements over $O(log(n))$-bit numbers are necessary. Our algorithm has good empirical performance, as validated by simulations.

# Compressed Sensing of Approximately-Sparse Signals: Phase Transitions and Optimal Reconstruction

Compressed sensing is designed to measure sparse signals directly in a compressed form. However, most signals of interest are only "approximately sparse", i.e. even though the signal contains only a small fraction of relevant (large) components the other components are not strictly equal to zero, but are only close to zero. In this paper we model the approximately sparse signal with a Gaussian distribution of small components, and we study its compressed sensing with dense random matrices. We use replica calculations to determine the mean-squared error of the Bayes-optimal reconstruction for such signals, as a function of the variance of the small components, the density of large components and the measurement rate. We then use the G-AMP algorithm and we quantify the region of parameters for which this algorithm achieves optimality (for large systems). Finally, we show that in the region where the GAMP for the homogeneous measurement matrices is not optimal, a special "seeding" design of a spatially-coupled measurement matrix allows to restore optimality.

# Recoverability Analysis for Modified Compressive Sensing with Partially Known Support

The recently proposed modified-compressive sensing (modified-CS), which utilizes the partially known support as prior knowledge, significantly improves the performance of recovering sparse signals. However, modified-CS depends heavily on the reliability of the known support. An important problem, which must be studied further, is the recoverability of modified-CS when the known support contains a number of errors. In this letter, we analyze the recoverability of modified-CS in a stochastic framework. A sufficient and necessary condition is established for exact recovery of a sparse signal. Utilizing this condition, the recovery probability that reflects the recoverability of modified-CS can be computed explicitly for a sparse signal with \ell nonzero entries, even though the known support exists some errors. Simulation experiments have been carried out to validate our theoretical results.

# Signal Estimation with Arbitrary Error Metrics in Compressed Sensing

Compressed sensing typically deals with the estimation of a system input from its noise-corrupted linear measurements, where the number of measurements is smaller than the number of input components. The performance of the estimation process is usually quantified by some standard error metric such as squared error or support set error. In this correspondence, we consider a noisy compressed sensing problem with any arbitrary error metric. We propose a simple, fast, and highly general algorithm that estimates the original signal by minimizing the error metric defined by the user. We verify that our algorithm is optimal owing to the decoupling principle, and we describe a general method to compute the fundamental information-theoretic performance limit for any error metric. We provide two example metrics --- minimum mean absolute error and minimum mean support error --- and give the theoretical performance limits for these two cases. Experimental results show that our algorithm outperforms methods such as relaxed belief propagation (relaxed BP) and compressive sampling matching pursuit (CoSaMP), and reaches the suggested theoretical limits for our two example metrics.

# Restricted Isometry of Fourier Matrices and List Decodability of Random Linear Codes

We prove that a random linear code over F_q, with probability arbitrarily close to 1, is list decodable at radius (1-1/q-\epsilon) with list size L=O(1/\epsilon^2) and rate R=\Omega_q(\epsilon^2/(log^3(1/\epsilon))). Up to the polylogarithmic factor in (1/\epsilon) and constant factors depending on q, this matches the lower bound L=\Omega_q(1/\epsilon^2) for the list size and upper bound R=O_q(\epsilon^2) for the rate. Previously only existence (and not abundance) of such codes was known for the special case q=2 (Guruswami, H{\aa}stad, Sudan and Zuckerman, 2002).
In order to obtain our result, we employ a relaxed version of the well known Johnson bound on list decoding that translates the average Hamming distance between codewords to list decoding guarantees. We furthermore prove that the desired average-distance guarantees hold for a code provided that a natural complex matrix encoding the codewords satisfies the Restricted Isometry Property with respect to the Euclidean norm (RIP-2). For the case of random binary linear codes, this matrix coincides with a random submatrix of the Hadamard-Walsh transform matrix that is well studied in the compressed sensing literature.
Finally, we improve the analysis of Rudelson and Vershynin (2008) on the number of random frequency samples required for exact reconstruction of k-sparse signals of length N. Specifically, we improve the number of samples from O(k log(N) log^2(k) (log k + loglog N)) to O(k log(N) log^3(k)). The proof involves bounding the expected supremum of a related Gaussian process by using an improved analysis of the metric defined by the process. This improvement is crucial for our application in list decoding.

# On unified view of nullspace-type conditions for recoveries associated with general sparsity structures

We discuss a general notion of "sparsity structure" and associated recoveries of a sparse signal from its linear image of reduced dimension possibly corrupted with noise. Our approach allows for uni?ed treatment of (a) the "usual sparsity" and "usual $\ell_1$ recovery," (b) block-sparsity with possibly overlapping blocks and associated block-$\ell_1$ recovery, and (c) low-rank-oriented recovery by nuclear norm minimization. The proposed recovery routines are natural extensions of the usual $\ell_1$ minimization used in Compressed Sensing. Specifically we present nullspace-type sufficient conditions for the recovery to be precise on sparse signals in the noiseless case. Then we derive error bounds for imperfect (nearly sparse signal, presence of observation noise, etc.) recovery under these conditions. In all of these cases, we present efficiently verifiable sufficient conditions for the validity of the associated nullspace properties.

# Robust Dequantized Compressive Sensing

We consider the reconstruction problem in compressed sensing in which the observations are recorded in a finite number of bits. They may thus contain quantization errors (from being rounded to the nearest representable value) and saturation errors (from being outside the range of representable values). Our formulation has an objective of weighted $\ell_2$-$\ell_1$ type, along with constraints that account explicitly for quantization and saturation errors, and is solved with an augmented Lagrangian method. We prove a consistency result for the recovered solution, stronger than those that have appeared to date in the literature, showing in particular that asymptotic consistency can be obtained without oversampling. We present extensive computational comparisons with formulations proposed previously, and variants thereof.

# Greedy Algorithms for Sparse Reinforcement Learning

Christopher Painter-Wakefield (Duke University), Ronald Parr (Duke University)
Feature selection and regularization are becoming increasingly prominent tools in the efforts of the reinforcement learning (RL) community to expand the reach and applicability of RL. One approach to the problem of feature selection is to impose a sparsity-inducing form of regularization on the learning method. Recent work on $L_1$ regularization has adapted techniques from the supervised learning literature for use with RL. Another approach that has received renewed attention in the supervised learning community is that of using a simple algorithm that greedily adds new features. Such algorithms have many of the good properties of the $L_1$ regularization methods, while also being extremely efficient and, in some cases, allowing theoretical guarantees on recovery of the true form of a sparse target function from sampled data. This paper considers variants of orthogonal matching pursuit (OMP) applied to reinforcement learning. The resulting algorithms are analyzed and compared experimentally with existing $L_1$ regularized approaches. We demonstrate that perhaps the most natural scenario in which one might hope to achieve sparse recovery fails; however, one variant, OMP-BRM, provides promising theoretical guarantees under certain assumptions on the feature dictionary. Another variant, OMP-TD, empirically outperforms prior methods both in approximation accuracy and efficiency on several benchmark problems.

# A Combinatorial Algebraic Approach for the Identifiability of Low-Rank Matrix Completion

Franz Kiraly (TU Berlin), Ryota Tomioka (University of Tokyo)
In this paper, we review the problem of matrix completion and expose its intimate relations with algebraic geometry, combinatorics and graph theory. We present the first necessary and sufficient combinatorial conditions for matrices of arbitrary rank to be identifiable from a set of matrix entries, yielding theoretical constraints and new algorithms for the problem of matrix completion. We conclude by algorithmically evaluating the tightness of the given conditions and algorithms for practically relevant matrix sizes, showing that the algebraic-combinatoric approach can lead to improvements over state-of-the-art matrix completion methods.

# Conditional Sparse Coding and Grouped Multivariate Regression

Min Xu (Carnegie Mellon University), John Lafferty (University of Chicago)
We study the problem of multivariate regression where the data are naturally grouped, and a regression matrix is to be estimated for each group. We propose an approach in which a dictionary of low rank parameter matrices is estimated across groups, and a sparse linear combination of the dictionary elements is estimated to form a model within each group. We refer to the method as conditional sparse coding since it is a coding procedure for the response vectors Y conditioned on the covariate vectors X. This approach captures the shared information across the groups while adapting to the structure within each group. It exploits the same intuition behind sparse coding that has been successfully developed in computer vision and computational neuroscience. We propose an algorithm for conditional sparse coding, analyze its theoretical properties in terms of predictive accuracy, and present the results of simulation and brain imaging experiments that compare the new technique to reduced rank regression.

# Small-sample Brain Mapping: Sparse Recovery on Spatially Correlated Designs with Randomization and Clustering

Gael Varoquaux (INRIA), Alexandre Gramfort (INRIA), Bertrand Thirion (INRIA)
Functional neuroimaging can measure the brain?s response to an external stimulus. It is used to perform brain mapping: identifying from these observations the brain regions involved. This problem can be cast into a linear supervised learning task where the neuroimaging data are used as predictors for the stimulus. Brain mapping is then seen as a support recovery problem. On functional MRI (fMRI) data, this problem is particularly challenging as i) the number of samples is small due to limited acquisition time and ii) the variables are strongly correlated. We propose to overcome these difficulties using sparse regression models over new variables obtained by clustering of the original variables. The use of randomization techniques, e.g. bootstrap samples, and clustering of the variables improves the recovery properties of sparse methods. We demonstrate the benefit of our approach on an extensive simulation study as well as two fMRI datasets.

# Gap Filling in the Plant Kingdom---Trait Prediction Using Hierarchical Probabilistic Matrix Factorization

Hanhuai Shan (University of Minnesota), Jens Kattge (Max Planck Institute for Biogeochemistry), Peter Reich (University of Minnesota), Arindam Banerjee (University of Minnesota), Franziska Schrodt (University of Minnesota), Markus Reichstein (Max Planck Institute for Biogeochemistry)
Plant traits are a key to understanding and predicting the adaptation of ecosystems to environmental changes, which motivates the TRY project aiming at constructing a global database for plant traits and becoming a standard resource for the ecological community. Despite its unprecedented coverage, a large percentage of missing data substantially constrains joint trait analysis. Meanwhile, the trait data is characterized by the hierarchical phylogenetic structure of the plant kingdom. While factorization based matrix completion techniques have been widely used to address the missing data problem, traditional matrix factorization methods are unable to leverage the phylogenetic structure. We propose hierarchical probabilistic matrix factorization (HPMF), which effectively uses hierarchical phylogenetic information for trait prediction. We demonstrate HPMF's high accuracy, effectiveness of incorporating hierarchical structure and ability to capture trait correlation through experiments.

# Learning Invariant Representations with Local Transformations

Kihyuk Sohn (University of Michigan), Honglak Lee (University of Michigan)
Learning invariant representations is an important problem in machine learning and pattern recognition. In this paper, we present a novel framework of transformation-invariant feature learning by incorporating linear transformations into the feature learning algorithms. For example, we present the transformation-invariant restricted Boltzmann machine that compactly represents data by its weights and their transformations, which achieves invariance of the feature representation via probabilistic max pooling. In addition, we show that our transformation-invariant feature learning framework can also be extended to other unsupervised learning methods, such as autoencoders or sparse coding. We evaluate our method on several image classification benchmark datasets, such as MNIST variations, CIFAR-10, and STL-10, and show competitive or superior classification performance when compared to the state-of-the-art. Furthermore, our method achieves state-of-the-art performance on phone classification tasks with the TIMIT dataset, which demonstrates wide applicability of our proposed algorithms to other domains.

# Large-Scale Feature Learning With Spike-and-Slab Sparse Coding

Ian Goodfellow (Universite de Montreal), Aaron Courville (Universite de Montreal), Yoshua Bengio(Universite de Montreal)
We consider the problem of object recognition with a large number of classes. In order to overcome the low amount of labeled examples available in this setting, we introduce a new feature learning and extraction procedure based on a factor model we call spike-and-slab sparse coding (S3C). Prior work on S3C has not prioritized the ability to exploit parallel architectures and scale S3C to the enormous problem sizes needed for object recognition. We present a novel inference procedure for appropriate for use with GPUs which allows us to dramatically increase both the training set size and the amount of latent factors that S3C may be trained with. We demonstrate that this approach improves upon the supervised learning capabilities of both sparse coding and the spike-and-slab Restricted Boltzmann Machine (ssRBM) on the CIFAR-10 dataset. We use the CIFAR-100 dataset to demonstrate that our method scales to large numbers of classes better than previous methods. Finally, we use our method to win the NIPS 2011 Workshop on Challenges In Learning Hierarchical Models? Transfer Learning Challenge.

# Stochastic optimization and sparse statistical recovery: An optimal algorithm for high dimensions

We develop and analyze stochastic optimization algorithms for problems in which the expected loss is strongly convex, and the optimum is (approximately) sparse. Previous approaches are able to exploit only one of these two structures, yielding an $\order(\pdim/T)$ convergence rate for strongly convex objectives in $\pdim$ dimensions, and an $\order(\sqrt{(\spindex \log \pdim)/T})$ convergence rate when the optimum is $\spindex$-sparse. Our algorithm is based on successively solving a series of $\ell_1$-regularized optimization problems using Nesterov's dual averaging algorithm. We establish that the error of our solution after $T$ iterations is at most $\order((\spindex \log\pdim)/T)$, with natural extensions to approximate sparsity. Our results apply to locally Lipschitz losses including the logistic, exponential, hinge and least-squares losses. By recourse to statistical minimax results, we show that our convergence rates are optimal up to multiplicative constant factors. The effectiveness of our approach is also confirmed in numerical simulations, in which we compare to several baselines on a least-squares regression problem.

# Kernelized Supervised Dictionary Learning

In this paper, we propose supervised dictionary learning (SDL) by incorporating information on class labels into the learning of dictionary. To this end, we propose to learn the dictionary in a space where the dependency between the signals and their corresponding labels is maximized. To maximize this dependency, the recently introduced Hilbert Schmidt independence criterion (HSIC) is used. One of the main advantages of this novel approach for SDL is that it can be easily kernelized by incorporating a kernel, particularly a data-derived kernel such as normalized compression distance, into the formulation. The learned dictionary is compact and the proposed approach is fast. We show that it outperforms other unsupervised and supervised dictionary learning approaches in the literature on real-world data.

Fast and Accurate Algorithms for Re-Weighted l_1-Norm Minimization by M. Salman Asif and Justin Romberg. The abstract reads:
To recover a sparse signal from an underdetermined system, we often solve a constrained 1-norm minimization problem. In many cases, the signal sparsity and the recovery performance can be further improved by replacing the 1 norm with a “weighted” 1 norm. Without any prior information about nonzero elements of the signal, the procedure for selecting weights is iterative in nature. Common approaches update the weights at every iteration using the solution of a weighted 1 problem from the previous iteration.
In this paper, we present two homotopy-based algorithms that efﬁciently solve reweighted 1 problems. First, we present an algorithm that quickly updates the solution of a weighted 1 problem as the weights change. Since the solution changes only slightly with small changes in the weights, we develop a homotopy algorithm that replaces the old weights with the new ones in a small number of computationally inexpensive steps. Second, we propose an algorithm that solves a weighted `1 problem by adaptively selecting the weights while estimating the signal. This algorithm integrates the reweighting into every step along the homotopy path by changing the weights according to the changes in the solution and its support, allowing us to achieve a high quality signal reconstruction by solving a single homotopy problem.
We compare the performance of both algorithms, in terms of reconstruction accuracy and computational complexity, against state-of-the-art solvers and show that our methods have smaller computational cost. In addition, we will show that the adaptive selection of the weights inside the homotopy often yields reconstructions of higher quality

Ranked Sparse Signal Support Detection by Alyson K. Fletcher, Sundeep Rangan, and Vivek K Goyal. The abstract reads:

This paper considers the problem of detecting the support (sparsity pattern) of a sparse vector from random noisy measurements. Conditional power of a component of the sparse vector is deﬁned as the energy conditioned on the component being nonzero. Analysis of a simpliﬁed version of orthogonal matching pursuit (OMP) called sequential OMP (SequOMP) demonstrates the importance of knowledge of the rankings of conditional powers. When the simple SequOMP algorithm is applied to components in nonincreasing order of conditional power, the detrimental effect of dynamic range on thresholding performance is eliminated. Furthermore, under the most favorable conditional powers, the performance of SequOMP approaches maximum likelihood performance at high signal-to-noise ratio.

Blind image deconvolution with spatially adaptive total variation regularization by Luxin Yan, Houzhang Fang, and Sheng Zhong. The abstract reads:
A blind deconvolution algorithm with spatially adaptive total variation regularization is introduced. The spatial information in different image regions is incorporated into regularization by using the edge indicator called difference eigenvalue to distinguish edges from flat areas. The proposed algorithm can effectively reduce the noise in flat regions as well as preserve the edge and detailed information. Moreover, it becomes more robust with the change of the regularization parameter. Comparative results on simulated and real degraded images are reported.

A Reconstruction Algorithm for Compressed Sensing Noise Signal by Yulou PENG, Yigang HE. The abstract reads:
The novel theory of Compressed Sensing (CS) reduces the samples of compressible signal sharply by information sampling. In order to improve reconstruction accuracy of noise signal for CS, a Singular Value Decomposition (SVD) noise signal reconstruction algorithm is presented in this paper. This algorithm decomposes the random measurement matrix, modi es the diagonal matrix Eigen values by mean algorithm and then obtains the new linear measurement matrix. The new matrix has higher reconstruction accuracy and this is proved theoretically. The simulation results also show that this algorithm can enhance reconstruction accuracy value in the range of 3 { 5% compared to the existing orthogonal matching pursuit (OMP) algorithm.

Colorization by Matrix Completion
by Shusen Wang and Zhihua Zhang.The abstract reads:
Given a monochrome image and some manually labeled pixels, the colorization problem is a computer-assisted process of adding color to the monochrome image. This paper proposes a novel approach to the colorization problem by formulating it as a matrix completion problem. In particular, taking a monochrome image and parts of the color pixels (labels) as inputs, we develop a robust colorization model and resort to an augmented Lagrange multiplier algorithm for solving the model. Our approach is based on the fact that a matrix can be represented as a low-rank matrix plus a sparse matrix. Our approach is effective because it is able to handle the potentialnoises in the monochrome image and outliers in the labels. Toimprove the performance of our method, we further incorporate a so-called local-color-consistency idea into our method.Empirical results on real data sets are encouraging

Dictionary Learning on Riemannian Manifolds by Yuchen Xie Baba C. Vemuri, Jeffrey Ho.The abstract reads:
Existing dictionary learning algorithms rely heavily on the assumption that the data points are vectors in some Euclidean space Rd, and the dictionary is learned from the input data using only the vector space structure of Rd However, in many applications, features and data points often belong to some Riemannian manifold with its intrinsic metric structure that is potentially important and critical to the application. The extrinsic viewpoint of existing dictionary learning methods becomes inappropriate and inadequate if the intrinsic geometric structure is required to be incorporated in the model. In this paper, we develop a very general dictionary learning framework for data lying on some known Riemannnian manifolds. Using the local linear structures furnished by the Riemannian geometry, we propose a novel dictionary learning algorithm that can be considered as data-speciﬁc, a feature that is not present in the existing methods. We show that both the dictionary and sparse coding can be effectively computed for Riemannian manifolds. We validate the proposed method using a classiﬁcation problem in medical image analysis. The preliminary results demonstrate that the dictionary learned using the proposed method can and does indeed provide real improvements when compared with other direct approaches.

Matrix Completion by Truncated Nuclear Norm Regularization
Debing Zhang, Yao Hu, Jieping Ye, Xuelong Li, Xiaofei He.The abstract reads:
Estimating missing values in visual data is a challenging problem in computer vision, which can be considered as a low rank matrix approximation problem. Most of the recent studies use the nuclear norm as a convex relaxation of the rank operator. However, by minimizing the nuclear norm, all the singular values are simultaneously minimized, and thus the rank can not be well approximated in practice. In this paper, we propose a novel matrix completion algorithm based on the Truncated Nuclear Norm Regularization (TNNR) by only minimizing the smallest N − r singular values, where N is the number of singular values and r is the rank of the matrix. In this way, the rank of the matrix can be better approximated than the nuclear norm. We further develop an efﬁcient iterative procedure to solve the optimization problem by using the alternating direction method of multipliers and the accelerated proximal gradient line search method. Experimental results in a wide range of applications demonstrate the effectiveness of our proposed approach.

Image Collection Summarization via Dictionary Learning for Sparse Representation by Chunlei Yang, Jinye Peng, Jianping Fan.The abstract reads:

In this paper, a novel framework is developed to achieve effective summarization of large-scale image collection by treating the problem of automatic image summarization as the problem of dictionary learning for sparse representation, e.g., the summarization task can be treated as a dictionary learning task (i.e., the given image set can be reconstructed sparsely with this dictionary). For image set of a speciﬁc category or a mixture of multiple categories, we have built a sparsity model to reconstruct all its images by using a subset of most representative images (i.e., image summary); and we adopted the simulated annealing algorithm to learn such sparse dictionary by minimizing an explicit optimization function. By investigating their reconstruction ability under sparsity constrain and diversity constrain, we have quantitatively measure the performance of various summarization algorithms. Our experimental results have shown that our dictionary learning for sparse representation algorithm can obtain more accurate summary as compared with other baseline algorithms.

Daniele Barchiesi, Mark D. Plumbley.The abstract reads:

This article deals with learning dictionaries for sparse approximation whose atoms are both adapted to a training set of signals and mutually incoherent. To meet this objective, we employ a dictionary learning scheme consisting of sparse approximation followed by dictionary update and we add to the latter a decorrelation step in order to reach a target mutual coherence level. This step is accomplished by an iterative projection method followed by a rotation of the dictionary. Experiments on musical audio data illustrate that the proposed algorithm can learn highly incoherent dictionaries while providing a sparse approximation with good signal to noise ratio. We show that using dictionaries with low mutual coherence leads to a faster rate of decay of the residual when using MP for sparse approximation.

Superiorization: An optimization heuristic for medical physics by  G.T. Herman, E. Garduno, R. Davidi and Y. Censor,

Purpose:
To describe and mathematically validate the superiorization methodology, which is a recently-developed heuristic approach to optimization, and to discuss its applicability to medical physics problem formulations that specify the desired solution (of physically given or otherwise obtained constraints) by an optimization criterion.
Methods:
The superiorization methodology is presented as a heuristic solver for a large class of constrained optimization problems. The constraints come from the desire to produce a solution that is constraints-compatible, in the sense of meeting requirements provided by physically or otherwise obtained constraints. The underlying idea is that many iterative algorithms for finding such a solution are perturbation resilient in the sense that, even if certain kinds of changes are made at the end of each iterative step, the algorithm still produces a constraints-compatible solution. This property is exploited by using permitted changes to steer the algorithm to a solution that is not only constraints-compatible, but is also desirable according to a specified optimization criterion. The approach is very general, it is applicable to many iterative procedures and optimization criteria used in medical physics.
Results:
The main practical contribution is a procedure for automatically producing from any given iterative algorithm its superiorized version, which will supply solutions that are superior according to a given optimization criterion. It is shown that if the original iterative algorithm satisfies certain mathematical conditions, then the output of its superiorized version is guaranteed to be as constraints-compatible as the output of the original algorithm, but it is superior to the latter according to the optimization criterion. This intuitive description is made precise in the paper and the stated claims are rigorously proved. Superiorization is illustrated on simulated computerized tomography data of a head cross-section and, in spite of its generality, superiorization is shown to be competitive to an optimization algorithm that is specifically designed to minimize total variation.
Conclusions:
The range of applicability of superiorization to constrained optimization problems is very large. Its major utility is in the automatic nature of producing a superiorization algorithm from an algorithm aimed at only constraints-compatibility; while non-heuristic (exact) approaches need to be redesigned for a new optimization criterion. Thus superiorization provides a quick route to algorithms for the practical solution of constrained optimization problems.