Tuesday, March 23, 2010

CS: A long entry


Craig Stuntz asks What is Homomorphic Encryption, and Why Should I Care? We have seen homomorphic transformations before and how they parallel compressed sensing.

Bob Sturm recently joined the University of Aalborg in Denmark and he and others in his group are writing a blog on issues related to dictionary building in sound. The blog is here.

For some odd reason, I have not kept track of the presentations made by Allen Yang in the past year; here is an interesting list:
  • CVPR Tutorial, 2009: Sparse Representation and Its Applications in Computer Vision.
    • Introduction: Face Recognition [slides]
    • Compressive Sensing Theory [slides]
    • Applications: Distributed Sensing and Perception [slides]
    • Robust PCA [slides]
  • Spring, 2009: High-Dimensional Multi-Model Estimation -- A Mini Course. [slides]
  • Spring, 2008: Compressed Sensing Meets Machine Learning -- A Mini Course. [lecture 1, lecture 2, lecture 3]

The Restricted Isometry Constants (RIC) of a matrix A measure how close the action of A on vectors with few nonzero entries is to an isometry, measured in the l2 norm. Specifically, the upper and lower RIC of a matrix A of size n × N are the maximum and minimum deviations from unity (one) of the largest and smallest, respectively, squared singular values of all N-choose-k matrices formed by taking k columns from A. Calculation of the RIC is intractable for most matrices due to its combinatorial nature; however, many random matrices typically have bounded RIC in some range of problem sizes (k, n, N). We provide the best known bound on the RIC for Gaussian matrices, which is also the smallest known bound on the RIC for any large rectangular matrix. Improvements over prior bounds are achieved by exploiting the similarity of singular values for matrices which share a substantial number of columns.
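To make the definition above concrete, here is a small Python sketch (mine, not the authors') that computes the upper and lower RIC of a toy Gaussian matrix by exhaustive enumeration of its k-column submatrices; the combinatorial cost means this is only feasible for very small (k, n, N):

```python
import numpy as np
from itertools import combinations

def empirical_ric(A, k):
    """Brute-force upper/lower RIC of A for sparsity k.

    Returns (L, U) such that for every n x k submatrix A_S,
    all squared singular values lie in [1 - L, 1 + U].
    Only feasible for small N and k (N-choose-k submatrices).
    """
    n, N = A.shape
    lower, upper = 0.0, 0.0
    for S in combinations(range(N), k):
        s = np.linalg.svd(A[:, S], compute_uv=False)
        upper = max(upper, s[0] ** 2 - 1.0)   # largest squared singular value above 1
        lower = max(lower, 1.0 - s[-1] ** 2)  # smallest squared singular value below 1
    return lower, upper

# Example: a small Gaussian matrix with the usual 1/sqrt(n) column normalization.
rng = np.random.default_rng(0)
n, N, k = 20, 40, 3
A = rng.standard_normal((n, N)) / np.sqrt(n)
L, U = empirical_ric(A, k)
print(f"lower RIC ~ {L:.3f}, upper RIC ~ {U:.3f}")
```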

The site where one can input \delta and \rho to get the upper and lower bounds on the restricted isometry constant is at: http://ecos.maths.ed.ac.uk/ric_bounds.shtml. Also of related interest we have:

A generic tool for analyzing sparse approximation algorithms is the restricted isometry property (RIP) introduced by Candes and Tao. For qualitative comparison of sufficient conditions derived from an RIP analysis, the support size of the RIP constants is generally reduced as much as possible with the goal of achieving a support size of twice the sparsity of the target signal. Using a quantitative comparison via phase transitions for Gaussian measurement matrices, three examples from the literature of such support size reduction are investigated. In each case, utilizing a larger support size for the RIP constants results in a sufficient condition for exact sparse recovery satisfied, with high probability, by a significantly larger subset of Gaussian matrices.
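For reference, the (symmetric) RIP constant of order k referred to above is the smallest \delta_k for which the first inequality below holds; the second line is the classic sufficient condition of Candes for exact l1 recovery, included here as a restatement for context rather than quoted from the paper:

```latex
% Standard symmetric RIP definition and a classic sufficient condition
% (restated for context; not taken from the paper discussed above).
\[
  (1-\delta_k)\,\|x\|_2^2 \;\le\; \|Ax\|_2^2 \;\le\; (1+\delta_k)\,\|x\|_2^2
  \qquad \text{for all } k\text{-sparse } x,
\]
\[
  \delta_{2k} < \sqrt{2} - 1 \;\Longrightarrow\;
  \text{every } k\text{-sparse } x \text{ is recovered exactly by } \ell_1 \text{ minimization (Cand\`es, 2008).}
\]
```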
My webcrawler also found the following papers on the interweb. First, some slides for a paper I mentioned before: Turbo Reconstruction of Structured Sparse Signals by Phil Schniter. The abstract reads:
This paper considers the reconstruction of structured-sparse signals from noisy linear observations. In particular, the support of the signal coefficients is parameterized by a hidden binary pattern, and a structured probabilistic prior (e.g., Markov chain/field/tree) is assumed on the pattern. Exact inference is discussed and an approximate inference scheme, based on loopy belief propagation (BP), is proposed. The proposed scheme iterates between exploitation of the observation structure and exploitation of the pattern structure, and is closely related to noncoherent turbo equalization, as used in digital communication receivers. An algorithm that exploits the observation structure is then detailed based on approximate message passing ideas. The application of EXIT charts is discussed, and empirical phase transition plots are calculated for Markov-chain structured sparsity.
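To see what a Markov-chain structured sparsity pattern looks like, here is a small Python sketch (my own illustration, not Schniter's code) that draws a support pattern from a two-state Markov chain and a signal living on that support; the transition probabilities are arbitrary illustrative values:

```python
import numpy as np

def markov_chain_support(n, p01=0.05, p10=0.4, rng=None):
    """Draw a binary support pattern s in {0,1}^n from a 2-state Markov chain.

    p01 = P(s[i]=1 | s[i-1]=0), p10 = P(s[i]=0 | s[i-1]=1).
    A small p01 with a moderate p10 yields sparse patterns whose active
    coefficients cluster into runs, unlike i.i.d. Bernoulli sparsity.
    """
    rng = rng or np.random.default_rng()
    pi1 = p01 / (p01 + p10)              # stationary probability of state 1
    s = np.empty(n, dtype=int)
    s[0] = rng.random() < pi1
    for i in range(1, n):
        p = p01 if s[i - 1] == 0 else 1.0 - p10
        s[i] = rng.random() < p
    return s

rng = np.random.default_rng(1)
s = markov_chain_support(64, rng=rng)
x = s * rng.standard_normal(64)          # Gaussian amplitudes on the active support
print("support:", "".join(map(str, s)))
```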

Here is a paper that reminded me of the presentation of David Field at MIA'09 (specifically this paper: Chandler DM, and Field DJ. (2007). "Estimates of the Information Content and Dimensionality of Natural Scenes From Proximity Distributions." Journal of the Optical Society of America A, Vol. 24, Issue 4, pp. 922-941. [PDF]): Optimized intrinsic dimension estimation using nearest neighbor graphs by Kumar Sricharan, Raviv Raich, Alfred O. Hero III. The abstract reads:
We develop an approach to intrinsic dimension estimation based on k-nearest neighbor (kNN) distances. The dimension estimator is derived using a general theory on functionals of kNN density estimates. This enables us to predict the performance of the dimension estimation algorithm. In addition, it allows for optimization of free parameters in the algorithm. We validate our theory through simulations and compare our estimator to previous kNN based dimensionality estimation approaches.
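As a point of reference, here is a short Python sketch of one well-known kNN-based dimension estimator of the same family, the Levina-Bickel maximum-likelihood estimator; it is not the optimized estimator proposed in the paper, just a simple baseline:

```python
import numpy as np
from scipy.spatial import cKDTree

def levina_bickel_dimension(X, k=10):
    """Levina-Bickel kNN maximum-likelihood estimate of intrinsic dimension.

    For each point, m_hat = [ (1/(k-1)) * sum_{j<k} log(T_k / T_j) ]^{-1},
    where T_j is the distance to the j-th nearest neighbor; the per-point
    estimates are then averaged.
    """
    tree = cKDTree(X)
    # query k+1 neighbors because the nearest "neighbor" of each point is itself
    dists, _ = tree.query(X, k=k + 1)
    T = dists[:, 1:]                               # drop the zero self-distance
    logs = np.log(T[:, -1][:, None] / T[:, :-1])   # log(T_k / T_j) for j < k
    m_hat = 1.0 / logs.mean(axis=1)
    return m_hat.mean()

# Sanity check: points on a 2-D plane embedded in R^10 should give roughly 2.
rng = np.random.default_rng(0)
Z = rng.standard_normal((2000, 2))
X = Z @ rng.standard_normal((2, 10))
print(levina_bickel_dimension(X, k=10))
```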

Regularization is a principled way to control model complexity, prevent overfitting, and incorporate ancillary information into the learning process. As a convex relaxation of ℓ0-norm, ℓ1-norm regularization is popular for learning in high-dimensional spaces, where a fundamental assumption is the sparsity of model parameters. However, model sparsity can be restrictive and not necessarily the most appropriate assumption in many problem domains. In this paper, we relax the sparsity assumption to compressibility and propose learning compressible models: a compression operation can be included into ℓ1-regularization and thus model parameters are compressed before being penalized. We concentrate on the design of different model compression transforms, which can encode various assumptions on model parameters, e.g., local smoothness, frequency-domain energy compaction, and correlation. Use of a compression transform inside the ℓ1 penalty term provides an opportunity to include information from domain knowledge, coding theories, unlabeled data, etc. We conduct extensive experiments on brain-computer interface, handwritten character recognition, and text classification. Empirical results show significant improvements in prediction performance by learning compressible models instead of sparse models. We also analyze the model fitting and learned model coefficients under different compressibility assumptions, which demonstrate the advantages of learning compressible models instead of sparse models.
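Here is a minimal sketch of the general idea under my own simplifying assumptions, not the authors' setup: penalize the ℓ1 norm of an orthonormal transform of the weight vector (a DCT, encoding frequency-domain energy compaction) and solve by proximal gradient (ISTA). Because the transform is orthonormal, the proximal step is simply soft-thresholding in the transform domain:

```python
import numpy as np
from scipy.fft import dct, idct

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_compressible(X, y, lam=0.1, step=None, iters=500):
    """Least squares with an l1 penalty on the DCT of the weights:
        min_w 0.5 * ||X w - y||^2 + lam * ||DCT(w)||_1
    Since the orthonormal DCT preserves the l2 norm, the proximal step is
    soft-thresholding in the DCT domain followed by an inverse DCT.
    """
    n, d = X.shape
    if step is None:
        step = 1.0 / np.linalg.norm(X, 2) ** 2       # 1 / Lipschitz constant of the gradient
    w = np.zeros(d)
    for _ in range(iters):
        g = X.T @ (X @ w - y)                        # gradient of the smooth part
        z = dct(w - step * g, norm='ortho')          # move to the transform domain
        w = idct(soft(z, step * lam), norm='ortho')  # prox, then back to weight space
    return w

# Toy usage: a smooth weight vector, hence compressible under the DCT.
rng = np.random.default_rng(0)
d, n = 128, 200
w_true = np.cos(np.linspace(0, 3 * np.pi, d))
X = rng.standard_normal((n, d))
y = X @ w_true + 0.1 * rng.standard_normal(n)
w_hat = ista_compressible(X, y, lam=0.5)
print("relative error:", np.linalg.norm(w_hat - w_true) / np.linalg.norm(w_true))
```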

Compressed Detection via Manifold Learning by Hyun Jeong Cho, Kuang-Hung Liu, Jae Young Park. The introduction of the paper reads:
In many imaging applications such as Computed Tomography (CT) in medical imaging and Synthetic Aperture Radar (SAR) imaging, the collected raw data of the receiver or detector, residing in R^M, can be modeled as data lying in the Fourier space of the target reflectivity function. The magnitude of the reflectivity is considered as the image of the target, and we are often interested in detecting specific features in the image, e.g. a tumor in medical imaging or military weapons in SAR imaging. A natural way to achieve this goal is to form an image in R^N, where N > M, and detect the feature in the spatial domain. A recently developed theory in [1] states that under certain conditions, a random linear projection Φ : R^N → R^L guarantees, with high probability, that all pairwise Euclidean and geodesic distances between points on a manifold M ⊂ R^N are well preserved under the mapping. This made us wonder whether the geodesic distances of the original images, in R^N, and of the corresponding raw data, in R^M, in SAR imaging were also reasonably preserved. Motivated by satisfactory results of simple tests to check this, which will be discussed in more detail later, we tried to detect features directly from the lower dimensional, or compressed, SAR raw data, without involving any image reconstruction step. In this report, manifold learning techniques are applied to reduce the dimension of the raw data, followed by a detection step to achieve our goal. The theoretical framework will be discussed later in more detail. To the best of our knowledge, there has not yet been a successful image detection or classification algorithm that takes advantage of this framework and works directly in the domain where the raw data resides. Since existing algorithms work in the spatial domain, they first have to transform the raw data to the spatial domain. This transformation step can be time consuming and, in general, results in a loss of information. Also, working with M-dimensional data is computationally less demanding. Thus, it is interesting to look at the problem of working directly on the raw data. The rest of the report is structured as follows. First, technical background such as the theoretical framework mentioned above, Laplacian Eigenmaps as a method of dimensionality reduction, and a semi-supervised learning algorithm will be discussed. Then, experimental results will be presented. Finally, we conclude the report with a discussion of the challenges and future work.
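Since the report relies on Laplacian Eigenmaps for the dimensionality-reduction step, here is a bare-bones Python sketch of that technique on a generic point cloud; it is my own toy version, not the authors' pipeline or their SAR data, and the neighborhood size and kernel width are arbitrary:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import eigh

def laplacian_eigenmaps(X, n_components=2, k=10, sigma=1.0):
    """Bare-bones Laplacian Eigenmaps embedding of the rows of X.

    Builds a symmetrized kNN graph with Gaussian heat-kernel weights,
    forms the graph Laplacian L = D - W, and returns the generalized
    eigenvectors of L v = lambda D v for the smallest nonzero eigenvalues.
    """
    D2 = squareform(pdist(X, 'sqeuclidean'))
    W = np.exp(-D2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # keep only each point's k nearest neighbors, then symmetrize
    idx = np.argsort(D2, axis=1)[:, 1:k + 1]
    mask = np.zeros_like(W, dtype=bool)
    rows = np.repeat(np.arange(len(X)), k)
    mask[rows, idx.ravel()] = True
    W = W * np.maximum(mask, mask.T)
    D = np.diag(W.sum(axis=1))
    L = D - W
    vals, vecs = eigh(L, D)                 # generalized eigenproblem, ascending eigenvalues
    return vecs[:, 1:n_components + 1]      # skip the constant eigenvector at eigenvalue ~0

# Toy usage on random high-dimensional data standing in for raw measurements.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 50))
Y = laplacian_eigenmaps(X, n_components=2, k=8)
print(Y.shape)  # (300, 2)
```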



Image Credit: NASA/JPL/Space Science Institute, Titan as seen on March 16, 2010 from Cassini.
