Friday, October 31, 2008

CS: Adaptive Sampling: Sequential adaptive compressed sampling via Huffman codes, Compressive Structured Light for Participating Media

The question of adaptive sampling comes back often even though the initial intent of compressed sensing is to have a linear acquisition process. In effect, if one is to nonlinearly acquire signals, how is that different from JPEG and more specialized compression techniques? Well, for one, if we had to wait for a standard like JPEG to emerge for high-dimensional signals, we'd have to wait for a long, no make that a looooooong, time. Hence it becomes obvious that one should look at CS as a way to perform a nonlinear compression scheme in fields where the amount of (economic?) interest is very restricted. Here is the latest entry on this matter:

Sequential adaptive compressed sampling via Huffman codes by Akram Aldroubi, Haichao Wang, Kourosh Zarringhalam. The abstract reads:
There are two main approaches in compressed sensing: the geometric approach and the combinatorial approach. In this paper we introduce an information theoretic approach and use results from the theory of Huffman codes to construct a sequence of binary sampling vectors to determine a sparse signal. Unlike other approaches, our approach is adaptive in the sense that each sampling vector depends on the previous sample. The number of measurements we need for a k-sparse vector in n-dimensional space is no more than O(k log n) and the reconstruction is O(k).
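To see what "adaptive" buys you, here is a toy sketch (my own illustration, not the authors' Huffman construction) that locates a 1-sparse signal with binary masks, each chosen from the outcome of the previous measurement:

```python
import numpy as np

def locate_one_spike(x):
    """Adaptively locate the single nonzero entry of x using binary
    mask measurements <m, x>; each mask depends on the previous answer."""
    lo, hi = 0, len(x)
    measurements = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        mask = np.zeros(len(x))
        mask[lo:mid] = 1.0            # sample the left half of the current interval
        measurements += 1
        if abs(mask @ x) > 1e-12:     # energy in the left half -> recurse left
            hi = mid
        else:
            lo = mid
    return lo, measurements

x = np.zeros(1024)
x[417] = 3.0
idx, m = locate_one_spike(x)
print(idx, m)   # 417 located with log2(1024) = 10 adaptive measurements
```

A nonadaptive scheme cannot narrow its questions this way, which is the intuition behind the O(k log n) adaptive bound.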

The O(k log n) result is very interesting, but I am a little bothered by the "unlike other approaches...", since "unlike most other approaches..." would have sufficed. We have seen adaptive measurements before (as listed in the Big Picture page):

I have mentioned them before, but let me do so again:

Let us note the following in the page:

In our experiment, we use the vertical binary stripe patterns as the coded light pattern. Each frame has 128 stripes. The stripes are randomly assigned to be 0 (black) or 1 (white) according to Bernoulli distribution (with p=0.5). We also tried Hadamard codes and found that the Bernoulli random code is better.
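Generating such patterns is a one-liner; this sketch reproduces the setup described in the quote (the frame count is my own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_stripes = 24, 128   # 24 frames is an arbitrary choice
# Each frame: 128 vertical stripes, each independently black (0) or
# white (1) with probability p = 0.5, as in the quoted experiment
patterns = rng.integers(0, 2, size=(n_frames, n_stripes))
print(patterns.shape, round(patterns.mean(), 2))  # fraction of white stripes ≈ 0.5
```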

I am sure they will also be able to use adaptive measurements at some point in time.

Image Credit: NASA/JPL/Space Science Institute. Have you ever seen a star shining through Saturn's rings? W00050600.jpg was taken on October 28, 2008 and received on Earth on October 29, 2008.

Thursday, October 30, 2008

CS: Multiplexing in Interferometry - Diophantine Optics

I mentioned the work of Daniel Rouan before. His main problem is to acquire some signal in some analog fashion, multiplex it, and then remove some pattern from it (within the analog process). He cannot digitize the signal because doing so would require technology that either does not exist yet or is at a very low TRL.

As you might have guessed, Daniel Rouan does interferometry. He is interested in exoplanet detection through the removal of light coming from a star while observing/keeping the much lower (by 7 to 10 orders of magnitude) brightness of a nearby planet. He does this by devising nulling interferometers (nulling because they cancel out the main star) that implement a Prouhet-Tarry-Escott [2] (PTE) series in hardware, so that the analog signal uses the cancellation property of that series, as explained in this Diophantine Optics explanation:
Each of the beams of light interacts with a paved lattice. The two paved lattices each implement one side of the PTE series, thereby producing beams that are then added together, producing the nulling effect.
Different kinds of pavement can fulfill this condition, so additional constraints are added to avoid shadowing between tiles.... An example of a working tiling is shown below.
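The cancellation property at work here is easy to check numerically. The Thue-Morse split of {0,...,7} is a classic PTE pair (a generic illustration of the series, not Rouan's actual tiling):

```python
# Thue-Morse split of {0,...,7}: "evil" numbers have an even number of
# 1-bits, "odious" numbers an odd number. This yields a PTE pair of degree 2.
evil   = [k for k in range(8) if bin(k).count("1") % 2 == 0]   # [0, 3, 5, 6]
odious = [k for k in range(8) if bin(k).count("1") % 2 == 1]   # [1, 2, 4, 7]
for d in (1, 2, 3):
    print(d, sum(k**d for k in evil), sum(k**d for k in odious))
# power sums match for degrees 1 and 2 (14 = 14, 70 = 70) but not for 3
```

It is this matching of power sums that the two paved lattices exploit to cancel the starlight to high order.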

There are other examples of diophantine optics applied to interferometry here. Back in 2004, when I first mentioned his work, he was mostly presenting the algebra of ultra-suppressing interferometers:

The last tiling cannot but remind some of you of similar tilings found in CS. The PTE series may well be very appropriate for nulling a shining star, but one wonders whether, using other kinds of tilings and CS reconstruction techniques, one could not obtain other types of results, i.e. not just nulling effects.


Wednesday, October 29, 2008

CS: Comment on non-uniform sampling, Digital Rain, a post-doc, a thesis

Most of today's entries were found on the interwebs:

....(2) According to Holger Rauhut: "The ideas of compressed sensing are not completely new. There are actually many precursors. For instance, the use of ℓ1 minimization seems to be first mentioned in the Ph.D. thesis of Logan in 1965. ℓ1 minimization was further used by geophysicists starting in the 1970s. In statistics, the field of variable selection has introduced ℓ1 minimization (called LASSO) and greedy algorithms (stepwise, forward regression, and projection pursuit regression, etc.) in the 1980s and 1990s. The new contribution of compressed sensing consists in the type of applications, and in rigorous (and often fascinating) proofs of recovery results using new concepts and connections to other fields of mathematics." The paper by [17] is, in a sense, compressed sensing for nonlinear transformations....
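For readers who have never seen ℓ1 minimization in action, here is a minimal sketch of the LASSO-style problem solved by iterative soft-thresholding (ISTA); a generic illustration, with all sizes and parameters chosen arbitrarily:

```python
import numpy as np

def ista(A, y, lam=0.1, iters=500):
    """Iterative soft-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = x - (A.T @ (A @ x - y)) / L    # gradient step on the quadratic term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # shrinkage
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 100)) / np.sqrt(40)   # 40 measurements, dimension 100
x_true = np.zeros(100)
x_true[[7, 33, 71]] = [2.0, -1.5, 1.0]             # 3-sparse signal
x_hat = ista(A, A @ x_true, lam=0.02)
print(np.sort(np.argsort(np.abs(x_hat))[-3:]))     # largest entries sit on the true support
```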

...[17] F. Marvasti and A. K. Jain, Zero crossings, bandwidth compression, and restoration of nonlinearly distorted band-limited signals, Journal of Opt. Soc. of America, 3(5), 651-654, 1986....
There is a new field where CS is being used as featured in One Video Stream to Serve Diverse Receivers by Szymon Chachulski, Dina Katabi and Grace Woo. The abstract reads:
The fundamental problem of wireless video multicast is to scalably serve multiple receivers which may have very different channel characteristics. Ideally, one would like to broadcast a single stream that allows each receiver to benefit from all correctly received bits to improve its video quality. We introduce Digital Rain, a new approach to wireless video multicast that adapts to channel characteristics without any need for receiver feedback or variable codec rates. Users that capture more packets or have fewer bit errors naturally see higher video quality. Digital Rain departs from current approaches in two ways: 1) It allows a receiver to exploit video packets that may contain bit errors; 2) It builds on the theory of compressed sensing to develop robust video encoding and decoding algorithms that degrade smoothly with bit errors and packet loss. Implementation results from an indoor wireless testbed show that Digital Rain significantly improves the received video quality and the number of supported receivers.

A post-doc position has just been listed in the CSjob section. It is an FP6/Marie Curie Fellowship for a post-doctoral fellow in Sensor Networks & Compressive Sensing at the Institute of Computer Science (ICS), Foundation for Research and Technology-Hellas (FORTH) in Heraklion, Crete, Greece. The position focuses on information theoretic aspects of sensor networks and Bayesian compressive sensing. The ideal candidate is expected to have expertise in information theory, signal processing, compressive sensing, pattern recognition and data fusion, and statistical communications, and preferably experience in engineering applications concerning the structure, function, and organisation of WSN systems. Job starting date: 01/01/2009. Application deadline: 01/12/2008.

I also found this, but did not find the actual document: "Análisis y reconocimiento de patrones en electroforesis capilar utilizando compressed sensing" ("Analysis and pattern recognition in capillary electrophoresis using compressed sensing") by Alvaro Hernández Orence and José Luis Paredes Quintero at Universidad de Los Andes in Venezuela. CS in electrophoresis, I'd like to see more of that.

There is a call for a Proceedings of the IEEE Special Issue on: Applications of Sparse Representation and Compressive Sensing. But it should be noted that:

Tuesday, October 28, 2008

CS: Seismic CS, CS audio and image tampering, Exploiting structure in wavelet-based Bayesian CS, Practical near-optimal sparse recovery

Here is a new batch from the Rice repository. I am listing only those that I have not covered before:

Efficient Seismic Forward Modeling using Simultaneous Random Sources and Sparsity by Ramesh Neelamani, C. Krohn, J. Krebs, M. Deffenbaugh, J. E. Anderson and Justin Romberg. The abstract reads:

This paper proposes an approach to speed up seismic forward modeling when the Green's function of a given model is structured in the sense that the Green's function has a sparse representation in some known transform domain. The first step of our approach runs a forward finite-difference (FD) simulation for a duration longer than each conventional run, but with all sources activated simultaneously using different long random noise waveforms. The cumulative responses to the simultaneous sources are measured at all receivers. The second step separates the interfering Green's functions from the receiver measurements by exploiting prior knowledge of the random waveforms and the sparsity of the Green's function in a suitable domain. Simulation results demonstrate such a simultaneous source approach is indeed promising.
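The two-step idea is simple enough to caricature in a few lines. The sketch below replaces the paper's sparsity-promoting separation with plain cross-correlation against the known random waveforms, which already pulls the sparse Green's functions apart when the records are long; everything here is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_src, L = 4000, 3, 50
# Sparse "Green's functions": two spikes per source within L samples
greens = np.zeros((n_src, L))
for s in range(n_src):
    greens[s, rng.choice(L, 2, replace=False)] = rng.standard_normal(2)
# Step 1: all sources fire simultaneously with long random noise waveforms
noises = rng.standard_normal((n_src, T))
record = sum(np.convolve(noises[s], greens[s])[:T] for s in range(n_src))
# Step 2: separate the interfering responses using the known waveforms
est = np.array([np.correlate(record, noises[s], 'full')[T - 1:T - 1 + L] / T
                for s in range(n_src)])
corr = np.corrcoef(est.ravel(), greens.ravel())[0, 1]
print(np.round(corr, 2))  # close to 1: the sparse responses are pulled apart
```

The longer the simultaneous run, the better the cross-terms average out, which is the intuition the paper then sharpens with a sparsity prior.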

We mentioned them before here:

Identification of sparse audio tampering using distributed source coding and compressive sensing techniques by Giorgio Prandi, Giuseppe Valenzise, Marco Tagliasacchi and Augusto Sarti [See also related conference publication: DAFX 2008]. The abstract reads:

The increasing development of peer-to-peer networks for delivering and sharing multimedia files poses the problem of how to protect these contents from unauthorized and, possibly, malicious manipulations. In the past few years, a large amount of techniques, including multimedia hashes and digital watermarking, have been proposed to identify whether a multimedia content has been illegally tampered or not. Nevertheless, very few efforts have been devoted to identifying which kind of attack has been carried out, with the aim of assessing whether the modified content is still meaningful for the final user and, hopefully, of recovering the original content semantics. One of the main issues that have prevented multimedia hashes from being used for tampering identification is the large amount of data required for this task. Generally the size of the hash should be kept as small as possible to reduce the bandwidth overhead. To overcome this limitation, we propose a novel hashing scheme which exploits the paradigms of compressive sensing and distributed source coding to generate a compact hash signature, and apply it to the case of audio content protection. The audio content provider produces a small hash signature by computing a limited number of random projections of a perceptual, time-frequency representation of the original audio stream; the audio hash is given by the syndrome bits of an LDPC code applied to the projections. At the content user side, the hash is decoded using distributed source coding tools, provided that the distortion introduced by tampering is not too high. If the tampering is sparsifiable or compressible in some orthonormal basis or redundant dictionary (e.g. DCT or wavelet), it is possible to identify the time-frequency position of the attack, with a hash size as small as 200 bits/second: the bit saving obtained by introducing distributed source coding ranges between 20% to 70%.
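Stripped of the LDPC/syndrome machinery, the core of the hash is a handful of random projections of a time-frequency representation. A hypothetical sketch (my own frame sizes and projection counts, with no distributed source coding):

```python
import numpy as np

rng = np.random.default_rng(3)

def tf_features(x, frame=256):
    """Crude time-frequency representation: magnitude FFT of short frames."""
    frames = x[:len(x) // frame * frame].reshape(-1, frame)
    return np.abs(np.fft.rfft(frames, axis=1)).ravel()

audio = rng.standard_normal(256 * 64)          # stand-in for an audio stream
feat = tf_features(audio)
P = rng.standard_normal((200, feat.size)) / np.sqrt(feat.size)
hash_orig = np.sign(P @ feat)                  # compact 200-bit signature

tampered = audio.copy()
tampered[5000:5500] += 2.0 * rng.standard_normal(500)   # localized attack
hash_recv = np.sign(P @ tf_features(tampered))
print(int(np.sum(hash_orig != hash_recv)))     # Hamming distance between signatures
```

The actual scheme transmits only the syndrome bits of an LDPC code applied to the projections, which is what keeps the rate near 200 bits/second.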

Hash-based identification of sparse image tampering by Giorgio Prandi, Giuseppe Valenzise, Marco Tagliasacchi and Augusto Sarti [See also related conference publication: ICIP 2008]. The abstract reads:

In the last decade, the increased possibility to produce, edit and disseminate multimedia contents has not been adequately balanced by similar advances in protecting these contents from unauthorized diffusion of forged copies. When the goal is to detect whether or not a digital content has been tampered with in order to alter its semantics, the use of multimedia hashes turns out to be an effective solution to offer proof of legitimacy and to eventually identify the introduced tampering. We propose an image hashing algorithm based on compressive sensing principles, which solves both the authentication and the tampering identification problems. The original content producer generates a hash using a small bit budget by quantizing a limited number of random projections of the authentic image. The content user receives the (possibly altered) image, and uses the hash to estimate the mean square error distortion between the original and the received image. In addition, if the introduced tampering is sparse in some orthonormal basis or redundant dictionary, an approximation is given in the pixel domain. We emphasize that the hash is universal, e.g. the same hash signature can be used to detect and identify different types of tampering. At the cost of additional complexity at the decoder, the proposed algorithm is robust to moderate content-preserving transformations including cropping, scaling and rotation. In addition, in order to keep the size of the hash small, hash encoding/decoding takes advantage of distributed source codes.

There is also Exploiting structure in wavelet-based Bayesian compressed sensing by Lihan He and Lawrence Carin. The abstract reads:

Bayesian compressive sensing (CS) is considered for signals and images that are sparse in a wavelet basis. The statistical structure of the wavelet coefficients is exploited explicitly in the proposed model, and therefore this framework goes beyond simply assuming that the data are compressible in a wavelet basis. The structure exploited within the wavelet coefficients is consistent with that used in wavelet based compression algorithms. A hierarchical Bayesian model is constituted, with efficient inference via Markov chain Monte Carlo (MCMC) sampling. The algorithm is fully developed and demonstrated using several natural images, with performance comparisons to many state-of-the-art compressive-sensing inversion algorithms.
As far as I can tell, the TSW-CS algorithm is not available yet.

Finally, I mentioned their wiki but had forgotten to mention one of the papers on which the wiki is based: Practical near-optimal sparse recovery in the ell-1 norm by Radu Berinde, Piotr Indyk and Milan Ruzic. The abstract reads:

We consider the approximate sparse recovery problem, where the goal is to (approximately) recover a high dimensional vector x ∈ Rn from its lower-dimensional sketch Ax ∈ Rm. Specifically, we focus on the sparse recovery problem in the ℓ1 norm: for a parameter k, given the sketch Ax, compute an approximation ˆx of x such that the ℓ1 approximation error ||x − ˆx||_1 is close to min_x′ ||x − x′||_1, where x′ ranges over all vectors with at most k terms. The sparse recovery problem has been subject to extensive research over the last few years. Many solutions to this problem have been discovered, achieving different trade-offs between various attributes, such as the sketch length, encoding and recovery times. A recent paper [IR08] provided a sparse recovery scheme which achieved close to optimal performance on virtually all attributes (see Figure 1). In particular, this was the first recovery scheme that guaranteed O(k log(n/k)) sketch length, and near linear O(n log(n/k)) recovery time simultaneously. This was achieved by using sketching matrices A which were themselves very sparse. The matrix sparsity enabled decreasing the amount of computation spent on encoding and recovery. In this paper we present a new practical variant of that algorithm, that we call Sparse Matching Pursuit, or SMP. The running time of the new algorithm is slightly higher (by a logarithmic factor) than of its predecessor, and its sketch length bound remains unchanged. However, its empirical sketch length is substantially lower. This makes our scheme an attractive option for sparse recovery problems, both in theory and in practice.
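To give a flavor of recovery with such sparse binary sketching matrices, here is a sketch using plain orthogonal matching pursuit (not the paper's SMP algorithm) on a matrix with d ones per column; the sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, d, k = 100, 300, 10, 3
# Sparse binary sketching matrix: d ones per column, in the spirit of [IR08]
A = np.zeros((m, n))
for j in range(n):
    A[rng.choice(m, d, replace=False), j] = 1.0

x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = 3.0   # k-sparse signal
y = A @ x                                  # the low-dimensional sketch

# Greedy recovery (plain OMP here, for illustration only)
resid, support = y.copy(), []
for _ in range(k):
    support.append(int(np.argmax(np.abs(A.T @ resid))))
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    resid = y - A[:, support] @ coef
x_hat = np.zeros(n)
x_hat[support] = coef
print(np.round(np.linalg.norm(x - x_hat), 6))
```

The point of the matrix sparsity is that both the sketch y = Ax and each greedy iteration touch only d entries per column, which is where the encoding and recovery speedups come from.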

Monday, October 27, 2008

CS: Alternating l1 Method Code, Expander codes, explicit Euclidean sections, and CS, a Workshop, Finding Significant Fourier Coefs, Fast Opt. Algos

Stephane Chretien just made available his code for the Alternating l1 Method for Compressed Sensing that we mentioned earlier. The page for the code is here. Thank you, Stephane!

On September 22, Venkatesan Guruswami gave a talk entitled Expander codes, explicit Euclidean sections, and compressed sensing. The summary of the talk reads:

Classical results in high-dimensional geometry from the 1970's state that a random subspace of R^N of dimension N/2 has "distortion" O(1) w.h.p where the distortion is measured by sqrt{N} times the largest ratio of the L_2 to L_1 norms of a vector in the subspace. (So O(1) distortion means each vector in the subspace has its mass spread nearly evenly over a constant fraction of the coordinates.) This result is a particular example of the use of the probabilistic method, a technique which is ubiquitous in asymptotic convex geometry. Similar randomized geometric constructions arise in areas like dimensionality reduction and compressed sensing, but it seems that such objects are very hard to come by explicitly.

We will describe constructions inspired by expander codes that can be used to produce subspaces with distortion nearly as good as that of a random subspace, either explicitly or with sublinear randomness. Specifically, we construct an explicit subspace of R^N with dimension N(1-o(1)) and distortion (slightly worse than) poly(log N). The construction combines various ingredients such as Ramanujan graphs, sum-product theorems in finite fields, and Kerdock codes/mutually unbiased bases. We are also able to construct subspaces of R^N of proportional dimension with O(1) distortion using N^delta random bits for any constant delta > 0 (and hence deterministically in sub-exponential time). The talk will also briefly discuss connections to sparse signal recovery (i.e., compressed sensing).

[Based on joint works with James Lee, Alexander Razborov, and Avi Wigderson]
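The distortion in question can be probed numerically. The Monte-Carlo sketch below only samples the subspace, so it gives a lower bound on the true distortion, but it illustrates the O(1) behavior for a random subspace:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 512
# Orthonormal basis of a random N/2-dimensional subspace of R^N
B, _ = np.linalg.qr(rng.standard_normal((N, N // 2)))
# Monte-Carlo lower bound on distortion = sqrt(N) * max ||x||_2 / ||x||_1
samples = B @ rng.standard_normal((N // 2, 2000))
ratios = np.sqrt(N) * (np.linalg.norm(samples, 2, axis=0)
                       / np.linalg.norm(samples, 1, axis=0))
print(np.round(ratios.max(), 2))  # O(1), as the talk states for random subspaces
```

By Cauchy-Schwarz the ratio is always at least 1; a subspace containing a sparse vector would push it up toward sqrt(N), which is exactly what low-distortion subspaces rule out.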
The Center for Computational Intractability will organize a Workshop on Geometry and Algorithms on October 29-31 at the Friend Center 06 at Princeton. The program reads:

Wednesday, Oct 29

9:00-10:10 Piotr Indyk, Survey on Compressed Sensing
10:10-10:50 break
10:50-11:25 Stephen Smale, Understanding Patterns in Data
11:25-12:00 Santosh Vempala, Isotropic Principal Components

12:00-2:00 lunch

2:00-2:35 James Lee, On the geometry of graphs and L_1 embeddings
2:35-3:10 Joel Tropp, Column subset selection, matrix factorization, and eigenvalue optimization
3:10-3:45 Assaf Naor, TBA
3:45-4:25 break
4:25-5:00 Yury Makarychev, Lower bounds for Sherali Adams via local-global metrics
5:00-5:35 Robi Krauthgamer, Metric Embeddings As Computational Primitives

Thursday, Oct 30

9:00-9:35 Guy Kindler, Can cubic tiles be sphere-like?
9:35-10:10 Leonard Schulman, Contraction and Expansion of Convex Sets
10:10-10:50 break
10:50-11:25 Venkat Guruswami, Explicit Euclidean sections from expander codes
11:25-12:00 Ben Recht, Exact Low-rank Matrix Completion via Convex Optimization

12:00-2:00 lunch

2:00-2:35 Partha Niyogi, Manifold Learning: A geometric perspective on learning theory and algorithms
2:35-3:10 Sanjoy Dasgupta, Open problems in learning low dimensional representations
3:10-3:45 Anna Gilbert, Applications of Compressed Sensing to Biology
3:45-4:25 break
4:25-5:35 Rump session: open problems, brief announcements, etc

6:15 Banquet at Palmer House

Friday, Oct 31

9:00-9:35 Yuval Rabani, Explicit construction of a small epsilon-net for linear threshold functions
9:35-10:10 Hamed Hatami, Graph norms and Sidorenko’s conjecture
10:10-10:40 break
10:40-11:00 Navin Goyal, Learning Convex Bodies is Hard (short talk)
11:00-11:35 Gideon Schechtman, TBA
11:35-12:10 Ilan Newman, Online embedding of metrics

12:10-1:10 lunch

1:10-1:45 Prasad Raghavendra, TBA
1:45-2:20 Luis Rademacher, Expanders via Random Spanning Trees
2:20-2:55 Moritz Hardt, Rounding Parallel Repetitions of Unique Games
2:55-3:30 Alexandr Andoni, Hardness of Nearest Neighbor under L_infinity

Adi Akavia recently presented a talk at DIMACS entitled Locally & Universally Finding Significant Fourier Coefficients. The summary of the talk reads:

Computing the Fourier transform is a basic building block used in numerous applications. For data intensive applications, even the O(N log N) running time of the Fast Fourier Transform (FFT) algorithm may be too slow, and sub-linear running time is necessary. Clearly, outputting the entire Fourier transform in sub-linear time is infeasible; nevertheless, in many applications it suffices to find only the significant Fourier coefficients, that is, the Fourier coefficients whose magnitude is at least a given fraction (say, 1%) of the sum of squared Fourier coefficients.

In this paper we present an algorithm that finds the significant Fourier coefficients of functions f over any finite abelian group G, which is:
1. Local: the running time and the number of function entries read by the algorithm are polynomial in log |G|, 1/τ and L1(f) (where L1(f) denotes the sum of absolute values of the Fourier coefficients of f).
2. Universal: the same fixed set of entries is read for all functions over the same domain with the same upper bound on the sum of their Fourier coefficients.
3. Robust to noise: the algorithm finds the significant Fourier coefficients of f even if the function entries it receives are corrupted by random noise. Furthermore, we present extensions of this algorithm that handle adversarial noise in running time sub-linear in the domain size.
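As a point of comparison, the naive non-local, non-universal baseline is a full FFT followed by thresholding at a fraction τ of the total energy; the algorithm in the talk produces the same output while reading only polylogarithmically many entries. A quick sketch of that baseline:

```python
import numpy as np

rng = np.random.default_rng(6)
N, tau = 2**12, 0.01
# Signal over Z_N with two dominant frequencies plus small noise
t = np.arange(N)
f = (3 * np.exp(2j * np.pi * 57 * t / N)
     + 2 * np.exp(2j * np.pi * 1000 * t / N)
     + 0.1 * rng.standard_normal(N))
# Naive O(N log N) baseline: full FFT, then threshold at a tau fraction
# of the total squared-coefficient energy
F = np.fft.fft(f) / N
energy = np.sum(np.abs(F) ** 2)
significant = np.flatnonzero(np.abs(F) ** 2 >= tau * energy)
print(significant)  # the two dominant frequencies, 57 and 1000
```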

Our algorithm improves on the prior universal algorithms in: (1) handling functions over any finite abelian group, (2) being robust to noise, and (3) achieving better running time in terms of L_1(f).

We present applications of our algorithm to compressed sensing, to proving cryptographic bit security results, and to efficiently decoding polynomial rate variants of Homomorphism codes, MPC codes and other exponential rate codes.

Finally, Pierre Weiss defended his dissertation on October 21. The title of his dissertation is: "Algorithmes rapides d'optimisation convexe. Applications à la restauration d'images et à la détection de changements" or "Fast Convex Optimization Algorithms: Application to Image Restoration and Change Detection". The abstract of the dissertation reads:

This PhD contains contributions in numerical analysis and in computer vision. The talk will be divided in two parts.

In the first part, we will focus on the fast resolution, using first order methods, of convex optimization problems. Those problems appear naturally in many image processing tasks like image reconstruction, compressed sensing or texture+cartoon decompositions. They are generally non differentiable or ill-conditioned. We show that they can be solved very efficiently using fine properties of the functions to be minimized. We analyze in a systematic way their convergence rate using recent results due to Y. Nesterov. To our knowledge, the proposed methods correspond to the state of the art of the first order methods.

In the second part, we will focus on the problem of change detection between two remotely sensed images taken from the same location at two different times. One of the main difficulties in solving this problem is the difference in illumination conditions between the two shots. This led us to study level line illumination invariance. We completely characterize the 3D scenes which produce invariant level lines. We show that they correspond quite well to urban scenes. Then we propose a variational framework and a simple change detection algorithm which gives satisfying results both on synthetic OpenGL scenes and real Quickbird images.

Image Credit: NASA/JPL/Space Science Institute, Saturn's rings taken 4 days ago.

Sunday, October 26, 2008

Trust But Verify

Two weeks ago, I was talking about the issue that in no other engineering field would one have been allowed to go on for so long without some sort of risk mitigation strategy. As it stands, it looks like there was none to speak of:

Some of the real engineers who design weapons of mass destruction (the term used by Warren Buffett about derivatives) do so knowing that they deal with extremely dangerous elements and specifically understand what chain reactions are. And when several actors are in possession of them, they generally enter into some type of risk mitigation process called treaties. Even treaties are not left to their own devices. The keyword here is: Trust But Verify. In fact, much sophistication goes into the verify part. It looks as though most actors of the financial system have not gotten to that stage, nor do they even have an understanding of the underlying physics. The words of Amicus in the comment section of Tyler Cowen's entry reinforce my view on that matter (italics and bold are mine):

While the run-up in sub-prime lending itself ($700 billion) was not desperately remarkable, it seems, he discounted the exponential growth in derivatives, many related to mortgage issuance, because the early 1990s warnings over derivatives expansion ... had proved false, to date. Sadly, making that generalization was a mistake.

His key ideological flavor, that loan officers know the risks better than any regulator ever could do, was ... interesting, but off the mark, for the current crisis. It's possible that the risks to the system, which no one party could judged themselves, arguably (knowing their part only), multiplied, rather than getting spread around.

For instance, Goldman was "into" AIG for a reported, eye-popping $20 billion (they deny that much). Did they know how much everyone else was into AIG? Only a regulator could know, or identify the weakest link in a chain, maybe. Whatever the case, by the time it came to shut down AIG, because of old-fahsioned mis-management, their counterparties appear not to have been ... fully apprised or reserved.
I am very much afraid that this crisis will put the wrong emphasis on having less financial engineering and more of the usual empirical laissez-faire in an already dramatic situation when, in fact, we really need smart folks on the regulation side of things. Understanding derivatives may be interesting, but the greater challenge lies in unmasking most of the unknowns (knowing what you don't know; where's the data, Mr. Paulson?).

Muthu Muthukrishnan sees a reverse-auction algorithmic need in the current crisis. But the question, in my view, should be how we can algorithmically restore trust between the different actors and develop better algorithms for the regulators. For instance, the regulators need to understand some information from many different actors without each actor being able to reconstruct the full picture if they, by accident, gain access to their competitors' information. From this data, the State or the Feds can give assurances and funnel money to the right entities, restoring some partial trust while not giving too much information to the different actors as the situation unfolds. In a sense, the study by Yaron Rachlin and Dror Baron on The Secrecy of Compressive Sensing Measurements provides some type of assurance that the actors would have a tough time decoding each other's information. If you think this is just talk, ask Volvo about trust, because they need some [Update: The story about Volvo has changed. Demand for Volvo trucks seems to have fallen by 50% in Q3; see the comments in this entry].

But what do I know, I am just a nuclear engineer who has never dealt with WMDs, except for disposing of them. Now that I think about it, even in the disposition process, trust was never a given and we always had to resort to specific technical means of enforcing that trust, as should the readers of Marie Claire....

Credit: Graph taken from Greg Mankiw's blog and Marie Claire: On reflection, perhaps not from the Photoshopdisasters blog.

Saturday, October 25, 2008

Saturday Morning Cartoon: Explaining Compressed Sensing, Episode 1

[Update: For non-English speakers, it might be hard to follow some of the rapid exchanges. The scenario can be found here.]

I am continuing this small experiment of featuring two cartoonish characters talking about compressed sensing. The result? ... I am not quitting my day job :-)

The scenario is based on different questions I generally get asked when grilled on the subject. If you have a friend, a colleague or a student who wants to know more about the subject but does not have time to read a paper, this might be a way of initiating their interest.

I am looking for ideas for the next one. If there are questions that you have been asked several times and that you have gotten better at answering, send me your scenario; I'll try to render it and will put it in the Saturday Morning Cartoon Series at:

In the meantime, here is this week's product.

Is it any good?

Friday, October 24, 2008

CS: Non-Negative Matrix Factorization, Convexity and Isometry, Non-local Regularization of Inverse Problems, ETVC '08

Some of you will recall the Non-Negative Matrix Factorization (NMF) as one of the techniques that could help in building dictionaries from training datasets: Gabriel Peyre, for instance, uses it to perform decompositions of textures, while Bob Plemmons applies it to hyperspectral detection of space junk (explained here). These dictionaries are then used to reconstruct signals in Compressed Sensing, provided these signals have been acquired incoherently with respect to said dictionaries. The NMF algorithm was developed starting with the work featured in the Nature article of Daniel Lee and Sebastian Seung entitled Learning the parts of objects by non-negative matrix factorization, and yet it never seemed to "look like" the other algorithms since, as Gabriel Peyre points out:

The problem of learning D and X requires to minimize the reconstruction error ||Y − DX|| subject to the constraints D,X \ge 0. This minimization is not convex on both D and X, but following [22], an alternate minimization on each matrix can be carried using iterations of the following two steps:....An explicit sparsity prior can be added into this process [23] to enforce the constraints.... required by equation (1). In practice, this did not result in a noticeable enhancement for texture modeling.
i.e. the problem is not convex, hence the use of specific iterative techniques is warranted, making the algorithm somewhat hard to generalize. Let us note (as we did earlier) that, by some sort of miracle and as noted by Gabriel, this technique also seems to converge toward the sparser solution. Well, a new paper entitled:
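For reference, the alternate minimization Gabriel Peyre alludes to is usually carried out with the Lee-Seung multiplicative updates, which preserve non-negativity by construction; a minimal sketch on random data:

```python
import numpy as np

rng = np.random.default_rng(7)
Y = rng.random((60, 80))            # non-negative data matrix
r = 5                               # rank of the factorization, chosen arbitrarily
D, X = rng.random((60, r)), rng.random((r, 80))
for _ in range(300):
    # Lee-Seung multiplicative updates: keep D, X >= 0 while decreasing ||Y - DX||
    X *= (D.T @ Y) / (D.T @ D @ X + 1e-12)
    D *= (Y @ X.T) / (D @ X @ X.T + 1e-12)
print(np.round(np.linalg.norm(Y - D @ X) / np.linalg.norm(Y), 3))
```

Because each update multiplies by a non-negative ratio, the iterates never leave the feasible set, but nothing in this scheme certifies a global optimum, which is exactly the gap the paper below addresses.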

Non-Negative Matrix Factorization, Convexity and Isometry by Nikolaos Vasiloglou, Alexander Gray and David Anderson features an NMF algorithm using Linear Programming techniques. This is fascinating when one knows the importance of LP in the CS reconstruction stage. The abstract reads:
In this paper we explore avenues for improving the reliability of dimensionality reduction methods such as Non-Negative Matrix Factorization (NMF) as interpretive exploratory data analysis tools. We first explore the difficulties of the optimization problem underlying NMF, showing for the first time that non-trivial NMF solutions always exist and that the optimization problem is actually convex, by using the theory of Completely Positive Factorization. We subsequently explore four novel approaches to finding globally-optimal NMF solutions using various ideas from convex optimization. We then develop a new method, isometric NMF (isoNMF), which preserves non-negativity while also providing an isometric embedding, simultaneously achieving two properties which are helpful for interpretation. Though it results in a more difficult optimization problem, we show experimentally that the resulting method is scalable and even achieves more compact spectra than standard NMF.

Contacted on the availability of isoNMF, one of the authors, Nick tells me that:
IsoNMF, along with other machine learning algorithms, will be available in the Fastlib/MLPACK library that has been pre-released; it will have its first official release by December.
I'll keep an eye on that.

Here is a fascinating paper by Gabriel Peyre, Sebastien Bougleux, Laurent Cohen entitled Non-local Regularization of Inverse Problems. The abstract reads:
This article proposes a new framework to regularize linear inverse problems using the total variation on non-local graphs. This nonlocal graph allows to adapt the penalization to the geometry of the underlying function to recover. A fast algorithm computes iteratively both the solution of the regularization process and the non-local graph adapted to this solution. We show numerical applications of this method to the resolution of image processing inverse problems such as inpainting, super-resolution and compressive sampling.

This approach is likely to be applicable beyond just image processing and for higher dimensional signals. It is fascinating for the following points:
  • instead of using the sparsity of the unknown as a regularization tool, the authors use a function based on the connections between pixels of the image. Using surrounding pixels to produce a higher dimensional space and then reducing that space is at the heart of three-dimensional cue extraction from monocular views. My point is that when using surrounding pixels, one is grabbing information about depth, i.e. a new dimension that can be inferred from a 2-D projection.
  • it seems to be an algorithm that should be intrinsically well suited for manifolds, as diffusion methods have been shown to do well in a variety of dimensionality reduction tasks (see Ronald Coifman's Diffusion Geometries and Harmonic Analysis on Data Sets).
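As a rough sketch of the kind of non-local graph involved (my own toy illustration, not the authors' algorithm), one can weight pairs of pixels by the similarity of the patches surrounding them; pixels in similar contexts end up strongly connected even when they are far apart:

```python
import numpy as np

def nonlocal_weights(img, patch=3, h=0.1):
    """w[i, j] = exp(-||P_i - P_j||^2 / h^2), where P_i is the
    (patch x patch) neighborhood of pixel i (row-major order)."""
    p = patch // 2
    padded = np.pad(img, p, mode='reflect')
    rows, cols = img.shape
    P = np.array([padded[r:r + patch, c:c + patch].ravel()
                  for r in range(rows) for c in range(cols)])
    d2 = ((P[:, None, :] - P[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / h ** 2)

img = np.zeros((8, 8))
img[:, 4:] = 1.0                 # two flat regions separated by an edge
W = nonlocal_weights(img)
print(W[0, 9] > 0.99)            # True: both pixels sit in the zero region
print(W[9, 14] < 1e-6)           # True: the two patches lie in different regions
```

In the paper's framework, weights of this kind define the graph on which a non-local total variation is computed and then iteratively re-adapted to the current solution.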

I look forward to some toolbox implementing this technique.

Valerio Cambareri, mentioned yesterday, has made his presentation available in pdf and ppt format.

And finally, Laurent Duval mentions to me that, next month, there is a conference at Ecole Polytechnique in Paris that features some people I have mentioned in this blog before. The conference is called Emerging Trends in Visual Computing (ETVC'08) and will take place on November 18th-20th. It is in the calendar. Registration is free until November 1st, 2008. The program lists the following talks of potential interest:

  • Francis BACH (INRIA-ENS/Mines)
    Machine learning and kernel methods for computer vision
  • Frederic BARBARESCO (Thales Air Systems, France)
    Applications of information geometry to radar signal processing (while this is not related to sparse sampling or CS, I mentioned him before because of his work in norm definition on a manifold)
  • Stephane MALLAT (Ecole Polytechnique, Centre de Mathematiques Appliquees, CMAP, France)
    Sparse Geometric Super-Resolution
  • Ramesh RASKAR (MIT Media Lab, USA)
    Computational Photography: Epsilon to Coded Imaging
  • Martin VETTERLI (School of Computer and Communication Sciences & Dept. Elect. Eng. and Computer Sciences, EPFL, Switzerland)
    Sparse Sampling: Variations on a Theme by Shannon
but the other talks seem equally interesting; if time permits, I might drop by.

Thursday, October 23, 2008

CS: A Thesis on Analog-to-Information Converter using Random Demodulation, a talk, Jacket GPU, Inverse problems in acoustic tomography

Following up on a previous entry, Valerio Cambareri, a reader of this blog, just received his Bachelor of Engineering in Italy. This corresponds to the end of Junior year in the U.S. and Bac + 3 in France. He produced an undergraduate thesis (at the Junior level) on Compressive Sampling. The thesis is entitled:

Additional figures can be found here. It is in Italian, but if you have been reading about CS for a while, most of it is understandable. This work is based on a previous paper, Theory and Implementation of an Analog-to-Information Converter using Random Demodulation by Jason Laska, Sami Kirolos, Marco F. Duarte, Tamer Ragheb, Richard Baraniuk and Yehia Massoud, and uses newer solvers to reconstruct the signal.

In other news:
  •  Stephane Chretien will talk about Lagrangian relaxation of the compressed sensing problem, at Boston University in the Statistics Department seminar, on November 4th, 2008 from 4 to 5 pm. The abstract of the talk reads: 
    Lagrangian relaxation is a general methodology for approximating hard combinatorial problems and has been applied with success to various instances such as the Max-Cut problem. The compressed sensing problem of recovering the sparsest solution to a system of linear equations is known to be NP-hard and is often relaxed by means of l1 minimization, i.e. a simple linear program. The goal of this talk is to present an approximate Lagrangian relaxation of the compressed sensing problem leading to a better relaxation scheme than l1 minimization. The resulting iterative algorithm is called Alternating l1. One of the standard features of the Lagrangian approach for hard combinatorial problems is to provide provably efficient approximation schemes. In the case of l1 minimization, E. Candès, T. Tao and their collaborators were able to show that with high probability, l1 minimization does in fact better in that it recovers the solution of the original sparsest recovery problem exactly. We will show that the Alternating l1 algorithm allows to recover the sparsest solution with fewer observations than l1 norm minimization. As many recent proposals for the compressed sensing problem, an additional parameter has to be tuned in order to obtain significant improvements over the plain l1 strategy. A nice feature of our approach is that this additional parameter is nothing but a Lagrange multiplier and the best value is simply the one that optimizes the dual function. We will also show how to circumvent the difficult task of computing the dual optimizer exactly by proposing a meaningful value of this parameter allowing for significant improvements in the first two steps of the method, hence avoiding fastidious empirical tuning as is usually done in the case of other methods such as the Reweighted l1 algorithm.
  • The SLIM blog has a summary entry on Compressed sensing versus sparse recovery
  • In light of the reconstruction hardware from UCLA, I found out about Jacket: a GPU engine for Matlab. You may want to download the beta version before it becomes a paid product, and 
  • Ivana Jovanovic at EPFL just released her Ph.D thesis entitled Inverse problems in acoustic tomography, where Compressed Sensing and Finite Rate of Innovation are compared. The abstract of the thesis reads:
Acoustic tomography aims at recovering the unknown parameters that describe a field of interest by studying the physical characteristics of sound propagating through the considered field. The tomographic approach is appealing in that it is non-invasive and allows to obtain a significantly larger amount of data compared to the classical one-sensor one-measurement setup. It has, however, two major drawbacks which may limit its applicability in a practical setting: the methods by which the tomographic data are acquired and then converted to the field values are computationally intensive and often ill-conditioned. This thesis specifically addresses these two shortcomings by proposing novel acoustic tomography algorithms for signal acquisition and field reconstruction. The first part of our exposition deals with some theoretical aspects of the tomographic sampling problems and associated reconstruction schemes for scalar and vector tomography. We show that the classical time-of-flight measurements are not sufficient for full vector field reconstruction. As a solution, an additional set of measurements is proposed. The main advantage of the proposed set is that it can be directly computed from acoustic measurements. It thus avoids the need for extra measuring devices. We then describe three novel reconstruction methods that are conceptually quite different. The first one is based on quadratic optimization and does not require any a priori information. The second method builds upon the notion of sparsity in order to increase the reconstruction accuracy when little data is available. The third approach views tomographic reconstruction as a parametric estimation problem and solves it using recent sampling results on non-bandlimited signals. The proposed methods are compared and their respective advantages are outlined. 
The second part of our work is dedicated to the application of the proposed algorithms to three practical problems: breast cancer detection, thermal therapy monitoring, and temperature monitoring in the atmosphere. We address the problem of breast cancer detection by computing a map of sound speed in breast tissue. A noteworthy contribution of this thesis is the development of a signal processing technique that significantly reduces the artifacts that arise in very inhomogeneous and absorbent tissue. Temperature monitoring during thermal therapies is then considered. We show how some of our algorithms allow for an increased spatial resolution and propose ways to reduce the computational complexity. Finally, we demonstrate the feasibility of tomographic temperature monitoring in the atmosphere using a custom-built laboratory-scale experiment. In particular, we discuss various practical aspects
of time-of-flight measurement using cheap, off-the-shelf sensing devices.

Image Credit: NASA/JPL/Space Science Institute, W00050373.jpg was taken on October 21, 2008 and received on Earth October 22, 2008. The camera was pointing toward SATURN-RINGS at approximately 1,114,766 kilometers away.

Wednesday, October 22, 2008

CS: A Singular Value Thresholding Algorithm for Matrix Completion, Bayesian Compressive Sensing update, Compressive Sensing Hardware

This is a first for me: the three authors of a preprint have updated their own webpages the same day as the preprint's release. Jian-Feng Cai, Emmanuel Candès and Zuowei Shen just made available A Singular Value Thresholding Algorithm for Matrix Completion (one can also find it here, here, or here on ArXiv). The abstract reads:

This paper introduces a novel algorithm to approximate the matrix with minimum nuclear norm among all matrices obeying a set of convex constraints. This problem may be understood as the convex relaxation of a rank minimization problem, and arises in many important applications as in the task of recovering a large matrix from a small subset of its entries (the famous Netflix problem). Off-the-shelf algorithms such as interior point methods are not directly amenable to large problems of this kind with over a million unknown entries. This paper develops a simple first-order and easy-to-implement algorithm that is extremely efficient at addressing problems in which the optimal solution has low rank. The algorithm is iterative and produces a sequence of matrices {X_k;Y_k} and at each step, mainly performs a soft-thresholding operation on the singular values of the matrix Y_k. There are two remarkable features making this attractive for low-rank matrix completion problems. The first is that the soft-thresholding operation is applied to a sparse matrix; the second is that the rank of the iterates {X_k} is empirically nondecreasing. Both these facts allow the algorithm to make use of very minimal storage space and keep the computational cost of each iteration low. On the theoretical side, we provide a convergence analysis showing that the sequence of iterates converges. On the practical side, we provide numerical examples in which 1,000 x1,000 matrices are recovered in less than a minute on a modest desktop computer. We also demonstrate that our approach is amenable to very large scale problems by recovering matrices of rank about 10 with nearly a billion unknowns from just about 0.4% of their sampled entries. Our methods are connected with the recent literature on linearized Bregman iterations for l_1 minimization, and we develop a framework in which one can understand these algorithms in terms of well-known Lagrange multiplier algorithms.
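The core iteration, soft-thresholding the singular values and then correcting on the observed entries, is short enough to sketch (a toy Python illustration of the idea only; the paper's actual algorithm relies on sparse-matrix bookkeeping and careful parameter choices, and the values below are my own assumptions):

```python
import numpy as np

def svt_complete(M, mask, tau=5.0, delta=1.2, n_iter=500):
    """Toy singular value thresholding for matrix completion.
    mask is a boolean array marking the observed entries of M."""
    Y = np.zeros_like(M)
    X = np.zeros_like(M)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt  # shrink the singular values
        Y += delta * mask * (M - X)                     # step on the observed entries
    return X

rng = np.random.default_rng(1)
u, v = rng.random(8), rng.random(8)
M = np.outer(u, v)                     # rank-1 ground truth
mask = rng.random(M.shape) < 0.6       # observe roughly 60% of the entries
X = svt_complete(M, mask)
print(np.linalg.norm(mask * (X - M)))  # fit on the observed entries: small
```

Even on this tiny example, the iterates stay low-rank, which is the feature that keeps the real algorithm's storage and per-iteration cost small.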

It is not strictly related to compressive sensing, but it improves on a previous paper and on another concept that sought to draw a parallel between rank minimization and nuclear norm minimization using random linear combinations of the unknown matrix (a la CS). In a similar vein, Thong Do, Yi Chen, Nam Nguyen, Lu Gan and Trac Tran recently presented a way to do this with Structurally Random Matrices. The main difference between these two latter approaches and the one taken here is that in this paper by Jian-Feng Cai, Emmanuel Candès and Zuowei Shen, one has direct access to specific elements of the matrix. In the latter approaches, using either Gaussian or SRM matrices, one has access only to a linear combination of the unknown elements of the matrix. In generic cases of interest, one generally has access to single elements of the otherwise unknown matrix, which makes this paper very interesting. I for one keep wondering in the back of my mind how the Gaussian or SRM linear mapping could have an equivalent in the physical world.

My webcrawler noticed a change on the Bayesian Compressive Sensing website: the BCS code was updated on Aug. 03, 2008, fixing a bug in MT_CS.m for cases where signals were dramatically undersampled.

Finally, I have tried my best to make a compilation of most of the hardware implementation developed with compressive sensing in mind. It is here:

Sometimes, off-the-shelf hardware can be used to do compressed sensing with no hardware change but a different mode of operation (the case of the Herschel observatory or the Chinese Chang'e probe). There is also hardware developed to perform sampling in a different kind of way that does not specifically use compressed sensing (like the hardware developed in computational photography or the coded aperture work) but should! For the time being, I am not adding the latter equipment but soon will, and I don't know yet how to address the former. I am also trying to categorize each of these technologies with regard to their respective Technology Readiness Level (also mentioned here in a French entry). One way to do this is by using the Excel spreadsheet of the latest TRL Calculator.

Tuesday, October 21, 2008

CS: Sparse Recovery Experiments: RIP-1 matrices, Sparse Matching Pursuit and Countmin compared to L1-magic and GPSR.

Piotr Indyk and Radu Berinde just made live a wiki on Sparse Recovery Experiments. As noted in the introduction:

This page contains the Matlab/C toolkit we used to get experimental results for sparse recovery using sparse matrices, reported in BGIKS08,BIR08 and BI08. The code provides the following functionality: 

(i) given an n-dimensional vector x, compute its m-dimensional sketch Ax, where A is an mxn sketch matrix, m much less than n, and 
(ii) given a sketch Ax, recover a k-sparse approximation x* of x. In addition, the code provides subroutines for generating test vectors x and sparse matrices A.

The toolkit supports the following recovery algorithms: Sparse Matching Pursuit (SMP), Countmin, and L1 minimization (although one needs to download either l1-magic or GPSR to use the last option).

These are important experiments, as they show the capability of RIP-1 measurement matrices and the attendant recovery algorithms (Sparse Matching Pursuit and Countmin). I am adding the link to the measurement matrix section as well as the reconstruction section of the big picture. One of the findings is that recovery can be much faster with a dedicated matching pursuit algorithm, but one needs a higher number of measurements to reach the same probability of reconstruction as algorithms like GPSR and l1-magic. While this may look like a bad thing, let us also look at the positive side: if implemented in hardware, these sparse RIP-1 matrices would seem to provide an advantage in terms of a small dynamic range.
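For readers unfamiliar with the setup, step (i) above can be illustrated with a sparse binary matrix having a fixed number of ones per column, in the spirit of the RIP-1/expander constructions (a toy Python sketch, not the toolkit's code; the point-query estimate at the end is a Countmin-style illustration of my own):

```python
import numpy as np

def sparse_sketch_matrix(m, n, d=4, seed=0):
    """m x n binary matrix with exactly d ones per column, the kind of
    sparse (RIP-1 / expander-style) matrix these algorithms rely on."""
    rng = np.random.default_rng(seed)
    A = np.zeros((m, n))
    for j in range(n):
        A[rng.choice(m, size=d, replace=False), j] = 1.0
    return A

n, m = 1000, 100
A = sparse_sketch_matrix(m, n)
x = np.zeros(n)
x[[7, 42, 99, 512, 777]] = [3., -1., 2., 5., -4.]   # a 5-sparse signal
y = A @ x                                            # the m-dimensional sketch
# a Countmin-style point query: estimate x[42] as the median of the d
# measurements that touch coordinate 42 (collisions add perturbations)
rows = np.nonzero(A[:, 42])[0]
print(y.shape, np.median(y[rows]))
```

The sketch is 10x shorter than the signal here, yet a k-sparse approximation remains recoverable; the full recovery algorithms (SMP, Countmin) refine this kind of estimate iteratively.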

Monday, October 20, 2008

CS: Provide feedback to the CS part of Mallat's "A Wavelet Tour of Signal Processing"

Gabriel Peyre just released a set of Matlab experiments to illustrate the third edition of Stephane Mallat's book "A Wavelet Tour of Signal Processing: The Sparse Way", which will feature chapters on sparse representations and compressive sensing.

A "beta version" of the experiments is available at:

Gabriel also mentions to me that he "will be happy to have some feedback from readers of your blog."

Saturday, October 18, 2008

A girl and a clown talk about the restricted isometry property

[ Update: the servers at seem to be very slow sometimes, I am eagerly waiting for them to allow the transfer of these movies on Youtube. For those of you reading this from an RSS feed reader, you may want to come to the site directly to see the video ].

One of the nice things we used in our DARPA car was a simple text-to-speech module: while in the car, the computer would tell us what it was doing (as opposed to our having to read it off a screen). It was very handy in the trials we went through. It looks like we can now do this for the web. I am pretty sure this type of tool could be used for instructional purposes. I, for one, recall discussions more often than items mentioned in papers. Any thoughts?

Friday, October 17, 2008

CS: Euclidean sections of l^N_1 and EC over the reals, Post-doc, CCA, CC, Renyi Entropy, IHT, l_1 norm sparse Bayesian learning

Terry Tao mentions Compressed Sensing in his latest entry on Small samples, and the margin of error. I guess it must have to do with the concept of concentration of measure.

Venkatesan Guruswami, James R. Lee, and Avi Wigderson just released Euclidean sections of l^N_1 with sublinear randomness and error-correction over the reals. The abstract reads:

It is well-known that R^N has subspaces of dimension proportional to N on which the l_1 and l_2 norms are uniformly equivalent, but it is unknown how to construct them explicitly. We show that, for any \delta > 0, such a subspace can be generated using only N^(\delta) random bits. This improves over previous constructions of Artstein-Avidan and Milman, and of Lovett and Sodin, which require O(N log N), and O(N) random bits, respectively. Such subspaces are known to also yield error-correcting codes over the reals and compressed sensing matrices. Our subspaces are defined by the kernel of a relatively sparse matrix (with at most N^(\delta) non-zero entries per row), and thus enable compressed sensing in near-linear O(N^(1+\delta)) time. As in the work of Guruswami, Lee, and Razborov, our construction is the continuous analog of a Tanner code, and makes use of expander graphs to impose a collection of local linear constraints on vectors in the subspace. Our analysis is able to achieve uniform equivalence of the l_1 and l_2 norms (independent of the dimension). It has parallels to iterative decoding of Tanner codes, and leads to an analogous near-linear time algorithm for error-correction over reals.

The SISYPH (Signals, Systems and Physics) research group at the Physics Department of Ecole Normale Superieure de Lyon offers a 12-month postdoctoral position, starting any time after December 1st, 2008. One of the subjects of interest of the group is Compressed Sensing. More information can be found here. This job was added to the CS Job list.

Yves Lucet just released the Computational Convex Analysis Library. The succinct description reads:

The Computational Convex Analysis numerical library contains efficient algorithms to manipulate convex functions, and to compute fundamental convex analysis transforms. The library is coded using Scilab version 4.1.2 (available here), a numerical software freely available.

A list of the available functions can be found here. Let us note that Scilab code maps fairly directly to Matlab.

On top of the abstract, Ping Li also provided a summary for the SODA conference of how his latest Compressed Counting paper brings new insight:

Our contributions include:

1) Compressed Counting (CC) is the first proposal of using skewed projections in data stream computations. We also show that maximally-skewed projections can achieve the best performance.

2) CC is the first algorithm which captures the intuition that, when $\alpha =1$ a simple counter suffices, and when $\alpha = 1\pm\Delta$ with small $\Delta$, the sample complexity should be low. We show the sample complexity of a proposed estimator $= \frac{G}{\epsilon^2}\log\left(\frac{2}{\delta}\right)$, where $G= O\left(\epsilon\right)$ as $\Delta\rightarrow 0$. In other words, for small $\Delta$, CC has the complexity $=O\left({1}/{\epsilon}\right)$. Previous results only achieved $O\left(1/\epsilon^2\right)$ (even for $\alpha =1$), which in general is expected and can not be improved, according to the central limit theorem.

3) We show, when $\alpha = 1\pm\Delta$, the asymptotic variance of a proposed estimator is $\propto \Delta$. As $\alpha \rightarrow 1$, CC achieves ``an infinite improvement'' over the previous results, in terms of the asymptotic variances.

4) Our results (especially when $\alpha=1\pm\Delta\approx 1$) have important applications. In some situation, $\Delta$ might be interpreted as the ``decay rate'' or ``interest rate,'' which is usually small. The method of moments is popular for statistical inference, which requires computing frequency moments. Also, some important summary statistics such as Renyi entropy and Tsallis entropy are functions of frequency moments and hence CC naturally provides an efficient algorithm to compute them. Very importantly, CC may serve the basic building element for developing other algorithms, for example, approximating the Shannon entropy using Renyi entropy and Tsallis entropy with $\alpha\approx 1$.

5) Our results enrich the fundamental theory of skewed stable distributions.
Talking about Renyi entropy, Jean-François Bercher provides us with an understanding of it in Possible rationales for Renyi-Tsallis entropy maximization. The abstract of the paper reads:

Distributions derived from the maximization of Renyi-Tsallis entropy are often called Tsallis’ distributions. We first indicate that these distributions can arise as mixtures, and can be interpreted as the solution of a standard maximum entropy problem with fluctuating constraints. Considering that Tsallis’ distributions appear for systems with displaced or fluctuating equilibriums, we show that they can be derived in a standard maximum entropy setting, taking into account a constraint that displace the standard equilibrium and introduce a balance between several distributions. In this setting, the Renyi entropy arises as the underlying entropy.
Another interest of Tsallis distributions, in many physical systems, is that they can exhibit heavy-tails and model power-law phenomena. We note that Tsallis’ distributions are similar to Generalized Pareto distributions, which are widely used for modeling the tail of distributions, and appear as the limit distribution of excesses over a threshold. This suggests that they can arise in many contexts if the system at hand or the measurement device introduces some threshold. We draw a possible asymptotic connection with the solution of maximum entropy. This view gives a possible interpretation for the ubiquity of Tsallis’ (GPD) distributions in applications and an argument in support to the use of Renyi-Tsallis entropies.

Thomas Blumensath and Mike Davies have just released A Simple, Efficient and Near Optimal Algorithm for Compressed Sensing, an ICASSP '09 paper featuring their Iterative Hard Thresholding solver. The abstract reads:
When sampling signals below the Nyquist rate, efficient and accurate reconstruction is nevertheless possible, whenever the sampling system is well behaved and the signal is well approximated by a sparse vector. This statement has been formalised in the recently developed theory of compressed sensing, which developed conditions on the sampling system and proved the performance of several efficient algorithms for signal reconstruction under these conditions. In this paper, we prove that a very simple and efficient algorithm, known as Iterative Hard Thresholding, has near optimal performance guarantees rivalling those derived for other state of the art approaches.
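The algorithm itself is strikingly simple: a gradient step on the residual followed by hard thresholding to the k largest entries. Here is a minimal sketch (my own toy Python version with a conservative step size; the parameter choices are assumptions, not the authors' exact settings):

```python
import numpy as np

def iht(y, A, k, n_iter=500):
    """Iterative Hard Thresholding: x <- H_k(x + mu * A^T (y - A x)),
    where H_k keeps only the k largest-magnitude entries."""
    mu = 1.0 / np.linalg.norm(A, 2) ** 2     # conservative step size
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + mu * A.T @ (y - A @ x)       # gradient step on ||y - Ax||^2
        x[np.argsort(np.abs(x))[:-k]] = 0.0  # zero all but the k largest entries
    return x

rng = np.random.default_rng(0)
n, m, k = 256, 100, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)    # random Gaussian sensing matrix
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = [1.5, -2.0, 1.0, 3.0, -1.0]
y = A @ x_true                                   # noiseless undersampled measurements
x_hat = iht(y, A, k)
print(np.linalg.norm(x_hat - x_true))            # should be tiny
```

Each iteration costs only two matrix-vector products and a partial sort, which is what makes the near-optimal guarantees of the paper so remarkable.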

Finally, the Ph.D thesis of Yuanqing Lin entitled l(1)-norm sparse Bayesian learning: Theory and applications is now available.

Image Credit: NASA/JPL/Space Science Institute, view of Enceladus from the Cassini spacecraft, photo taken on October 9th, 2008.

Thursday, October 16, 2008

CS: PPPA, A Predual Proximal Point Algorithm solving a Non Negative Basis Pursuit Denoising model

François Malgouyres and Tieyong Zeng have released a newer version of A Predual Proximal Point Algorithm solving a Non Negative Basis Pursuit Denoising model. The abstract reads:
This paper develops an implementation of a Predual Proximal Point Algorithm (PPPA) solving a Non Negative Basis Pursuit Denoising model. The model imposes a constraint on the l2 norm of the residual, instead of penalizing it. The PPPA solves the predual of the problem with a Proximal Point Algorithm (PPA). Moreover, the minimization that needs to be performed at each iteration of PPA is solved with a dual method. We can prove that these dual variables converge to a solution of the initial problem. Our analysis proves that we turn a constrained non differentiable convex problem into a short sequence of nice concave maximization problems. By nice, we mean that the functions which are maximized are differentiable and their gradient is Lipschitz. The algorithm is easy to implement, easier to tune and more general than the algorithms found in the literature. In particular, it can be applied to the Basis Pursuit Denoising (BPDN) and the Non Negative Basis Pursuit Denoising (NNBPDN) and it does not make any assumption on the dictionary. We prove its convergence to the set of solutions of the model and provide some convergence rates. Experiments on image approximation show that the performances of the PPPA are at the current state of the art for the BPDN.
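While the PPPA itself is more involved, the non-negative soft-thresholding operation that l1/dual schemes of this kind build on is easy to state (a minimal illustration of that single building block, not the authors' algorithm):

```python
import numpy as np

def nn_soft_threshold(z, lam):
    """Proximal operator of lam * ||x||_1 under the constraint x >= 0:
    one-sided soft thresholding, max(z - lam, 0)."""
    return np.maximum(z - lam, 0.0)

z = np.array([-1.0, 0.2, 0.5, 2.0])
print(nn_soft_threshold(z, 0.3))   # negative and sub-threshold entries are zeroed
```

The differentiable, Lipschitz-gradient dual problems the abstract mentions are what allow each such inner step to be solved with standard smooth optimization.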

They also have made available the code here. It features Matlab codes, scripts and results (30 MB). Scripts are adapted to the SPARCO toolbox so that it can be compared to other algorithms (GPSR, PCD, IT). This is a great initiative that deserves much praise.