Friday, November 28, 2008

CS: Compressive Sampling of Non-Negative Signals, A Question, a meeting and two comments.

When reference [2] came out a year and eight days ago, we hinted on the subject. More recently, a month ago, we saw a connection between Linear Programming and Non-Negative Matrix Factorization (NMF). In that entry, it was also recalled some of the empirical findings of folks like Gabriel Peyre i.e. that NMF did not seem to need a sparseness constraint to yield a sparse solution. Well it looks like Paul O’Grady and Scott Rickard are making the connection between CS and NMF more obvious in: Compressive Sampling of Non-Negative Signals. The abstract reads:
Traditional Nyquist-Shannon sampling dictates that a continuous time signal be sampled at twice its bandwidth to achieve perfect recovery. However, It has been recently demonstrated that by exploiting the structure of the signal, it is possible to sample a signal below the Nyquist rate and achieve perfect reconstruction using a random projection, sparse representation and an l_1-norm minimisation. These methods constitute a new and emerging theory known as Compressive Sampling (or Compressed sensing). Here, we apply Compressive Sampling to non-negative signals, and propose an algorithm—Non-negative Underdetermined Iteratively Reweighted Least Squares (NUIRLS) —for signal recovery. NUIRLS is derived within the framework of Non-negative Matrix Factorisation (NMF) and utilises Iteratively Reweighted Least Squares as its objective, recovering non-negative minimum l_p-norm solutions, 0  p  1. We demonstrate that—for sufficiently sparse non-negative signals—the signals recovered by NUIRLS and NMF are essentially the same, which suggests that a non-negativity constraint is enough to recover sufficiently sparse signals.

The associated talk is here. Following up on an e-mail exchange, Paul  also made his code available and added:
...I have placed the code for NUIRLS on my website at:

The main aim of my MLSP08 paper is to demonstrate that sparse norms are not always necessary for the special case where CS is entirely non-negative. Furthermore, in December I will present a paper [1] at a workshop in England where NMF is performed in the CS domain---recovering sparse coefficients AND a sparse basis from a compressively sampled matrix. 
I can't wait to see the paper [1]. I then replied asking him:
Is it your experience that NUIRLS is fast ?
Paul kindly responded with:
My experience with standard NMF is that it is fast, further extensions such as adding sparseness constraints to convolutive models make it slow, when you include reweighting into the objective---as is the case for NUIRLS---it becomes very very slow. The graphs in the MLSP paper took a number of weeks. luckily though, we can use standard NMF updates if the signal is sufficiently sparse.
Much of the geometric argument [3] has the positivity constraint built into it and it looks to me that the general use of the argument based on the Restricted Isometry Property (RIP) is there to avoid the positivity argument of the signal. We also know that RIP, on top of being difficult to prove, is not a tight constraint. We also know that the positivity constraint can be shown to be a powerful one [5] and that the geometric argument can be expanded to non-sparse signals [4]. So my question is:

 Is CS prominently about Positivity or Sparsity ?

If it is about positivity, would message passing algorithms be faster than current algorithms ?

The Workshop in England is near Oxford and is hosted by the IMA. It will feature many talks on CS. I have added the date to the calendar. One can register for the meeting until December 8th. I am adding the link to the NUIRLS algorithm to the Big Picture section on code reconstruction.

About yesterday's entry, Piotr Indyk made the following comment on my characterization of their work/table:

A quick comment: Anna's table actually contains *all* best known (to us) bounds for sparse recovery that work for arbitrary signals. I.e., it is not limited to schemes based on RIP-1 principle, but it also contains schemes based on usual RIP and Euclidean width properties. You can tell that by the last column (RIP-1 property leads to approximation bounds in the l1 norm, while other properties yield mixed norm or l2 norm bounds).
Thanks Piotr for the correction!

In this entry, I mentioned a dimensionality reduction toolbox with many different techniques created by Laurens van der Maaten. Alas, the link is not working anymore. Taraz Buck noted the following:

The license allows for it, so I've posted a copy of v0.4b on my site:

Taraz also added on his site:
The license allows for redistribution non-commercially, as you can see in the file's includedReadme.txt. If you have a newer version or his site comes back up, please let me know.
Thank you Taraz!

[1] Paul D. O'Grady and Scott T. Rickard. Non-negative Matrix Factorisation of Compressively Sampled Non-Negative Signals. In Proceedings of the 8th IMA International Conference on Mathematics in Signal Processing , December 2008, Cirencester UK.

Thursday, November 27, 2008

CS: Compressive Sensing Workshop, ANR Memoire, CS of a phantom, Counter Braids on FPGAs, l_0 reconstruction, RIP-1 table, two talks and some humor

One could only wish Louis Armstrong was right.

From the Rice repositoryLarry Carin and the other folks at Duke and AFRL are organizing a Compressive Sensing Workshop:
Compressive-Sensing Workshop

February 25 & 26, 2009
Location: Duke University
Sponsored by the AFRL ATR Center
Co-sponsored by AFOSR, ARO, DARPA, NGA and ONR

This workshop will bring together leading experts in the new field of compressive sensing (CS). The meeting will focus on new theory, algorithms and applications of CS. The format of the meeting will include formal talks, panel discussions and group interaction, as well as poster sessions. In addition to having talks from many of the leading CS researchers from academia, there will be talks from the members of the US government and industry, on directions for this new technology, including future research directions and programs.

The meeting is jointly organized by Lawrence Carin (Duke) and Gregory Arnold (AFRL), who are affiliated with the AFRL ATR Center in Dayton, Ohio.

The local Duke hosts for the workshop are Lawrence Carin, David Brady, Mauro Maggioni, Xiaobai Sun, and Rebecca Willet.

The following members of the academic community have accepted invitations to present talks:

  • Rich Baraniuk, Rice
  • Thomas Blumensath, University of Edinburgh (England)
  • Rob Calderbank, Princeton
  • Ronald Coifman, Yale
  • Ronald Devore, U South Carolina
  • Yonina Eldar, Technion (Israel)
  • Donald Goldfarb, Columbia
  • Mario Figueiredo, Instituto Superior Tecnico (Portugal)
  • Kevin Kelly, Rice
  • Jian Li, U Florida
  • James McClellan, Georgia Tech
  • Kevin Murphy, UBC (Canada)
  • Mark Neifeld, U Arizona
  • Rob Nowak, U Wisconsin
  • Stanley Osher, UCLA
  • George Papanicolaou, Stanford
  • Lee Potter, Ohio State
  • Justin Romberg, Georgia Tech
  • Guillermo Sapiro, U Minnesota
  • Matthias Seeger, Max Planck Institute for Informatics (Germany)
  • Joel Tropp, Caltech
  • Martin Wainwright, UC Berkeley
  • David Wipf, UC San Francisco
There will also be talks from members of the US government, including DARPA and NGA, as well as talks from members of industry.

Participation: The meeting is open to all who wish to participate. If interested in being considered for a contributed talk or poster, please contact Lawrence Carin, at

Student presentations: There will be a session dedicated to talks by graduate students. The presentation deemed to be of highest technical content, and presented best, will receive a $500 prize. Please contact Lawrence Carin at if you would like to participate in the grad-student session.

Schedule: The detailed agenda is being finalized, and will be posted soon. Talks will start at 8am on February 25, and end on February 26 at 5pm. The hotel room block is for the evenings of February 24 and 25, and the hotel is offering a special rate for February 26 for those who wish to stay an additional night.

Registration: There will be a $250 registration fee, which covers all meeting participation, as well as breakfast and lunch on Feb. 25 and 26, and dinner on Feb. 25. The registration fee will be collected at the meeting. For members of the AFRL ATR Center as well as graduate students with valid student ID, the registration fee will be half-priced.

Location and housing: The meeting will be held at the Washington Duke Inn and Golf Club. Room reservations may be made by calling 800-443-3853; when making reservations, mention the group name "AFRL" to get the special rates that have been negotiated. For government personnel, the government per-diem rate of $95/night will be honored; for non-government guests the rate is $125/night. To get these rates, reservations must be completed by February 6, 2009. We have negotiated a very special rate for this meeting, so please make reservations by this date; nightly rates are much higher after that date.

Banquet: A dinner will be held on February 25, with this included in the registration fee. There will be a performance by The Pitchforks.

Travel: The Raleigh-Durham airport (RDU) is a short distance from Duke University. It is recommended that guests take a cab ride from the airport, as there will be no need for transportation on the Duke campus
It was added to the calendar.

The French Agence Nationale de la Recherche just came out with a new RFP entitled DOMAINES ÉMERGENTS ET PROGRAMME PHARE «MEMOIRE». I  note the mention of Compressed Sensing:

Une problématique liée à la fusion et la représentation de grand volume de données ainsi qu’à la possibilité d’acquérir des données de plus en plus résolues, doivent conduire à repenser la problématique de l’échantillonnage ainsi que celle de la compression. De nouvelles voies, à l’image des techniques de compressed sensing doivent être proposées et développées.
Proposals are due February 26th, 2009. More information can be found here.

Thales has two internships for people coming from french engineering schools in the area of compressed sensing. I hesitate to put them in the jobs section as it seems to be very narrow in terms of the type of people they are looking for. The announcements are here and here.

Jon Dattorro tells me that the article mentioned here  is now available at:

on the right hand side of the side in Revision History, one can see: Latest electronic version: v2008.11.03 (756 pages). Click on this document and search for "Compressive sampling of a phantom". The associated Matlab code is here.

For those of you not reading the blog directly, I have updated yesterday's entry to reflect on an inaccurate statement I made on Leo Grady's upcoming presentation. Thanks Leo for the correction!

While we are on the subject of l_0 reconstruction, I noticed an updated presentation in wikimization by Christine Law and Gary Glover entitled Toward 0-norm Reconstruction, and a Nullspace Technique for Compressive Sampling. Since the earlier version, I note a fuller presentation and the appearance of a reference to Stephane Chretien's work.

Also in the same wikimization page, I note an expanded version of Anna Gilbert's talk entitled Combining Geometry and Combinatorics: A Unified Approach to Sparse Signal Recovery. Namely the famous table showing the progress over time of the matrices found the combinatorial approach:

As a visual person, I like simpleton color coding scheme. Thanks Anna! There is also more in the application side of the presentation.

With the recent mention of Counter Braids and their connection to CS, I forgot to mention its implementation on hardware in Prototyping Counter Braids on NetFPGA by Jianying Luo, Yi Lu  and Balaji Prabhakar. The abstract reads:

We recently proposed Counter Braids, an SRAM-only counter architecture for high-speed per-flow counting. Accurate per-flow counting was deemed complex and expensive because of the need for large arrays of counters operating at very high link speed (e.g. 40 Gbps). A naive algorithm needs an infeasible amount of SRAM to store both the counters and a flow-to-counter association rule, so that arriving packets can update corresponding counters at link speed. Counter Braids avoids the storage of flow-to-counter association by using random graphs generated on-the fly with hash functions. The counts of all flows remain compressed in SRAM at all times and incoming packets update directly the compressed counts, that is, no decompressing is necessary for updates. The compressed counts are transferred to software at the end of a measurement epoch and almost all flows are recovered exactly with a message passing decoding algorithm. One significant advantage of Counter Braids is the ease of implementation in hardware and the simplicity of updates. By prototyping Counter Braids on a NetFPGA board, we prove these claims in this paper. This particular implementation of Counter Braids achieves a 30-fold reduction in space compared to a naive hash-table based implementation.
It looks we may just have to wait for three years before we have those on our desk as new FPGAs are coming out with some new technology developement promising to make them cheaper (Memristors Make Chips Cheaper).

And finally, two talks and some humor:

Midwest Theory Day,  Dec. 6, 2008, 10am-5:00pm, Fine Barr Forum, Allen Center, Northwestern U., Evanston IL.Guangwu Xu (U. of Wiscconsin, Milwaukee), "Restricted Isometry Properties and Compressed Sensing":
In this talk, I will discuss constrained $\ell_1$ minimization methods in a unified framework for the recovery of high dimensional sparse signals in three settings: noiseless, bounded error and Gaussian noise. I will describe some improvements to the existing results in the literature by weakening the RIP conditions and tightening the error bounds. The improvement on the conditions shows that signals with larger support can be recovered accurately. I will also discuss connections between restricted isometry property and the mutual incoherence property, and present some extensions to the results of Candes, Romberg and Tao, and Donoho, Elad, and Temlyakov.
Joint work with Tony Cai and Jun Zhang

In many engineering applications, notions such as order or dimension of a model can be expressed as the rank of an appropriate matrix. To choose simple models, we seek low-rank matrices. The matrix rank minimization problem arises in linear system identification, statistical modeling of random processes, and data dimensionality reduction.

The rank minimization problem is known to be computationally intractable. 
This talk discusses a convex relaxation based on semidefinite programming, and presents recent results in understanding under what conditions the relaxation is guaranteed to work. We discuss connections with the field of Compressed Sensing, which has demonstrated that sparse signals and images can be recovered from apparently incomplete sets of measurements. We ask what other objects or structures might also be recovered from incomplete, limited measurements, and present results on recovering matrices from partial information. Such problems are generally underdetermined, but when the matrix is known to be low rank or approximately low rank, the search for solutions becomes feasible. Applications arise in machine learning, system identification, and signal processing.

Credit video: Igor Carron. Credit Song: Louis Armstrong.

Wednesday, November 26, 2008

CS: Fixed point and Bregman iterative methods for matrix rank minimization, A Fast Posterior Update for Sparse Undetermined Linear Models, Three talks

Aleks Jakulin alerts us to the fact ( in Netflix Prize scoring function isn't Bayesian ) that with examples like Napoleon Dynamite, the grading system used by Netflix is throwing all the nice recommending systems off. This doesn't stop Shiqian Ma, Donald Goldfarb, Lifeng Chen from trying to improve the current approaches to nuclear norm minimization instead of rank minimization in the same spirit as minimizing the l_1 norm instead of the l_o norm in traditional compressed sensing. Their paper is entitled: Fixed point and Bregman iterative methods for matrix rank minimization. The abstract reads:
The linearly constrained matrix rank minimization problem is widely applicable in many fields such as control, signal processing and system identification. The linearly constrained nuclear norm minimization is a convex relaxation of this problem. Although it can be cast as a semidefinite programming problem, the nuclear norm minimization problem is expensive to solve when the matrices are large. In this paper, we propose fixed point and Bregman iterative algorithms for solving the nuclear norm minimization problem and prove convergence of the first of these algorithms. By using a homotopy approach together with an approximate singular value decomposition procedure, we get a very fast, robust and powerful algorithm that can solve very large matrix rank minimization problems. Our numerical results on randomly generated and real matrix completion problems demonstrate that this algorithm is much faster and provides much better recoverability than semidefinite programming solvers such as SDPT3.

Lee PotterPhil Schniter, and Justin Ziniel just released A Fast Posterior Update for Sparse Undetermined Linear Models. The abstract reads:
A Bayesian approach is adopted for linear regression, and a fast algorithm is given for updating posterior probabilities. Emphasis is given to the underdetermined and sparse case, i.e., fewer observations than regression coefficients and the belief that only a few regression coefficients are non-zero. The fast update allows for a low-complexity method of reporting a set of models with high posterior probability and their exact posterior odds. As a byproduct, this Bayesian model averaged approach yields the minimum mean squared error estimate of unknown coefficients. Algorithm complexity is linear in the number of unknown coefficients, the number of observations and the number of nonzero coefficients. For the case in which hyperparameters are unknown, a maximum likelihood estimate is found by a generalized expectation maximization algorithm.

We also have three talks on CS in the coming week:

Terry Tao will give a lecture in Norway entitled "Compressed sensing", in Auditorium EL2, Electrical Engineering Building , NTNU Gløshaugen, at 11:15–12:00 on Tuesday, December 9th. 

Applied Mathematics Colloquium at Caltech, Monday December 1, 2008, 4:00 PM, 101 Guggenheim Lab, Lees-Kubota Lecture Hall, Title: New, Fast and Effective Algorithms for Imaging, Compressed Sensing and Related Problems, with Applications by Stanley Osher, UCLA

and finally,  

Leo Grady, Senior Research Scientist at Siemens Corporate Research will talk at MIT on Wed 12/3 at 4pm, Room 3189 (McGovern Institute Seminar Room), MIT Building 46 (Brain and Cognitive Sciences)

An ideal image segmentation algorithm could be applied to the segmentation of objects (e.g., tissue classes) without any adjustment for image acquisition device or application. However, a general-purpose, multiway segmentation of objects in an image/volume remains a challenging problem. In this talk, I will describe a recently developed approach to this problem this has been successful in several Siemens products. This segmentation approach inputs a few labeled points from a user (e.g., from mouse clicks) and produces a segmentation by computing the probabilities that a random walker leaving unlabeled pixels/voxels will first strike the labeled set. These probabilities may be computed analytically and deterministically by noting the exact mathematical equivalence with a combinatorial Laplace equation. The solution of the combinatorial Laplace equation admits interpretation of the algorithm as a steady-state electrical circuit simulation or as a minimization of the Dirichlet energy. Going beyond the segmentation problem, we may employ this energy minimization interpretation to minimize the p-norm of the spatial gradient of any function over neighboring pixels. By setting p=0 (i.e., maximizing sparsity), I will
show how we can also apply this energy minimization algorithm to image reconstruction for compressed sensing. Although compressed sensing and image segmentation appear to be very different problems, the conclusion of this work is that a common energy minimization approach may used to produce very good results for both problems.
I like the fact that there is a mention of an actual implementation of an l_1 minimzation in actual working products. 
[ Update: Leo just corrected me, the inclusion in Hardware is not the CS part. Furthermore, p=0 is a give away that says he is more interested in solving an l_0 problem than an l_1 one. I screwed up on this one! ]

These events have been added to the calendar.

Credit: NASA/JPL/Space Science Institute, Prometheus Brings Change to the F Ring

Monday, November 24, 2008

CS: Computing Verifiable Sufficient Conditions for Sparse Signal Recovery via l_1 Minimization.

Anatoli Juditsky  and  Arkadii Nemirovski have just released the code associated with their paper entitled: On Verifiable Sufficient Conditions for Sparse Signal Recovery via l_1 Minimization. To remind you what this is about, let me quote again the abstract of the paper: 
We propose novel necessary and sufficient conditions for a sensing matrix to be "$s$-good" -- to allow for exact $\ell_1$-recovery of sparse signals with $s$ nonzero entries when no measurement noise is present. Then we express the error bounds for imperfect $\ell_1$-recovery (nonzero measurement noise, nearly $s$-sparse signal, near-optimal solution of the optimization problem yielding the $\ell_1$-recovery) in terms of the characteristics underlying these conditions. Further, we demonstrate (and this is the principal result of the paper) that these characteristics, although difficult to evaluate, lead to verifiable sufficient conditions for exact sparse $\ell_1$-recovery and to efficiently computable upper bounds on those $s$ for which a given sensing matrix is $s$-good. We establish also instructive links between our approach and the basic concepts of the Compressed Sensing theory, like Restricted Isometry or Restricted Eigenvalue properties.
 The code is located here:

Arkadii Nemirovski also tells me that the code doesn't rely on the MOSEK commercial code anymore.

I note that they mention the following with regards to the other computational approach of  Alexandre d'Aspremont and Laurent El Ghaoui (Testing the Nullspace Property using Semidefinite Programming):
Besides, [11] proposes an efficiently computable upper bound on γ^_s(A) based on semidefinite relaxation; this bound is essentially different from our, and it could be interesting to find out if one of these bounds is “stronger” than the other.
First of all, let's call γ^_s(A, \beta)  the Juditsky-Nemirovski constant or JN constant in order to differentiate it from the Restricted Isometry Constant. Second, making their code available is sure to provide much in the way of comparing these bounds in the future. 

Why is this matter important to the Compressive Sensing community ? 

As explained earlier, currently much of the compressive sensing hardware development is "contrained" in trying to map the hardware acquisition process into measurement matrices that are known to fullfill the Restricted Isometry Property (take for instance Roummel Marcia and Rebecca Willett's coded aperture architecture in Compressive Coded Aperture Superresolution Image Reconstruction - additional material can be found here while the slides are here -). And so while it is a very interesting exercise on its own, it could be argued that in order to be more universal, we need to have tools for people to check if their own measurement matrices or multiplexing algorithms or PDE discretization schemes have the possibility of reconstructing a sparse solution using Linear Programming techniques. In other words, we now have a new tool for evaluating underdetermined systems that arise in many areas of science and engineering. 

Credit: NASA/JPL-Caltech/University of Arizona/Texas A&M, Last photo taken by Phoenix on Mars.

Sunday, November 23, 2008

Is this Spam 2.0 ?

For the past week or so I have gotten two requests from people who want to guest blog even though they really did not have much expertise on the subject areas covered here, except if you consider that I write entries on Health/Medical/Fitness. I have also noticed an increased amount of traffic from two IP addresses:

As you can see, the number of visits are not affected but the number of pageviews are. These two IP addresses and the bots behind them are behaving quite differently.

The first bot comes from: (Limelight Networks Inc) 
and it is based in Tempe, Arizona but sometimes it is doing the same thing from an IP near D.C. The bot seems to go to only one address:

It is baffling. If it were intelligent, one would expect it to go beyond that address but it doesn't as if it were expecting a new item for this posting. The bot comes back several times a day (maybe five to ten times a day). Yesterday, it came at the following hours:


The second bot is the one causing this past week's spike in page views. The bot's IP is:

80.56.211.#  or (Cpe Customers Nl) in Zuid-holland, Rotterdam, Netherlands.
The bot comes back every 65 seconds for periods of hours and goes to the main blog address. If you are using a bot to check on any new item on this blog, please change its setting to do so only once or twice a day (like the one I use), not a thousand times. Then again, I am expecting it to be a spam bot. 

But what is the purpose of these spam bots ? I don't get it.

Friday, November 21, 2008

CS: The Design of Compressive Sensing Filter: Two possible hardware implementations and maybe many more.

I had a great conversation the day before yesterday with Frank Nielsen and Ramesh Raskar. More on that later, in the meantime, I was also directed to the new arxiv preprint entitled :

The Design of Compressive Sensing Filter by Lianlin Li, Wenji Zhang, Yin Xiang, Fang Li. The abstract of the paper reads:

In this paper, the design of universal compressive sensing filter based on normal filters including the lowpass, highpass, bandpass, and bandstop filters with different cutoff frequencies (or bandwidth) has been developed to enable signal acquisition with sub-Nyquist sampling. Moreover, to control flexibly the size and the coherence of the compressive sensing filter, as an example, the microstrip filter based on defected ground structure (DGS) has been employed to realize the compressive sensing filter. Of course, the compressive sensing filter also can be constructed along the identical idea by many other structures, for example, the man-made electromagnetic materials, the plasma with different electron density, and so on. By the proposed architecture, the n-dimensional signals of S-sparse in arbitrary orthogonal frame can be exactly reconstructed with measurements on the order of Slog(n) with overwhelming probability, which is consistent with the bonds estimated by theoretical analysis.
I understand from talking to Lianlin Li that there will be several iterations of this very interesting manuscript. The technique itself reminds me very much of an early paper by Joel Tropp, Michael Wakin, Marco Duarte, Dror Baron, and Richard Baraniuk entitled Random Filters For Compressive Sampling And Reconstruction. The reason that paper got my interest was when I first read the following sentence:
At first glance, one might think this method would convert a signal into garbage.
You don't often see that type of assessment in publications :-)

The paper by Lianlin Li also reminds me of a previous paper Vladimir Ignatovich and I wrote on multilayer systems a while back. Maybe I should look into it....hmmm...Anyway, I very much like the fact that they mention that..
..the ionosphere can be looked as the natural compressive sensing measurement system.
If there is a way to figure out the layering in the atmosphere then I think it would end up being a very nice remote sensing mechanism not unlike what Ivana Jovanovic does in her Ph.D thesis entitled Inverse problems in acoustic tomography (mentioned here).

Thursday, November 20, 2008

CS: GPUCamp, a job, Super-Resolution using Finite Rate of Innovation.

Matthias Seeger has an opening for a PhD Student / Postdoctoral Researcher in Machine Learning, Image Processing

Probabilistic Machine Learning and Medical Image Processing, Saarland University, Saarbruecken, Germany

Fully-funded PhD/postdoc positions are available in the recently established Probabilistic Machine Learning group headed by Matthias Seeger (PhD). PhD training is conditional on acceptance to the International Max Planck Research School for Computer Science (based on evaluation of research proposal and oral presentation, after first six months).

Recent breakthroughs in large-scale approximate Bayesian inference for sparse continuous variable models allow nonlinear Bayesian experimental design (active learning) and compressed sensing to be applied to sampling optimization of magnetic resonance imaging. Details about these projects.

Saarland University is among the leading computer science faculties in Europe, with world-class groups in computer graphics, theory of algorithms and programming languages, theoretical CS, and bioinformatics, among others. It features a unique accumulation of top-ranked CS research institutes (Max Planck Institute for Informatik, Max Planck Institute for Software Systems, DFKI). Within the recently established interdisciplinary MMCI Cluster of Excellence, 20 independent research groups are working in areas with strong overlaps to core machine learning application areas. Saarbruecken is dedicated to excellent postgraduate education, structured according to international standards in the International Max Planck Research School for Computer Science (courses taught in english).

The Probabilistic Machine Learning group focusses on theory and applications of approximate Bayesian inference, and its scalable reduction to standard methods of scientific computing (numerical mathematics, efficient algorithms, signal processing, parallel processing). We closely collaborate with the Center for High-field Magnetic Resonance, Max Planck Institute for Biological Cybernetics, Tuebingen (with a range of MR scanners dedicated to basic research), and have close ties to the Empirical Inference group (headed by Bernhard Schoelkopf) at the same institute, beyond close connections to top machine learning groups in the UK and US.

We are looking for highly motivated, research-oriented individuals with an excellent grasp of the mathematics underlying approximate Bayesian inference, or/and numerical optimization and mathematics, or/and image and signal processing. A strong theoretical background in a field relevant to analysis of statistical methods, or/and keen interest and capabilities in large-scale scientific programming are required.

Please be sure to include the following in your application:

  • Curriculum vitae
  • Statement of research interests (1 page)
  • Letters of reference (1-3) from referees able to comment on your work and academic standing (PhD/MSc thesis advisor, supervisor for internships)
  • Sample of your strongest work (first-author paper in peer-reviewed journal/conference, MSc or PhD thesis, term project paper (with official record attesting your authorship)) in the rough area of interest
  • Transcript of studies (for PhD applicants)

Applications should be sent by e-mail to Matthias Seeger. If you happen to attend the forthcoming Neural Information Processing Systems conference, please make yourself known to me there.

Matthias Seeger goes further in the description of his projects, please note the question in the compressive sensing section:

Currently, a number of PhD/postdoc positions are available for exceptional candidates with strong interests in applications of approximate Bayesian inference. Aspects on several levels of these projects lead into previously rather unchartered terrain, with much original work still to be done:

  • Supporting MRI sequence design by Machine Learning
    Some ML work has been done on analyzing MR images (mostly fMRI) after they have been recorded, or on denoising images. Our interest is in supporting decisions about how images are measured in the first place. On the other hand, we will also identify and address applications of Machine Learning and Bayesian techniques to MRI problems other than sequence design (for example, estimation and compensation of gradient or main field inhomogeneities, robust phase difference estimation, combination of measurements from coil arrays, or motion correction).
  • Nonlinear (Bayesian) Experimental Design
    Bayesian (or classical) ED for Gaussian MRFs is well-developed, but we are not aware of work on the scale of interest here for non-Gaussian models, which are much more useful in practice for a number of reasons. Moreover, there seem to be no theoretical results about Bayesian ED with variational posterior approximations.
  • Compressed sensing for images (that really works)
    Compressed sensing is very fashionable right now in signal processing, and MRI is often cited as a useful application. However, as noted above, present theory has little relevance for the practical problem. What are the relevant properties of a good (or optimal) design in the context of sparse reconstruction of real MR images?
  • Large scale approximate inference for sparse generalized linear models
    Bayesian ED needs inference, beyond penalized optimization for the posterior mode. This seems more difficult, and is almost untouched by theoretical statistics so far (while more and more results about sparse penalized estimation are obtained). Our relaxation is provably convex, inviting theoretical analysis in principle.
    Moreover, our novel algorithms, orders of magnitude faster than previous methods for inference over images, can be applied to generalized linear models as well, opening up a host of new applications of Bayesian inference beyond MAP estimation.

The annoucement will be added to the Compressive Sensing Job section.

and also I found two papers using Finite Rate of Innovation:

Exact Feature Extraction using Finite Rate of Innovation Principles with an Application to Image Super-resolution by Loic Baboulaz and Pier Luigi Dragotti. The abstract reads:

The accurate registration of multiview images is of central importance in many advanced image processing applications. Image super-resolution, for example, is a typical application where the quality of the super-resolved image is degrading as registration errors increase. Popular registration methods are often based on features extracted from the acquired images. The accuracy of the registration is in this case directly related to the number of extracted features and to the precision at which the features are located: images are best registered when many features are found with a good precision. However, in low-resolution images, only a few features can be extracted and often with a poor precision. By taking a sampling perspective, we propose in this paper new methods for extracting features in low resolution images in order to develop efficient registration techniques. We consider in particular the sampling theory of signals with infinite rate of innovation [10] and show that some features of interest for registration can be retrieved perfectly in this framework, thus allowing an exact registration. We also demonstrate through simulations that the sampling model which enables the use of infinite rate of innovation principles is well-suited for modeling the acquisition of images by a camera. Simulations of image registration and image super-resolution of artificially sampled images are first presented, analyzed and compared to traditional techniques. We finally present favorable experimental results of super-resolution of real images acquired by a digital camera available on the market.

Some videos can be found on Loic Baboulaz's page.

Looking back at the images we took from a NASA stratospheric balloons, it looks like I do not have a high concentration of images for this implementation (40). All the panoramas are listed here, original photos from the flight can be found here. The camera worked for several hours but we were limited by the 4GB SD card. More information can be found here and here. I launched this student project as a direct response to the deplorable rescue effort that occured after Katrina. All these photos are freely available for all your benchmarks with attribution.

Geometry-Driven Distributed Compression of the Plenoptic Function: Performance Bounds and Constructive Algorithms by Nicolas Gehrig and Pier Luigi Dragotti. The abstract reads:

In this paper, we study the sampling and the distributed compression of the data acquired by a camera sensor network. The effective design of these sampling and compression schemes requires, however, the understanding of the structure of the acquired data. To this end, we show that the a-priori knowledge of the configuration of the camera sensor network can lead to an effective estimation of such structure and to the design of effective distributed compression algorithms. For idealized scenarios, we derive the fundamental performance bounds of a camera sensor network and clarify the connection between sampling and distributed compression. We then present a distributed compression algorithm that takes advantage of the structure of the data and that outperforms independent compression algorithms on real multi-view images.

Credit Photo: Franky De La Garza, Jay Gerber, Sara Guest, Raymond Mendoza, Ramon Rivera, Karen Villatoro, Pamela Withrow, John Yezak, Igor Carron, Pedro Davalos.

Wednesday, November 19, 2008

CS: The DUMBEST algorithm, Mindmaps, Counter Braids, two talks

While reflecting on an entry by Andres Corrada-Emmanuel entitled Guaranteeing Accuracy and Precision, it brought back to memory an old and dumb idea of what I would call the "DUMB comprEssive Sensing reconsTruction Algorithm" or the DUMBEST algorithm for short. The idea is that the minimum number of projections is only known asymptotically. However, in practice, let's say one has 150 such random projections and no satisfactory reconstruction: what happens when you want to reconstruct your initial signal with another projection (making the total number of projections 151). It so happens that there is the possibility that reconstruction could be perfect with only 150 measurements given that one uses the new projection and 149 others taken from the initial set of 150. The point of the DUMBEST algorithm would be that everytime a new projection is added to the set of measurements of size n, one would try to reconstruct a combinatorial number of measurements (n n-1) = n using the new projection and removing one previously belonging in the initial set. One could even go further in the recursion. One could also perform these reconstructions using different solvers and use a voting mechanism to single out the actual reconstructed signal. I realize it really is a dumb algorithm, but in some cases, even a compressive sensing measurement could be extremely costly to obtain and one would probably welcome any means of recovering the initial signal with the minimum amount of measurements. Part of the thinking for this algorithm comes from two hunches:
  • Different families of reconstruction algorithms (LP, IT, Greedy, Reweighted .....) converge toward to the same solutions exactly or to an epsilon. However, different algorithms are shown to converge in theory and practice with different minimum numbers of measurements. This is rather unsettling. how do different reconstruction solver qualitatively handle the same set of CS measurements ? If a solution is found with two solvers from different families of algorithm, we have probably reached a solution.
  • Let us say that two projecting vectors are close to being colinear. If a vector were to be decomposed in a sum of these two vectors, its coefficients would be very large and may prove difficult to solve for certain solvers (a finding similar to non-normality). Removing one of these two vectors and replacing it with a vector that is not too close to being colinear to the other one will render the problem well-posed. 

In a different area, here is a very nice mind map by Hon-dani on Manifold Based Signal Recovery:

and another one here. I am expecting another one from another reader of this blog at some point in time (wink, wink). (It is one of these rare instances where I am in a situation in which I cannot ask for permission to repost an item. I tried to ask for permission to repost this mindmap but the site is in japanese and the comment section seems to be closed unless I register on the japanese site. Since I do not read Japanese, I am stuck. If the author wants me to remove it, please e-mail me  and I'll do so in a heart beat).

A novAlign Centerel counter architecture, called Counter Braids, has recently been proposed for per-flow counting on high-speed links. Counter Braids has a layered structure and compresses the flow sizes as it counts. It has been shown that with a Maximum Likelihood (ML) decoding algorithm, the number of bits needed to store the size of a flow matches the entropy lower bound. As ML decoding is too complex to implement, an efficient message passing decoding algorithm has been proposed for practical purposes. The layers of Counter Braids are decoded sequentially, from the most significant to the least significant bits. In each layer, the message passing decoder solves a sparse signal recovery problem. In this paper we analyze the threshold dimensionality reduction rate (d-rate) of the message passing algorithm, and prove that it is correctly predicted by density evolution. Given a signal in Rn+ with n non-vanishing entries, we prove that one layer of Counter Braids can reduce its dimensionality by a factor 2.08 · \epsilon log (1/\epsilon) + O(). This essentially matches the rate for sparse signal recovery via L1 minimization, while keeping the overall complexity linear in n.

From the article:

Comparison with Compressed Sensing

Compressed sensing [5], [6] reduces below Nyquist rate the number of samples needed to recover sparse signals. In other words, it reduces the dimensionality of signal known to be sparse, using suitable non-adaptive projections.
Counter Braids, on the other hand, compresses a signal with decreasing digit entropy: it reduces the actual number of bits needed to store the signal and achieves the Shannon entropy lower bound. Interestingly, decoding each layer of CB also solves a dimensionality reduction problem. By reducing the number of counters in each layer, and assigning an appropriate number of bits per counter, CB achieves an overall reduction in bits. In this way, CB performs compression via dimensionality reduction.
and the conclusion:

We analyzed the achievable dimensionality reduction rate for a single-layer Counter Braids and found it to be very close to Donoho and Tanner threshold for non-negative signals with the L1 minimization recovery algorithm. Since the complexity of message-passing algorithm is essentially linear, it is a more efficient solution to non-negative sparse signal recovery than L1 minimization.

I also found two upcoming talks 

* MIT Laboratory for Information & Decisio Systems, November 20, 2008, 4:15p.m. - 5:15p.m., Building 32, Room D677,  Counter Braids: A novel counter architecture for network measurement by Yi Lu

Accurate flow-level measurement is highly desirable in networking products. As network speed increases, the direct approach of dynamically assigning per-flow counters requires a large amount of fast memory, which is prohibitively expensive. We observe that measurement with limited memory inherently benefits from source coding, and propose “Counter Braids”, an architecture that compresses as it counts. The counts are decompressed offline using a novel message passing algorithm, which also has implications for the area of compressed sensing.

Tuesday, November 18, 2008

CS: Compressed Sensing and Sub-Nyquist Sampling at DSP 2009

Thomas Blumensath just sent me this :
I am currently organising a special session on compressed sensing and sub-Nyquist sampling to be held at the 16th International Conference on Digital Signal Processing (DSP2009) in Santorini, Greece, July 5-7, 2009. In addition to invited papers, I also encourage general contributions from the wider community.

The call for papers can be found here:

More information on the conference is available here:

From the pdf:

Submissions are invited for a special session on Compressed Sensing and Sub-Nyquist Sampling to be held at the 16th International Conference on Digital Signal Processing (DSP2009) in Santorini, Greece, July 5-7, 2009.

For over 50 years, sampling theory has been focused around a result attributed to, among others, Nyquist and Shannon, stating that signals that are band-limited can be sampled and reconstructed using samples taken at a rate greater than twice the signal bandwidth. Recently, the focus has shifted to signals with structures other than band-limitedness. Two types of signal models stand out. In compressed sensing a sparse signals model is assumed, that is, signals are assumed to be well approximated using a small number of basis elements. The other model is the finite rate of innovations model, which is a parametric signal model that has a finite number of degrees of freedom in each time interval. Both of these models allow the derivation of sampling procedures and reconstruction algorithms. Importantly, the required number of samples is generally proportional to the information content of the signal which can be significantly lower than the number of samples required by the Nyquist limit. At the heart of current research into compressed sensing and other sub-Nyquist sampling methods is the interplay between signal models, sampling operators and reconstruction algorithms. Important questions revolve around the specification of accurate signal models that exploit signal structure, the development and study of sampling systems that preserve the relevant signal information and the derivation of fast and provably efficient algorithms to reconstruct the signal form the samples.
This special session on Compressed Sensing and Sub-Nyquist Sampling, to be held in Santorini during the 2009 16th International Conference on Digital Signal Processing (July 5 – July 7, 2009), is a dedicated session that aims to bring together a diverse range of current work in this fast developing and interdisciplinary field. Contributions are solicited in the broad area of compressed sensing and other sub-Nyquist sampling methods. Topics can include, but are not limited to:
  • Signal models
  • Finite rate of innovations models
  • Sparse models
  • Tree structures
  • Positivity constraints
  • Multiple measurement vectors
  • Measurement system properties and design
  • Recovery algorithms
  • Bayesian methods
  • Theoretical aspects
  • High dimensional geometry
  • Information Theory
  • Performance bounds
  • Analogue to information conversion
  • Finite rate of innovations sampling
  • Analogue compressed sensing
  • Applications
  • Imaging
  • Distributed sensing
  • Communications

Important Dates
Deadline for paper submission – Special Session February 15, 2009
Acceptance Notification March 15, 2009

This event was added to the calendar.

Credit photo: me. Greenland looking South from a plane at 30,000 feet.

Monday, November 17, 2008

CS: Real Time Elimination of Undersampling Artifacts in CE MRA using Variational Denoising on Graphics Hardware

The folks at Graz University of Technology in Austria have come up with a way to implement Chambolle's algorithm for CS-TV using an FPGU as featured in the following presentation entitled:

Real Time Elimination of Undersampling Artifacts in CE MRA using Variational Denoising on Graphics Hardware by Florian Knoll, Markus Unger, Franz Ebner, Rudolf Stollberger. The abstract reads:
Undersampled imaging strategies with state of the art reconstruction methods like compressed sensing, which reformulate image reconstruction as a constrained optimization problem, have the potential to deliver CE MRA images with high spatial and temporal resolution. The drawback of these algorithms is their long reconstruction time which makes it impossible to use them in clinical practice. This study demonstrates that these optimization problems can be solved on modern graphic processing units (GPUs), with computation times that allow real time imaging.
The paper is here and the associated video is here.

Here is the noteworthy excerpt from the conclusion of the paper:

The GPU implementation facilitates image reconstruction times that are far below the corresponding acquisition times.

I have added this hardware into the Compressive Hardware section.

Saturday, November 15, 2008

CS: This Week's Comments Round Up

I could not respond rapidily to some comments this week, so I decided to make it an entry. Being on the road, I cannot produce this morning's Saturday morning cartoon. However, it looks like the first Saturday morning cartoon was featured in Reddit which gave a boost to its stats (now at 669 views):
Creepy ? I am not sure, it definitely led one commenter to ask:
Can anyone point to a good introductory tutorial for compressed sensing?

If a video can get people to be interested in the matter, this is a good thing. Tutorials can be found in the Big Picture Introduction. The video was also featured in the Anti Memoirs blog. I don't read Farsi but I think the commenters liked it

A comment on yesterday's entry written by Anonymous: 

wow l1 constraints for portfolio optimization, brilliant! we actually managing money have never thought of that (and seen that it doesn't mean shit in practice). our feeble little minds just haven't been able to make the visionary leap required to go from the l2 to the l1!

l-1 regularization solvers are rooted in empirical results found in Geophysics in the 1970's. The fact that there was no theoretical justification for this technique did not deter researchers in the field to use it extensively. The portfolio paper as presented by Ingrid Daubechies in the video provides a firmer justification for using this technique. As she shows, it seems that most of the emphasis in the portofolio field has been on identifying the l_1 minimization as a way of obtaining positive solutions but little understanding seemed to have been gained from the fact that the solver usually also solved for the sparsest solutions. This is a key insight, it opens the door to techniques of compressed sensing for the acquisition of "sparse" positions. In effect, one wonders how the work on identifying good measurement matrices might provide an original investement strategy.

I realize that traders have a difficult job, especially the ones working for formerly non-FDIC regulated entities who may now be working under a different set of rules. However, this is a working paper for an international regulator (the ECB). As an average citizen, I sure would want the ECB or any other Feds to have an understanding on how to wisely spend worldwide tax-payer's money in different bail-outs. For one, we are beginning to see that commercial actors requiring bailouts is becoming non-sparse, and any insight on how to make this a sparser set would be a welcoming sight. 

In another entry, Matthieu asked the following:

I'd like to know a little bit more about "It may be a good scheme for the Arduino,wink, wink :-)".

What do you mean? For what purpose do you use it?

Have you ever been considering new OMAP3 based hardware like Beagleboard and Gumstix Overo?
I was hinting to a discussion I had with another reader. This idea is that we are beginning to see many different low-cost and easy to use platforms destined to acquire signals from the physical world. How can one use these platforms to implement low cost and easily implementable CS hardware ? The random lens imager is a perfect example of this thinking (even though it still requires some work on the calibration side). The Rice single pixel camera and its illumination based sister are a little expensive but I don't think that the Hyperspectral imager at Duke is that expensive. I'll definitely come back to this idea later.

Finally, in another entry, Lloyd Belleza
hi. u have here interesting cs topics. i am an IT student and trying to look for a thesis topic. if ever you could help me come up with a topic, i would surely appreciate it. thank u

If anybody has an idea, please contact Lloyd.

Credit Photo: me. View of Greenland from 30,000 feet.

Friday, November 14, 2008

CS: CS in Finance and in the Oil Business."You’ve got Terry Tao talking to geoscientists, what do you want?", Toad Classification and Some Seminars

I had seen some empirical use of the L_1 minimization technique used in finance before. However, the talk by Ingrid Daubechies on portfolio management ( the video is here in the flv format) and the l_1 (and l_p) regularization brings some sound arguments and theoretical discussion to the subject. The paper is entitled: Sparse and stable Markowitz portfolios, by Joshua Brodie, Ingrid Daubechies, Christine De Mol, Domenico Giannone and Ignace Loris, The abstract reads:

We consider the problem of portfolio selection within the classical Markowitz meanvariance framework, reformulated as a constrained least-squares regression problem. We propose to add to the objective function a penalty proportional to the sum of the absolute values of the portfolio weights. This penalty regularizes (stabilizes) the optimization problem, encourages sparse portfolios (i.e. portfolios with only few active positions), and allows to account for transaction costs. Our approach recovers as special cases the noshort-positions portfolios, but does allow for short positions in limited number. We implement this methodology on two benchmark data sets constructed by Fama and French. Using only a modest amount of training data, we construct portfolios whose out-of-sample performance, as measured by Sharpe ratio, is consistently and significantly better than that of the naive evenly-weighted portfolio which constitutes, as shown in recent literature, a very tough benchmark.

Because one of the author works at the European Central Bank, this working paper is hosted on  their site. One would hope that cross disciplinary investigations like this one are given some impetus in other directions such as enforcement and market understanding. Trust can only be enabled when the Feds have a real ability to understand and act on crises such as the one we are witnessing (see the previous entry Trust but Verify). While we are on the subject of interdisciplinary research I noticed the following in a recent IPAM brochure about Mark Green:

"He [Mark Green ] allows himself “one small success story”: The 2004 program, Multiscale Geometry and Analysis in High Dimensions, brought together some of the greatest minds in pure mathematics: UCLA math professor Terry Tao, his Caltech collaborator Emmanuel Candes and Justin Romberg (Georgia Tech). During the program, they produced breakthrough results in compressed sensing, an area that has concrete and emerging applications in imaging, astronomy, MRI, signal processing, and linear coding, among others. This research is now the subject of a Defense Advanced Research Projects Agency (DARPA) program with $25 million in funding, which Mark points out is more money that IPAM has spent in its entire existence. His favorite moment of the program came when the NSF panel – which was simultaneously conducting its first site visit – asked: Is this interdisciplinary work? Participant David Donoho (statistics, Stanford), another major contributor to the genesis of compressed sensing, reportedly exclaimed, “You’ve got Terry Tao talking to geoscientists, what do you want?

In the meantime, one can always apply the approach given in the portofio paper to the Million dollar portfolio competition organized by CNBC. If one were to have a matrix R made up of columns indexed with a discrete time scale and rows indexed on the choice of particular forex exchange pair, the R matrix would, unlike the case of this paper, be underdetermined and would resemble a measurement matrix found in compressed sensing. The solution to this problem would be a sparse vector representing the time when the investor has taken position on the Forex market. Since it would take forever to find out if that measurement matrix follows the restricted isometry property it is likely that CS will not give an edge to whoever is using it for the competition.

Here is a new thesis from MIT entitled Estimation of channelized features in geological media using sparsity constraint by Behnam Jafarpour. The abstract of the thesis reads:
In this thesis, a new approach is studied for inverse modeling of ill-posed problems with spatially continuous parameters that exhibit sparseness in an incoherent basis (e.g. a Fourier basis). The solution is constrained to be sparse in the transform domain and the dimension of the search space is effectively reduced to a low frequency subspace to improve estimation efficiency. The solution subspace is spanned by a subset of a discrete cosine transform (DCT) basis containing low-frequency elements. The methodology is related to compressive sensing, which is a recently introduced paradigm for estimation and perfect reconstruction of sparse signals from partial linear observations in an incoherent basis. The sparsity constraint is applied in the DCT domain and reconstruction of unknown DCT coefficients is carried out through incorporation of point measurements and prior knowledge in the spatial domain. The approach appears to be generally applicable for estimating spatially distributed parameters that are approximately sparse in a transformed domain such as DCT. The suitability of the proposed inversion framework is demonstrated through synthetic examples in characterization of hydrocarbon reservoirs.

Behnam Jafarpour is now a professor of petroleum engineering at Texas A&M and lists " Basis Pursuit and Sparsifying Transforms for Reservoir Parameterization" as one of his research interest.

I also found the following paper entitled:
Lightweight Acoustic Classification for Cane-Toad Monitoring by Thanh Dang and Nirupama Bulusu and Wen Hu. The abstract reads:

We propose a light weight algorithm to classify animals (eg. cane-toads) based on their vocalizations using sharply resource-constrained acoustic sensors. The goal is to enable fast in-network animal classification at the resource constrained sensors so as to minimize energy consumption. Each sensor randomly and independently samples a signal at a sub-Nyquist rate. The vocalization envelopes are extracted and matched with the original signal envelopes to find the best match. The computational complexity of the algorithm is O(n). It also requires less than 2KB of data memory. Our experiments on frog vocalizations show that our approach performs well, providing an accuracy of 90% and a miss rate of less than 5%.

Finally, here a list of upcoming Seminars found on the interwebs:
They are all listed on the calendar.

Credit Images:IMA video screenshot,  Wikipedia section on cane-toad,