## Page Views on Nuit Blanche since July 2010

My papers on ArXiv:
Approximating Kernels at the speed of Light
&
Imaging with Nature

LightOn
LinkedIn (727)|| on CrunchBase || our Blog
(2452)
(3967)
(1333)||
Attendant references pages:
The Advanced Matrix Factorization Jungle Page ||

Paris Machine Learning
@Meetup.com (8016 members) || @Archives

## Saturday, December 29, 2007

### Compressed Sensing: Random Features for Large-Scale Kernel Machines

In a previous, I mentioned the potential connection between compressed sensing and the visual cortex through a model that uses random projections. At the latest NIPS conference, it looks like we are beginning to see some convergence in Random Features as an Alternative to the Kernel Trick by Ali Rahimi and Benjamin Recht. The matlab code is on the main webpage. The abstract reads:

To accelerate the training of kernel machines, we propose to map the input data to a randomized low-dimensional feature space and then apply existing fast linear methods. The features are designed so that the inner products of the transformed data are approximately equal to those in the feature space of a user specified shift invariant kernel. We explore two sets of random features, provide convergence bounds on their ability to approximate various radial basis kernels, and show that in large-scale classification and regression tasks linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel
machines.
Thanks Jort. So now the classification mechanism is actually simplified through the use of random projections. This is thought provoking.

## Friday, December 28, 2007

### An opportunity for Opportunity ?

After some additional trajectory analysis, there is now a 1 to 25 chance of the 2007 WD5 meteorite hitting Mars on January 30th. If it hits, it will be a 30 megaton impact in a region north of where Opportunity stands. The last time we had an occurrence like this one was when Comet Shoemaker-Levy hit Jupiter in 1994. Many different telescopes were watching the impact then. But this time, it will be the first time in History that one of our robot sees an impact from the point of the view of the planet being hit. wow. Much can be learned about Mars atmosphere and even geology from such an event that it is mind boggling.

On a different note, this news is really possible because of one man: Gene Shoemaker. Gene is one of the discoverer of the Shoemaker-Levy comet that got people to realize the devastation Earth could sustain if it were to be hit by a similar object. His discovery led to the creation of the Near-Earth Object division at NASA. Gene actually trained to be the first scientist on the Moon and lobbied hard within NASA so that an Apollo astronaut would have the right qualification to pick up the right rocks. He was disqualified for health reasons but his lobbying paid off and Harrison Schmitt (a geologist) got to take that slot on Apollo 17. Gene died in a car accident in 1994. On the one hand, you don't get to walk on the Moon, on the other hand, your legacy extends far beyond the human exploration of space: It makes Space relevant to our history and maybe even our survival. This was not a bad deal after all.

## Thursday, December 27, 2007

### Building a Roaming Dinosaur: Why is Policy Learning Not Used ?

Dubai is going to host a Jurassic Park. It is about time: the current display of Animatronics are very much underwhelming and certainly do not yield the type of magic moment as displayed by Laura Dern's face in Jurassic Park. Yet, all over the world, kids and families line up to pay from $10 to$100 for the privilege of being wowed. The most interesting feature of the Dubai 'Resteless Planet' undertaking will be the claimed ability for the dinosaurs to be roaming. This is no small feat as none of the current animatronics are able to do that. I have a keen interest in this as you probably have noticed from the different entries on muscles, scaling and various autonomous robotics undertaking.

So over the years, I have kept an eye on the current understanding of dinosaur gait. Examples on the web can be found here and here. A sentence seems to be pretty much a good summary:

When the American Museum of Natural History wanted to create a digital walking Tyrannosaurus rex for a new dinosaur exhibit, it turned to dinosaur locomotion experts John Hutchinson and Stephen Gatesy for guidance.

The pair found the process humbling.

With powerful computers and sophisticated modeling software, animators can take a pile of digital bones and move them in any way they want. This part is easy; but choosing the most likely motion from the myriad of possibilities proved difficult......

The researchers think that one way to narrow down the possibilities of dinosaur movement is to use more rigorous physical constraints in computer models. These constraints fall into two broad categories: kinematic (motion-based) and kinetic (force-based). One simple kinematic constraint, for example, is that the ankle and knee cannot bend backwards.
So in short, it is one thing to know the skeleton, it is another one to devise how the movement goes. Yet, the current methods devised to figure gait are relying on not so sophisticated methods. As it turns out, we have a similar problem in robotics and machine learning. Because robots are becoming increasingly complex, there needs to be new methods of collecting data and summarizing them in what are called 'policies'. New methods are able to learn behavior for robots even though they have many degrees of freedom though some type of supervised learning. Some of the techniques include Non-Negative Matrix Factorization (NMF), diffusion processes and some of the techniques we tried in our unsuccesful attempt in DARPA's race in the future.

[ Update: a dinosaur finding showing preserved organic parts shows us that basing our intuition on just bones is not enough. It looks as though dinosaurs may have been much larger. Ona different note, it is one thing to model human behavior (and by extension dinosaur behavior) using Differential equations,but the problem you are trying to solve is 'given a certain behavior, how can it fit the model set forth by the differential equations?'. This is what is called an inverse problem and while a set of differential equations may give you a sentiment that you are modeling everything right, they generally are simplification of the real joint behavior and their interaction with the environment (soil,...). In short, to give a sense of realness, you have to go beyond a description with differential equations alone, for these reasons alone. For this reason, building a real roaming dinosaur need the type of undertaking mentioned above in this entry ]

## Wednesday, December 26, 2007

### Compressed sensing: CS for networked data, Diffusion wavelets and Analog CS measurements.

The Special Issue on Compressive Sampling of the IEEE Signal Processing Magazine will also feature Compressed sensing for networked data by Jarvis Haupt, Waheed Bajwa, Mike Rabbat, and Robert Nowak. In the article, one can learn how one can exploit the connectivity in the network "as the key to obtaining effective sparsifying transformations for networked data". Hence because Network traffic patterns are sparse in the diffusion wavelet framework, the idea is to use this framework to enable Compressed Sensing measurements from networks.

They also discuss the specific schemes one could implement with a particular attention to data from wireless networks and the analog production of the compressed sensing data through the use of the network. Very nice.

## Tuesday, December 25, 2007

### Compressed Sensing: How could Hardware implementations become disruptive technologies ?

I am still wondering how technology developed with Compressed Sensing could be construed at Disruptive Technology, here is reminder of what disruptive technology is:

"...Disruptive technologies do not destroy existing market leaders overnight. They do not get adopted by the entire market at the same time. They do not initially seem to be "better" products (in fact, in the early going, they are often distinctly "worse.") They are not initially a viable option for mainstream users. They do not win head-to-head feature tests. Initially, they do not even seem to be a threat.

Disruptive technologies take advantage of a new manufacturing/business process or technology to provide a cheaper, more convenient, simpler solution that meets the needs of the low end of the market. Low-end users don't need all the features in the Incumbent's product, so they rapidly adopt the simpler solution. Meanwhile, the Incumbent canvasses its mainstream customers, reassures itself that they want the feature-rich products, and dismisses the Disruptor as a niche player in an undesirable market segment. The Incumbent may dabble with the new technology or process, but only in service of its existing business model.

Then the Disruptor improves its products, adding more features while keeping the convenience and low cost. Now the product appeals to more mainstream users, who adopt it not because it's "better" but because it's simpler and cheaper. Seeing this, the Incumbent continues adding ever more features and functionality to its core product to try to maintain its value proposition for higher end customers. And so on. Eventually, the Incumbent's product overshoots the needs of the mass market, the Disruptor grabs the mainstream customers, and, lo and behold, the technology has been "disrupted."

In a previous entry, I was wondering How could Compressed Sensing be seen as a disruptive technology ?
and used some guidance given by Todd Proebsting when he was giving a talk on innovation in programming languages at LL2.

First of all, Compressed Sensing is really a technique not a technology, so I will try to address the technologies that are beginning to use CS at and see if those can be considered disruptive technologies.

1. In the case of MRI, Compressed Sensing should clearly not be labeled as disruptive as it has clearly been adopted by the large MRI companies to increase the throughput of their machines.
2. In the Rice single pixel camera case, there is currently a clear advantage in the realm of imaging parts of the light spectrum where the development many sensitive pixels is clearly too expensive. The other interesting aspect of the single pixel imager is the ability to remove the compression scheme which takes most of the battery. But in an age where batteries are seen as the technology to improve, should an imaging hardware be developed just for the purpose of being less power hungry ?
3. In the case of hyperspectral imagers, any current hyperspectral imager cost between 70K$to 100K$. A low cost product while not being very efficient (accurate in certain bands) would clearly make the CS hyperspectral imager as a disruptive technology. One of the major underlying reason as to why CS hyperspectral imager have a reasonable possibility of creating a niche market is the ability to provide a large field of view and dynamic range compared to current technology.

For each of these technologies I will try to address the following in the future:

• Opportunity: what’s the issue?
• Current solutions: what’s done now
• Proposal: New disruptive technology
• Disadvantages why some (many?) will scoff
• Unmet needs benefits to adopters:
I am sure most of you can make out what the "many will scoff" will be made off ?

### Monday Morning Algorithm 8: An Example of Compressed Measurements Speeding Up Diffusion Maps

In the previous installment of the Monday Morning Algorithm, we featured the diffusion method. I also mentioned that one could use compressed sensing and obtain the same result: i.e the ability to sort several high dimensional images because only one parameter is being varied. When going through this exercise, one can note that of the most expensive operation is the inter-distance measured between all the elements of the set. If every picture is 10 MP, one has to evaluate the difference between elements made of ten million numbers. In Compressed Sensing, one learns that projecting a large image on the right kind of random projections, one can keep the amount of information in the original image. We are going to reuse our last algorithm but we will be projecting the original images of Buzz Lightyear on a smaller space. We will be using a Gaussian ensemble and will be projecting from an image of dimension 57600 down to 100 Compressed Measurements. Let us note that this not an efficient operation unless we have directly access to a compressed measurements or if one does extra operations with the compressed measurements.

clear
% Stephane Lafon, “Diffusion maps and geometric harmonics”,
% Ph.D. dissertation, Yale University (May
% 2004)
% written by Cable Kurwitz and Igor Carron
% This algorithm shows how random projections
% used in compressed sensing provides the same
% results as the original images.
%
% Number of Random Projections
number_RP =100;
% get all the files in jpg
d = dir('buzz*.jpg')
ntotal = size(d,1)
for i=1:ntotal
d(i).name
end
s=size(Img,1);
w=size(Img,2);
hd=size(Img,3);
swhd=s*w*hd;
%reshape these images into one vector each
for ii=1:ntotal
b1(:,ii) = double(reshape(Img, swhd,1));
end
%
% Compressed sensing part
sa=randn(number_RP,swhd);
for i=1:swhd
sa(:,i)=sa(:,i)./norm(sa(:,i));
end
b=sa*b1;
%b=b1;
%
%
for i=1:ntotal
for j=1:ntotal
D1(i,j)=norm(b(:,i)-b(:,j),2);
end
end
%
n1=ntotal;
D_sum=0;
for l = 1:n1;
D_sum = D_sum + min(nonzeros(D1(l,:)));
end;
epsilon = D_sum/n1/10; %
%
% Step 1
%
kernel_1 = exp(-(D1./epsilon).^1);
%
% Step 2
%
one_V = ones(1,n1);
%
% Step 3
%
p = kernel_1*one_V';
kernel_2 = kernel_1./((p*p').^1);
%
% Step 4
%
v = sqrt(kernel_2*one_V');
%
% Step 5
%
K = kernel_2./(v*v');
%
% Step 6
%
%% Singular Value Decomposition
[U S V] = svd(K);
% Normalization
for i=1:ntotal-1
phi(:,i) = U(:,i)./U(:,1); %
end
eig_funct = phi(:,1:6);
%
% The eigenvalues of \delta are approximated by those of K, and its
% eigenfunctions Ái are approximated by phi(:,i) = U(:; i):=U(:; 1)
%
vv=eig(K);

% Initial order of Buzz images
figure(1)
for i=1:ntotal
subplot(3,4,i)
imshow(d(i).name)
end

X = vv(ntotal-1).*phi(:,2);
% sorting the Buzz images using the first
% eigenfunctions the Laplace Beltrami operator
[Y,I] = sort(X);
% d(I).name
%
figure(2)
for i=1:ntotal
subplot(3,4,i)
imshow(d(I(i)).name)
end
%
figure(3)
plot(vv(ntotal-1).*phi(:,2),'.')
title(' Buzz Lightyear')
xlabel('Index')
ylabel('Lambda_1 phi_1')
%
figure(4)
plot(vv(ntotal-1).*phi(:,2),vv(ntotal-2).*phi(:,3),'.')
title(' Buzz Lightyear')
xlabel('Lambda_1 phi_1')
ylabel('Lambda_2 phi_2')
%
figure(5)
plot(eig(K),'*')
xlabel('Index of eigenvalue')
ylabel('Value of eigenvalue')
title(' Eingenvalues of K')
vv=eig(K);
%
It looks like 100 projections are enough to recover the same results as the original analysis. One can note that this process of figuring out what is the number of random projections needed would require the type of analysis evaluating the intrinsic dimensionality of the set as mentioned here by the folks at Rice.

All the Monday Morning Algorithms (and attendant codes/files) are listed here.

## Sunday, December 23, 2007

### Compressed Sensing: Fast Compressive Sampling with Structurally Random Matrices

Thong Do mentions that the paper presented here is already on the Rice website. Thong Do, Trac Tran and Lu Gan are co-authors of the fascinating Fast Compressive Sampling with Structurally Random Matrices which seems to lead to highly sparse measurement matrices. The abstract reads:

This paper presents a novel framework of fast and efficient compressive sampling based on the new concept of structurally random matrices. The proposed framework provides four important features. (i) It is universal with a variety of sparse signals. (ii) The number of measurements required for exact reconstruction is nearly optimal. (iii) It has very low complexity and fast computation based on block processing and linear filtering. (iv) It is developed on the provable mathematical model from which we are able to quantify trade-offs among streaming capability, computation/memory requirement and quality of reconstruction. All currently existing methods only have at most three out of these four highly desired features. Simulation results with several interesting structurally random matrices under various practical settings are also presented to verify the validity of the theory as well as to illustrate the promising potential of the proposed framework.
I cannot wait to see how it will fare with regards to reconstruction capabilities, i.e the ability of having very sparse measurement matrices and still allow for few measurements and good reconstruction. I need to look into this deeper.

### Compressed Sensing: MCA news and Sublinear Recovery of Sparse Wavelet Signals

Coming back from NIPS 07, Jort Gemmeke mentioned the following two items:

Morphological Component Analysis : an Adaptive Thresholding Strategy by Jerome Bobin, Jean-Luc Starck, Jalal Fadili, Yassir Moudden, and David Donoho, the abstract reads:
In a recent paper, a method called MCA (Morphological Component Analysis) has been proposed to separate the texture from the natural part in images. MCA relies on an iterative thresholding algorithm, using a threshold which decreases linearly towards zero along the iterations. This paper shows how the MCA convergence can be drastically improved using the mutual incoherence of the dictionaries associated to the different components. This modified MCA algorithm is then compared to Basis Pursuit, and experiments show that MCA and BP solutions are similar in terms of sparsity, as measured by l1 norm, but MCA is much faster and gives us the possibility to handle large scale data sets.
The MOM/matched filtering and thresholding bears some similarity with another paper mentioned in this entry from some of the same co-authors. I haven't looked at it deeper yet. Always on the same topic of Morphological Component Analysis, Jort also mentions the release of the MCA 802 C++ toolbox from Jalal Fadili's webpage.

Toward a dedicated Compressed Sensing based on the signal signature? It looks like Ray Maleh and Anna Gilbert are beginning to do that in Sublinear recovery of sparse wavelet signals. The abstract reads:
There are two main classes of decoding algorithms for “compressed sensing,” those which run time time polynomial in the signal length and those which use sublinear resources. Most of the sublinear algorithms focus on signals which are compressible in either the Euclidean domain or the Fourier domain. Unfortunately, most practical signals are not sparse in either one of these domains. Instead, they are sparse (or nearly so) in the Haar wavelet system. We present a modified sublinear recovery algorithm which utilizes the recursive structure of Reed-Muller codes to recover a wavelet-sparse signal from a small set of pseudo-random measurements. We also discuss an implementation of the algorithm to illustrate proof-of-concept and empirical analysis.

## Saturday, December 22, 2007

### Compressed Sensing, Compressive Sampling: A Customized Search Engine

One of the reason I link to the web pages of people involved in Compressed Sensing is twofold. It is designed to:
• see what their other interests are and,
• help Google customized search engine focus on the subject of Compressed Sensing.
If you use the search box below (also on the right side) you will specifically search through all the pages of these researchers and the webpages they link to. To the students writing papers on the subject, please make a web page (even a tiny placeholder), it will go a long toward establishing an hopefully long lasting presence on the web in the future.

## Friday, December 21, 2007

### Compressed Sensing: Intrinsic Dimensionality of a Dataset, StOMP for large scale problems and 2007 WD5 is aiming at Mars

In Random Projections for Manifold Learning, Chinmay Hegde, Mike Wakin, and Richard Baraniuk try to devise a means by which one can determined the intrinsic dimensionality of a dataset. This is important as it has a direct impact on the size of the measurement matrix that one wants to be as small as possible. It uses the Grassberger-Procaccia algorithm to estimate the intrinsic dimension of the manifold using random projections of the original dataset. The abstract reads:

We derive theoretical bounds on the performance of manifold learning algorithms, given access to a small number of random projections of the input dataset. We prove that with the number of projections only logarithmic in the size of the original space, we may reliably learn the structure of the nonlinear manifold, as compared to performing conventional manifold learning on the full dataset.

In the NIPS version of that technical report, the abstract reads:

We propose a novel method for linear dimensionality reduction of manifold modeled
data. First, we show that with a small number M of random projections of sample points in RN belonging to an unknown K-dimensional Euclidean manifold, the intrinsic dimension (ID) of the sample set can be estimated to high accuracy. Second, we rigorously prove that using only this set of random projections, we can estimate the structure of the underlying manifold. In both cases, the number of random projections required is linear in K and logarithmic in N, meaning that K less than M much less than N. To handle practical situations, we develop a greedy algorithm to estimate the smallest size of the projection space required to perform manifold learning. Our method is particularly relevant in distributed sensing systems and leads to significant potential savings in data acquisition, storage and transmission costs.
in the paper the algorithm is given in an explicit way and requires ISOMAP. These two papers are important for Machine Learning as they now give a faster mean of estimating the underlying manifold using random projections.

David Donoho, Yaakov Tsaig, Iddo Drori, and Jean-Luc Starck, Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit, the abstract reads:

Finding the sparsest solution to underdetermined systems of linear equations y = x is NP-hard in general. We show here that for systems with ‘typical’/‘random’ , a good approximation to the sparsest solution is obtained by applying a fixed number of standard operations from linear algebra. Our proposal, Stagewise Orthogonal Matching Pursuit (StOMP), successively transforms the signal into a negligible residual. Starting with initial residual r0 = y, at the s-th stage it forms the ‘matched filter’ T rs−1, identifies all coordinates with amplitudes exceeding a specially-chosen threshold, solves a least-squares problem using the selected coordinates, and subtracts the least squares fit, producing a new residual. After a fixed number of stages (e.g. 10), it stops. In contrast to Orthogonal Matching Pursuit (OMP), many coefficients can enter the model at each stage in StOMP while only one enters per stage in OMP; and StOMPtakes a fixed number of stages (e.g. 10), while OMP can take many (e.g. n). StOMP runs much faster than competing proposals for sparse solutions, such as l1 minimization and OMP, and so is attractive for solving large-scale problems.
We use phase diagrams to compare algorithm performance. The problem of recovering a k-sparse vector x0 from (y, ) where  is random n × N and y = x0 is represented by a point (n/N, k/n) in this diagram; here the interesting range is k less than n less than N. For n large, StOMP correctly recovers (an approximation to) the sparsest solution of y = x over a region of the sparsity/indeterminacy plane comparable to the region where l1 minimization is successful. In fact, StOMPoutperforms both l1 minimization and OMP for extremely underdetermined problems. We rigorously derive a conditioned Gaussian distribution for the matched filtering coefficients at each stage of the procedure and rigorously establish a large-system limit for the performance variables of StOMP. We precisely calculate large-sample phase transitions; these provide asymptotically precise limits on the number of samples needed for approximate recovery of a sparse vector by StOMP. We give numerical examples showing that StOMPrapidly and reliably finds sparse solutions in compressed sensing, decoding of error-correcting codes, and overcomplete representation.

Always pay attention when Dave Donoho says something. I also found the qualifying exams slide of Deanna Needell interesting, especially slide 116/145 an information I did not know. I am also impatient to see what Thong Do a student of Trac Tran is talking about in this seminar:
In this work, we propose a novel fast compressive sampling algorithm that is based on a new concept of structurally random ensembles. The new proposed algorithm not only enables the system to compressively sample signals very fast but also requires very low complexity with other highly desired features for practical applications such as: integer friendliness, streaming capability. Surprisingly, this novel algorithm is also proposed to solve a closely related problem of high dimensional reduction in theoretical computer sciences.

I mentioned earlier the Titan flyby of today, but guess what kids, this one trumps it: 2007 WD5, an asteroid is going to hit Mars where we currently have five robots ( Spirit and Opportunity, Odyssey, Mars Express and the Mars Reconnaissance Orbiter). Woohoo, where is my popcorn ?.

## Thursday, December 20, 2007

### Compressed Sensing: Compressive sampling vs conventional imaging, the Birth of the Everlasting Dataset

I just found this presentation entitled Compressive Sensing: A New Framework for Imaging by Richard Baraniuk, Justin Romberg and Robert Nowak which seems to be a tutorial presentation for ICIP 2006. Much of this tutorial appear in previous presentations except for the presentation part starting at slide 93 to slide 120, I have found this part of the presentation very refreshing as I am beginning to see the connection between results of spectral theory of Random Matrices and Compressed Sensing. At some point the subject will merge with this one but I am not sure how for the time being.

Another aspect of the image reconstruction, which I had not seen before, is also explicitly presented. Once you have retrieved random measurements/projections, Compressed Sensing techniques enable you to reconstruct an image with different bases. If, in 20 years, we have better bases, the same measurements can be used and will reproduce a better image than the one we can currently reconstruct. Case in point, if we had done compressed sensing in the mid-1990's, we would have reconstructed images with edges using wavelets bases. Then curvelets came up , using the same measurements we would have been able to get a clearer imageright when Emmanuel Candes discovered the basis!

The underlying reason being that curvelets provide better approximation properties to edges, hence providing a sparse representation of images (most natural images are made of edges).

The presentation (excerpted from reference [1]) then goes on to show a comparison between recovery using Compressed Sensing with different bases and a normal pixel sampling.

The wedgelets are coming from this implementation. The conclusions are nice summary:

"Compressive sampling techniques can yield accurate reconstructions even when the signal dimension greatly exceeds the number of samples, and even when the samples themselves are contaminated with significant levels of noise. CS can be advantageous in noisy imaging problems if the underlying image is highly compressible or if the SNR is sufficiently large"
I can see how there is going to be similar activity in the field of data mining where the underlying data dimension is larger than 2 and for which there is not currently a reasonable basis to decompose datasets on. Images are 10 MB large at most and get reduced to about 1 MB jpeg (this gives an idea of the compression using the DCT). Compressed Sensing with current bases yield about 4 MB of random measurements a reduction of about 60 %. With this in mind, one wonders how many random measurements would be needed to eventually provide a good reconstruction on different types of datasets. Does taking 47 random measurements of Shakespeare's Hamlet provide enough information to reconstruct it ? Maybe not today, if not, when ?

Reference: [1] Jarvis Haupt and Robert Nowak, Compressive sampling vs conventional imaging

## Wednesday, December 19, 2007

### Compressed Sensing Thoughts, Diffusion maps methods for Brain Imaging and for Computer Generated Graphics

Muthu Muthukrishnan is wondering aloud about the substring matching problem and how it can be used to find the reconstruction of a signal made of subsignals that are sparse using the compressed sensing framework.

I also found items of interest and relevant to dimensionality reduction applied to different domains. The first one is an article on Recovering cerebral white matter structures with Spectral Clustering of Diffusion MRI Data by Demian Wassermann, Maxime Descoteaux and Rachid Deriche.

White matter fiber clustering allows to get insight about anatomical structures in order to generate atlases, perform clear visualizations and compute statistics across subjects, all important and current neuroimaging problems. In this work, we present a Diffusion Maps clustering method applied to diffusion MRI in order to cluster and segment complex white matter fiber bundles. It is well-known that Diffusion Tensor Imaging (DTI) is restricted in complex fiber regions with crossings and this is why recent High Angular Resolution Diffusion Imaging (HARDI) such has Q-Ball Imaging (QBI) have been introduced to overcome these limitations. QBI reconstructs the diffusion orientation distribution function (ODF), a spherical function that has its maxima agreeing with the underlying fiber populations. In this paper, we introduce the usage of the Diffusion Maps technique and show how it can be used to directly cluster set of fiber tracts, that could be obtained through a streamline tractography for instance, and how it can also help in segmenting fields of ODF images, obtained through a linear and regularized ODF estimation algorithm based on a spherical harmonics representation of the Q-Ball data. We first show the advantage of using Diffusion Maps clustering over classical methods such as N-Cuts and Laplacian Eigenmaps in both cases. In particular, our Diffusion Maps requires a smaller number of hypothesis from the input data, reduces the number of artifacts in fiber tract clustering and ODF image segmentation and automatically exhibits the number of clusters in both cases by using an adaptive scale-space parameter. We also show that our ODF Diffusion Maps clustering can reproduce published results using the diffusion tensor (DT) clustering with N-Cuts on simple synthetic images without crossings. On more complex data with crossings, we show that our ODF-based method succeeds to separate fiber bundles and crossing regions whereas the DT-based methods generate artifacts and exhibit wrong number of clusters. Finally, we illustrate the potential of our approach on a real brain dataset where we successfully segment well-known fiber bundles.
It is pretty long but worth reading.

Also, Gabriel Peyre makes available his course note and some matlab programs to help out on the topic of Data Processing Using Manifold Methods. This course seems specific to the use of dimensionality reduction technique like diffusion maps to do some manipulation of 3-D graphics. The presentation of his course reads:

This course is an introduction to the computational theory of manifolds. Manifold models arise in various area of mathematics, image processing, data mining or computer science. Surfaces of arbitrary dimension can be used to model non-linear datasets that one encounters in modern data processing. Numerical methods allow to exploit this geometric non-linear prior in order to extract relevant information from the data. These methods include in particular local differential computations (related to the Laplacian operator and its variants) and global distance methods (related to geodesic computations). In this course, you will learn how to perform differential and geodesic computations on images, volumes, surfaces and high dimensional graphs.

The course includes a set of Matlab experiments. These experiments give an overview of various tasks in computer vision, image processing, learning theory and mesh processing. This includes computation of shortest paths, Voronoi segmentations, geodesic Delaunay triangulations, surface flattening, dimensionality reduction and mesh processing.

One can notice that the flattening of meshes has a parallel with methods like the MVU, except that in the MVU case, additional interdistance constraints have to be met.

Then again in CG, people are ok with "it's good enough" :-)

## Tuesday, December 18, 2007

### Monday Morning Algorithm 7: Diffusion Maps for Dimensionality Reduction and Manifold Parametrization.

There aren't that many dimensionality reduction approaches that also try to provide some parametrization of the manifold. The diffusion map approach is one of them. It tries to approximate a Laplace-Beltrami operator on the manifold. This is computed through the algorithm of page 33 of Stephane Lafon's dissertation [1]. Extensions can be found here and interactive animated examples can be found here. For today's algorithm, we will use images of Buzz Lightyear rotating around a single axis. The file is here. Obviously this is an unoptimized code. For better implementation of diffusion maps and other dimensionality reduction technique, this site is a good choice .

What do we do here ? we first compute the inter distance between each frame, then plug this into a kernel which eventually provides an idea of how connected each frame is to each other. Using a thermal heat transfer analogy, a frame is a node and the new distance between any one node to another one is inversely proportional the number of nodes between the two nodes. The new metric induced by this new distance enables a new parametrization of the manifold. In our case, the manifold is made of one object rotating around one axis. Parametrization of the underlying manifold should yield a one parameter family since only one dimension is being varied between each frame (one rotation). In order to show this, we use the first eigenfunction of the Laplace Beltrami and sort each frame according to its projection on that eigenfunction. The sorting along that dimension permits one to sort Buzz Lightyear rotating continuously from one end to the other.

clear
% Stephane Lafon, “Diffusion maps and geometric harmonics”,
% Ph.D. dissertation, Yale University (May
% 2004)
% written by Cable Kurwitz and Igor Carron
%
% get all the files in jpg
d = dir('buzz*.jpg')
ntotal = size(d,1)
for i=1:ntotal
d(i).name
end
s=size(Img,1);
w=size(Img,2);
hd=size(Img,3);
swhd=s*w*hd;
%reshape these images into one vector each
for ii=1:ntotal
b1(:,ii) = double(reshape(Img, swhd,1));
end
%
% Compressed sensing part
%sa=randn(20,swhd);
%b=sa*b1;
b=b1;
%
%
for i=1:ntotal
for j=1:ntotal
D1(i,j)=norm(b(:,i)-b(:,j),2);
end
end
%
n1=ntotal;
D_sum=0;
for l = 1:n1;
D_sum = D_sum + min(nonzeros(D1(l,:)));
end;
epsilon = D_sum/n1/10; %
%
% Step 1
%
kernel_1 = exp(-(D1./epsilon).^1);
%
% Step 2
%
one_V = ones(1,n1);
%
% Step 3
%
p = kernel_1*one_V';
kernel_2 = kernel_1./((p*p').^1);
%
% Step 4
%
v = sqrt(kernel_2*one_V');
%
% Step 5
%
K = kernel_2./(v*v');
%
% Step 6
%
%% Singular Value Decomposition
[U S V] = svd(K);
% Normalization
for i=1:ntotal-1
phi(:,i) = U(:,i)./U(:,1); %
end
eig_funct = phi(:,1:6);
%
% The eigenvalues of \delta are approximated by those of K, and its
% eigenfunctions Ái are approximated by phi(:,i) = U(:; i):=U(:; 1)
%
vv=eig(K);

% Initial order of Buzz images
figure(1)
for i=1:ntotal
subplot(3,4,i)
imshow(d(i).name)
end

X = vv(ntotal-1).*phi(:,2);
% sorting the Buzz images using the first
% eigenfunctions the Laplace Beltrami operator
[Y,I] = sort(X);
% d(I).name
%
figure(2)
for i=1:ntotal
subplot(3,4,i)
imshow(d(I(i)).name)
end
%
figure(3)
plot(vv(ntotal-1).*phi(:,2),'.')
title(' Buzz Lightyear')
xlabel('Index')
ylabel('Lambda_1 phi_1')
%
figure(4)
plot(vv(ntotal-1).*phi(:,2),vv(ntotal-2).*phi(:,3),'.')
title(' Buzz Lightyear')
xlabel('Lambda_1 phi_1')
ylabel('Lambda_2 phi_2')
%
figure(5)
plot(eig(K),'*')
xlabel('Index of eigenvalue')
ylabel('Value of eigenvalue')
title(' Eingenvalues of K')
vv=eig(K);
%

Please note the ability to test the same algorithm by reducing first the frames through random projections (one needs to delete some % ). It is pretty obvious that one would need a robust metric in the first place instead of L2, one could foresee the use of a specifically designed metric (Improving Embeddings by Flexible Exploitation of Side Information) on top of the reduction operated with the random projections. The specifically designed metric would alleviate the need to come up with the ad-hoc epsilon of this method.

The diffusion mapping has been taken to another level by Mauro Maggioni with diffusion wavelets and Sridhar Mahadevan with Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions.

Reference: [1] S. Lafon, “Diffusion maps and geometric harmonics”, Ph.D. dissertation, Yale University (May 2004)

## Thursday, December 13, 2007

### Compressed Sensing: Single-Pixel Imaging via Compressive Sampling, Stable Manifold Embedding

In an upcoming issue of the IEEE Signal Processing Magazine, we will be treated with a Special Issue on Compressive Sampling. It is scheduled to come out in March 2008. In it, is a new well written review of the trade studies performed on the current single pixel camera development using Compressed Sensing at Rice. The review is Single-Pixel Imaging via Compressive Sampling and was written by Marco Duarte, Mark Davenport, Dharmpal Takhar, Jason Laska, Ting Sun, Kevin Kelly, and Richard Baraniuk. The abstract reads:
Humans are visual animals, and imaging sensors that extend our reach – cameras – have improved dramatically in recent times thanks to the introduction of CCD and CMOS digital technology. Consumer digital cameras in the mega-pixel range are now ubiquitous thanks to the happy coincidence that the semiconductor material of choice for large-scale electronics integration (silicon) also happens to readily convert photons at visual wavelengths into electrons. On the contrary, imaging at wavelengths where silicon is blind is considerably more complicated, bulky, and expensive. Thus, for comparable resolution, a $500 digital camera for the visible becomes a$50,000 camera for the infrared. In this paper, we present a new approach to building simpler, smaller, and cheaper digital cameras that can operate efficiently across a much broader spectral range than conventional silicon-based cameras. Our approach fuses a new camera architecture based on a digital micromirror device with the new mathematical theory and algorithms of compressive sampling . CS combines sampling and compression into a single nonadaptive linear measurement process [1–4]. Rather than measuring pixel samples of the scene under view, we measure inner products between the scene and a set of test functions. Interestingly, random test functions play a key role, making each measurement a random sum of pixel values taken across the entire image. When the scene under view is compressible by an algorithm like JPEG or JPEG2000, the CS theory enables us to stably reconstruct an image of the scene from fewer measurements than the number of reconstructed pixels. In this manner we achieve sub-Nyquist image acquisition. Our “single-pixel” CS camera architecture is basically an optical computer (comprising a DMD, two lenses, a single photon detector, and an analog-to-digital (A/D) converter) that computes random linear measurements of the scene under view. The image is then recovered or processed from the measurements by a digital computer. The camera design reduces the required size, complexity, and cost of the photon detector array down to a single unit, which enables the use of exotic detectors that would be impossible in a conventional digital camera. The random CS measurements also enable a tradeoff between space and time during image acquisition. Finally, since the camera compresses as it images, it has the capability to efficiently and scalably handle high-dimensional data sets from applications like video and hyperspectral imaging. This article is organized as follows. After describing the hardware, theory, and algorithms of the single-pixel camera in detail, we analyze its theoretical and practical performance and compare it to more conventional cameras based on pixel arrays and raster scanning. We also explain how the camera is information scalable in that its random measurements can be used to directly perform simple image processing tasks, such as target classification, without first reconstructing the underlying imagery. We conclude with a review of related camera architectures and a discussion of ongoing and future work.
When I last explained the single pixel camera, I didn't realize the first option is also called the raster scan. The noteworthy result of the trade study compares different current scanning capabilities and that provided by compressed sensing.They look at dynamic range, quantization error, photon counting noise and remind us of the fact that with CS, one can obtain much more light on the sensitive detector (pixel) than conventional cameras and that it should help in low light conditions as presented in the IMA presentation (Multiscale Geometric Analysis).

X-rays could be implemented the same way if the DMD would be able to reflect at these wavelengths, for instance by using a well thought multilayer system for instance. They also give a good summary of the other hardware technologies implementing Compressed Sensing. It looks like I have not missed any (part 1, part 2, part 3, part 4)

In another good summary of the current investigations of Richard Baraniuk's group, I found the following slide most illuminating (as presented in Single-Pixel Imaging via Compressive Sampling,) explaining the geometric reason why L1 works

The same figure could be used to explained the relative better ability of non-convex Lp regularization techniques to do even better than L1. Because the Lp (for p less than 1) ball shape tend to be squashed compared to the L1 ball, it is easier to see that there are more randomly oriented N-M planes anchored on a K-face of the Lp ball which would not interesect with the ball. [Update: Muthu MuthuKrishnan just posted a nice picture of an Lp ball, the picture is due to Petros Boufounos ].

With regards to extension of compressed sensing or compressed sensing ideas, Richard Baraniuk and Mike Wakin [1] have these three illuminating slides where they shows how Random projections can reduce Manifolds of low dimension in high dimensional space or how the condition for which random projection can be used as dimensionality reduction techniques. It shows how CS provides better bounds than the older Whitney's embedding theorem.

The current drawback of this approach (random projections) is the inability to provide some type of parametrization on the manifold. Never say never.

Source: Rice Compressed Sensing Repository.
Reference : [1] R. G. Baraniuk and M. B. Wakin, Random Projections of Smooth Manifolds, to appear in Foundations of Computational Mathematics.

## Tuesday, December 11, 2007

### Complex Systems that Matter: an unfortunate Nuclear reactor shutdown and the Space Shuttle ECO malfunction

Soon after September 11th, I recall a short discussion with Alan Waltar who was the head of our department on the amount on some of the hate mail he was getting because of his nuclear engineering leadership. I recall specifically one e-mail where he was touted as a terrorist. I cannot say how extremely annoyed I was when I read it. Alan is a truly nice guy and is very passionate about a subject that is becoming increasingly relevant everyday. In America the Powerless: Facing our Nuclear Energy Dilemma he was mentioning the important issue on our dramatic reliance on Canadian and other foreign reactors to get medical isotopes ( isotopes of molybdenum, technetium ). Then I read this yesterday about a Nuclear plant shutdown brings hospital delays where it explains that because of a unduly conservative regulatory environment, the shutdown of one single Canadian ( Chalk River nuclear facility ) will delay significantly the wait for medical tests in the U.S. and other places. Let us not kid ourselves, it will have some dramatic consequences on some patients.
In line with dealing with very complex systems, here is the internal NASA e-mails from the director of Shuttle Safety at the Johnson Space Center and the Shuttle Program Manager show how they struggle to understand and address the failure of the engine cutoff (ECO) sensor malfunction (Rainer Gerhards has a more detailed view of the system) of the Space Shuttle (In NASA's lingo, the Shuttle is not just the plane, it is the plane and the boosters) that postponed last week's launch. This is not the first time we see these internal passionate discussions about such a detailed system, but as usual, I am always left with the bitter taste that if NASA were to provide some type of funding for what they call Red team reviews with an emphasis of funding people outside of the space business, analyses would be a lot more fundamental. This tiny investment would go a long way toward sustaining and improving the program.

If you think this blog provides a service, please support it by ordering through the Amazon - Nuit Blanche Reference Store

## Monday, December 10, 2007

### Compressed Sensing: Blog entries and website updates.

Terry Tao has an interesting entry on random matrices in Additive combinatorics and random matrices. At the same time, Muthu Muthukrishnan has an entry on Compressed Sensing, entitled: Compressed Sensing Enriched. Also, Rick Chartrand just tells me the article mentioned in the Restricted Isometry Properties and Nonconvex Compressive Sensing entry is now available here. Michael Grant, Stephen Boyd, and Yinyu Ye released a new build (Version 1.1, Build 577) for CVX ( Matlab Software for Disciplined Convex Programming), they also seem to have a new introduction page providing a good explanation of what the software does to newcomers.

And remember kids, Cassini will flyby Titan on Dec. 20 or ten days from now. It will fly above Titan at an altitude of 970 km above it. Not the 1 million km of this photo:

Photo credit: NASA/JPL/Space Science Institute

### Monday Morning Algorithm Part 6: The Fast Johnson-Lindenstrauss Transform

Last year, Nir Ailon and Bernard Chazelle described a new algorithm that improves the ability to perform the Johnson-Lindenstrauss transform. It is presented in Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform (slides (ppt) are here). The summary can be described in the Fast Johnson-Lindenstrauss Transform Lemma:

Fix any set X of n vectors in R^d, epsilon < 1, and p is either 1 or 2. Let phi be the FJLT. With probability at least 2/3, we have for all x element of X: (1 + epsilon) * alpha_p * ||x||_2 < || Phi x||_p < (1 + epsilon) * alpha_p * ||x||_2 This dimension reduction technique improves a classic algorithm by Johnson and Lindenstrauss from the mid 80’s. Their technique is the first to offer an asymptotic improvement, and has already been used in design of efficient algorithms for nearest neighbor searching and high dimensional linear algebraic numerical computations. A better technique has recently surfaced as well. A good description of this new technique can be found in this presentation by Edo Liberty.

I have implemented a version of it by following the instructions on the paper but I am not convinced I have done the right implementation as the results are not so sharp to me. I haven't had much time to debug it so here it is, if you find a mistake please let me know:

clear
% Fast Johnson-Lindenstrauss Transform
% as described by Nir Ailon and Bernard Chazelle
% in their paper:
% "Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss
% Transform"
% The algorithm uses the PHD transform of paragraph 2
% The transform is a k x d projection
% for every set S of n points in R^d
% Let us define an example case
% All columns of G are sets of points
% i.e. we have 1000 points of dimension 128
n = 1000;
d = 256;
% let us examine a reduction from d = 256 to k = 150
%
k = 150;
G = 180 * rand(d,n);
% k = O(epsilon^(-2) * log(n))
epsilon = sqrt(log(n)/k)
%
% epsilon is really O( sqrt(log(n)/k) )
%
%
% Projection in dim k << d % let us take d = 2^w w = log2(d); % % % Defining P (k x d) % first define q % q = min((log(n))^2/d,1); % P = rand(k,d); P = (P < q); P1 = 1/q * randn(k,d); P = P1 .* P; % % Defining H (d x d) H = [1]; for i=1:w H = [ H H;H -H]; end H = 1/d^(0.5)* H; % % Defining D ( d x d) vec1=rand(d,1); v1 = ((vec1 > 0.5) - 1*(vec1<=0.5)); D = diag(v1); % % % FJLT = P * H * D; u = FJLT * G(:,5); v = FJLT * G(:,36); norm(G(:,5)-G(:,36))^2 * k * (1-epsilon) norm( u - v)^2 norm(G(:,5)-G(:,36))^2 * k * (1+epsilon)

## Thursday, December 06, 2007

### Compressed Sensing: Neighborly polytopes, Acoustic Tomography, Sampling in union of linear subspaces, ROMP stability

Suresh Venkatasubramanian posts about polytopes, compressed sensing, and lower bounds for convex hulls in Neighborly polytopes and compressed sensing. He has an interesting wiki from his seminar course.

Recently, I mentioned a paper on Efficient and Stable Acoustic Tomography Using Sparse Reconstruction Methods by Ivan Jovanović, Ali Hormati, Luciano Sbaiz and Martin Vetterli. It is now here. They solve an inverse problem finding a temperature profile by an inference from the speed of sound. In effect they solve an integral equation like Richard Baraniuk and Philippe Steeghs in Compressive radar imaging and Lawrence Carin, Dehong Liu and Ya Xue in In Situ Compressive Sensing

From the Rice repository, two new items:

Thomas Blumensath and Mike Davies, Sampling theorems for signals from the union of linear subspaces.
Compressed sensing is an emerging signal acquisition technique that enables signals to be sampled well below the Nyquist rate, given that the signal has a sparse representation in an orthonormal basis. In fact, sparsity in an orthonormal basis is only one possible signal model that allows for sampling strategies below the Nyquist rate. In this paper we consider a more general signal model and assume signals that live on or close to the union of linear subspaces of low dimension. We present sampling theorems for this model that are in the same spirit as the Nyquist-Shannon sampling theorem in that they connect the number of required samples to certain model parameters. Contrary to the Nyquist-Shannon sampling theorem, which gives a necessary and sufficient condition for the number of required samples as well as a simple linear algorithm for signal reconstruction, the model studied here is more complex. We therefore concentrate on two aspects of the signal model, the existence of one to one maps to lower dimensional observation spaces and the smoothness of the inverse map. We show that almost all linear maps are one to one when the observation space is at least twice the dimension of the largest subspace in the union of subspaces. However, we also show that in order for the inverse map to have certain smoothness properties such as a given finite Lipschitz constant, the required observation dimension necessarily depends on the number of subspaces in the signal model. These results are then applied to two examples, the standard compressed sensing signal model in which the signal has a sparse representation in an orthonormal basis and to a sparse signal model with additional tree structure.

and Nam H. Nguyen and Trac D. Tran, The stability of regularized orthogonal matching pursuit. The abstract reads:
This paper studies a fundamental problem that arises in sparse representation and compressed sensing community: can greedy algorithms give us a stable recovery from incomplete and contaminated observations ? Using the Regularized Orthogonal Matching Pursuit (ROMP) algorithm, a modified version of Orthogonal Matching Pursuit (OMP) [1], which was recently introduced by Deanna Needell and Roman Vershynin [ref: Signal recovery from incomplete and inaccurate measurements via Regularized Orthogonal Matching Pursuit ], we assert that ROMP is stable and guarantees approximate recovery of non-sparse signals, as good as the Basis Pursuit algorithm [16]. We also will set up criterions at which the algorithm halts, and the upper bounds for the reconstructed error. It will be proved in the paper that these upper bounds are proportional to the noise energy.

Credit Photo: NASA/JPL/Space Science Institute, Saturn moon Tethys taken on October 29, 2007.

## Wednesday, December 05, 2007

### Compressed Sensing: Solving the Basis Pursuit problem with a Bregman Iterative Algorithm

While TwIST is solving (IST) compressed sensing reconstruction problems with a two-step Iterative shrinkage/thresholding (IST) algorithm for different regularization terms. Wotao Yin, Stanley Osher, Donald Goldfarb and Jerome Darbon are using Bregman iterative regularization to attack this Basis Pursuit problem in Bregman Iterative Algorithms for Compressed Sensing and Related Problems. The abstract reads:
We propose simple and extremely efficient methods for solving the Basis Pursuit problem

which is used in compressed sensing. Our methods are based on Bregman iterative regularization and they give a very accurate solution after solving only a very small number of instances of the unconstrained problem

for given matrix A and vector fk. We show analytically that this iterative approach yields exact solutions in a finite number of steps, and present numerical results that demonstrate that as few as two to six iterations are sufficient in most cases. Our approach is especially useful for many compressed sensing applications where matrix-vector operations involving A and A' can be computed by fast transforms. Utilizing a fast fixed-point continuation solver that is solely based on such operations for solving the above unconstrained sub-problem, we were able to solve huge instances of compressed sensing problems quickly on a standard PC.

They also seem to aim for very large reconstruction problems as well. The webpage for this effort is here and the code is here. The main difference would certainly be in how much time is spent on both algorithms to solve the same problem but also the fact that the TwIST would have the advantage of also dealing with non-convex regularization lp terms with p =0.7 or 0.8. If anybody has done the comparison between the two methods with p=1, I'd love to hear about it.

Source: Rice Compressed Sensing page.

### Compressed Sensing: A new TwIST

After introducing us to the very fast GPSR (new version 5.0), Mário Figueiredo is at it again. José Bioucas-Dias and Mário Figueiredo introduces us to TwIST (Two-step Iterative Shrinkage/Thresholding Algorithm for Linear Inverse Problems) that seems to be both speedy and capable of handling very large unknown vectors found when dealing with images. The main page has examples that include compressed sensing reconstruction examples.

References that explain the algorithms are A new TwIST: two-step iterative shrinkage/thresholding algorithms for image restoration and Two-step algorithms for linear inverse problems with non-quadratic regularization" both by José Bioucas-Dias, Mário Figueiredo. MATLAB code is available here.

Source: Rice Compressed Sensing page.

## Tuesday, December 04, 2007

### Compressed Sensing: Random Undersampling in Geophysics and ROMP, a stable greedy algorithm

Gilles Hennenfent and Felix Herrmann just came out with a new paper on the use of Compressed Sensing and Geophysics entitled: Simply denoise: wavefield reconstruction via jittered undersampling. The abstract reads:

In this paper, we present a new discrete undersampling scheme designed to favor wavefield reconstruction by sparsity-promoting inversion with transform elements that are localized in the Fourier domain. Our work is motivated by empirical observations in the seismic community, corroborated by recent results from compressive sampling, which indicate favorable (wavefield) reconstructions from random as opposed to regular undersampling. As predicted by theory, random undersampling renders coherent aliases into harmless incoherent random noise, effectively turning the interpolation problem into a much simpler denoising problem.

A practical requirement of wavefield reconstruction with localized sparsifying transforms is the control on the maximum gap size. Unfortunately, random undersampling does not provide such a control and the main purpose of this paper is to introduce a sampling scheme, coined jittered undersampling, that shares the benefits of random sampling, while offering control on the maximum gap size. Our contribution of jittered sub-Nyquist sampling proofs to be key in the formulation of a versatile wavefield sparsity-promoting recovery scheme that follows the principles of compressive sampling.

After studying the behavior of the jittered-undersampling scheme in the Fourier domain, its performance is studied for curvelet recovery by sparsity-promoting inversion (CRSI). Our findings on synthetic and real seismic data indicate an improvement of several decibels over recovery from regularly-undersampled data for the same amount of data collected.

Deanna Needell and Roman Vershynin just made available a paper entitled:Signal recovery from incomplete and inaccurate measurements via Regularized Orthogonal Matching Pursuit. Roman Vershynin writes the following commentary on it:

This is our second paper with my student Deanna Needell, where we continue to study alternatives to linear programming methods in Compressed Sensing. In our first paper, we developed a greedy algorihtm called ROMP, which can perfectly recover any sparse signal from a small number of linear non-adaptive measurements. In this paper, we prove the stability of ROMP. It seems to be the first greedy algorithm in Compressed Sensing that is provably stable in the sense that the recovery error is proportional to the level of the noise. In particular, the recovery is exact in the absence of noise. Also, one can use ROMP to accurately recover signals, not necessarily sparse. This kind of stability was known for the linear programming methods (a result of Candes, Romberg and Tao), but has been previously unknown for any greedy method (like OMP).

ROMP is available here. Let us note that like the reweighted Lp algorithm of Rick Chartrand and Wotao Yin, reconstruction is performed by a series of iterations solving a Least Squares problem.

## Sunday, December 02, 2007

### Monday Morning Algorithm part 5: The 1-D Experimental Probabilistic Hypersurface

The Experimental Probabilistic Hypersurface is a tool initially designed by Bernard Beauzamy. Let me first try to say why it is useful first. Imagine you have a function of several variables, say 100, and it takes 2 hours to evaluate that function. At the end of his/her 8 hours day, the engineer will have processed at best 4 function evaluations. It will take a long time of doing any type of sensitivity studies.

In order to have a faster turn around, one could foresee a scheme where one would do a linear interpolation between points (of dimension 100) that we already know. This would provide an idea of what the function looks like in a larger region. Unfortunately, this interpolation scheme in dimension 100 would by itself take too much time. Irrespective to this time constraint, there is nothing that says this function should be linear or even continuous.

In 2004, Bernard Beauzamy made this original construction whereby the function to be evaluated is a multidimensional probabilistic distribution. For every set of (100) parameters that have been already evaluated, the result of the EPH for this function is a dirac. For all other sets of parameters that have not been evaluated, the result of the EPH for this function is a distribution whose shape will be as peaky (resp. flat) as one is close (resp. far) from the sets of parameters for which the function has already been evaluated. Olga Zeydina, a Ph.D candidate, is writing her thesis on the subject. Most information on this construction can be found in the document section of the Robust Mathematical Modeling program at SCM. Of particular interest are: the initial practical implementation [ Report 1 ] and a newer implementation [Report 2].

Today's algorithm will be featured in two files. One that implements the EPH, the other using the EPH function and produce graphs from this document. Listed in this entry, I will list the EPH function file. The second file can be found in the MMA repository.

function [t,pxx]=EPH(xi,theta,x,t1,tau,xe)
% Algorithm developed by Bernard Beauzamy
% and Olga Zeydina
%
% This implementation was written by
% Igor Carron
%
% Inputs
% xi: a parameter set where the function
% has been evaluated
%
% theta: the result of the
% function evaluation for parameter set xi
%
% x: bounding value for parameter set xi
%
% t1: bounding value for result of
% function evaluation
%
% tau: is the increment in the interval
% defined by the bounding value of t1
%
%
% xe: Query parameter set. We want to
% know the value of the function at
% this point.
%
% Outputs
%
% t: interval where the distribution is
% defined
%
% pxx: probability distribution sought
% sampled over t.
%
N =size(xi,2)
%
x1_min= x(1,1)
x1_max= x(1,2)
%
t_min= t1(1,1);
t_max= t1(1,2);
%
%
nu=size([t_min:tau:t_max],2)-1
t=[t_min:tau:t_max];
%
dmax(1)=max(abs(x(1,1)-xi(1)),abs(x(1,2)-xi(1)));
lambd(1)=log(nu)/dmax(1);
for i=2:N
dmax(i)=max(abs(x(1,1)-xi(i)),abs(x(1,2)-xi(i)));
if dmax(i)>dmax(i-1)
lambd(i)=log(nu)/dmax(i);
else
lambd(i)=lambd(i-1);
end
end
nxe=size(xe,2);
options = optimset('TolFun',1e-3);
for jxx=1:nxe
for i=1:N
d(i)=abs(xe(jxx)-xi(i));
end
for i=1:N
a =pi/tau.^2*exp(1-2*lambd(i)*d(i));
lamn=lambd(i)*d(i);
at2=sum(exp(-a.*t.^2));
at3=sum(t.^2.*exp(-a.*t.^2));
diffe=abs(at2-1/tau*(pi/a).^(1/2))/(1/tau*(pi/a).^(1/2));
if abs(diffe)<1e-10
areal(i) = a;
else
areal(i) = fzero(@(x) fina(x,t,lamn),a,options);
end
b = 1/2-lambd(i)*d(i);
breal(i) = log(1/(sum(exp(-areal(i).*t.^2))));
aa(i)=areal(i);
bb(i)=breal(i);
end
for i=1:N
pro(i,:)=d;
end
prd=cumprod(d,2);
for i=1:N
pro(i,i)=1.0;
end
xpro=cumprod(pro,2);
pioverdi=xpro(:,N);
d11s=sum(pioverdi);
for j=1:nu+1
for i=1:N
pij1(i,j)=exp(-aa(i)*(t(j)-theta(i)).^2 + bb(i));
end
pxx(jxx,j)=1/d11s*(pioverdi'*pij1(:,j));
end
end

To run this function, one needs to run the MMA5.m file. This is a simple 1-D implementation. One can obviously foresee its application to 2-D or n-D functions. But more important, one could use the resident expert knowledge of the engineer and use the algorithm of last week [1] to produce a new metric in the parameter space and then use the EPH. If you are using it for some serious work, you may want to join us in the Robust Mathematical Modeling program. This probabilistic approach to function evaluation can also be used as a storage mechanism. Another issue that most engineers face is that after so many function evaluations, it generally is difficult to summarize all the findings. This approach solves this problem. One can even implement a way for the engineer to add parameters as he/she goes.