Wednesday, July 02, 2014

Entropic Uncertainty Relations in Quantum Physics / On asymptotic structure in compressed sensing

I am having a small discussion with Greg Howland one of the authors of yesterday's paper ( Simultaneous Measurement of Complementary Observables with Compressive Sensing )

....[Everything that is about to be said here is based on my own current limited understanding]....

One item he points me to is the following paper on the use of Shannon entropy in Quantum Mechanics. To make a long story short, some of the readers might recognize issues in this example ( see figure below ) and its attendant conclusion ( see below):

...Therefore, σx tends to infinity with N → ∞, while common sense predicts that the uncertainty should remain finite. So, why the uncertainty measured by the standard deviation grows with N? It is simply the effect of a growing distance between the two regions. Standard deviation is based on the second moment which is very sensitive to the values of an observable lying far away from the average. This is the first flaw of the standard deviation...

For short, the Heisenberg principle uses standard deviations and it is not helpful because it is based on the L2 norm and is therefore sensitive to outliers. As an outsider, I just hope that all the Quantum Mechanics "paradoxes" are of the same nature. Hence according to the paper, the Heisenberg principle can be recasted using a Shannon entropy argument where it makes more common sense. But while I am reading the rest of the paper, I get the sense that if you use those Shannon/Renyi entropies in Quantum Mechanics, we directly go back to traditional issues that have been studied in compressive sensing:

The physical interpretation of the inequality (79) is similar to the Heisenberg uncertainty relation. Two observables A and B, characterized by the bases |aii and |bji cannot have simultaneously sharp values since they do not have common basis vectors. Moreover, every state described by a basis vector has the sum of the uncertainties equal to the same value 1/√N. The observables A and B are finite dimensional analogs of canonically conjugate physical quantities.
The observables or the measurements cannot have similarly sharp values (the main argument of the Heisenberg principle) is just a statement on basis incoherence seen in compressive sensing. If you recall this Ghost imaging experiment back in 2009, you'll notice that one could already reconstruct the image using equation 2 of this paper without any assumption on the sparsity of the scene. In other words, while the measurements are sharp for one variable (raster scan) and "diffuse" for the other (random group or CS scan), one can already reconstruct a sharp/sparse variable from the "diffuse" measurement thanks to the randomness of the filter alone.

What does Compressive Sensing bring to the table ? 

Well, the additional assumption of sparsity on what is being measured, allows fewer "diffuse" measurements to be deconvoluted back to the sparse (sharp) value of the variable of interest. This was well stated by Emmanuel Candes and Terry Tao back in 2008 in a presentation entitled The uniform uncertainty principle and compressed sensing:

(Note that one normally needs all p Fourier coefficients in order to recover a general signal; the point is that sparse signals have a much lower information entropy and thus are easier to recover than general signals.)
(For more details, please check Terry Tao's 2007 blog entry and attendant comments)

Let us note that within this description, incoherence in Quantum Mechanics and incoherence in Compressive Sensing have identical meanings. Let us also note that the assumption of sparsity is strong in that one does not require random filters to reconstruct the variable of interest (there are deterministic measurement ensembles or some other structurally random measurement ensembles that can be used in compressive sensing). Random filters are, however, required to reconstruct a variable of interest without the sparsity assumption.

Of note, the field has evolved ever since the earliest papers on compressive sensing and there has been an on-going discussion and progress on this issue of incoherence. As a matter of fact, we seem to converge toward a better understanding of it [1] while other approaches that do not rely on incoherence properties are also making strides by showing how sharp phase transitions produce a limit on actual hardware/measurement systems. Let us hope that this on-going debate can eventually include the Quantum Mechanics and Quantum Optics folks. For instance, in QM, the uncertainty principle is also equivalent to the fact that two operators do not commute. Can we design specific measurement matrices based on the diverse findings in compressive sensing and group testing to get a good evaluation of both variables ? How about using sharp phase transitions in order to evaluate the limits of states that can be probed with compressive sensing solvers. The Renyi entropy proposed in the paper has been looked into in compressive sensing [2] and attendant solvers have been found that are optimal for that [3] (AMP solvers). If this interpretation holds, then suddenly QM could become much clearer to a larger swath of the research community. Without further ado, here is the paper:

Entropic Uncertainty Relations in Quantum Physics by Iwo Bialynicki-Birula, Lukasz Rudnicki

Uncertainty relations have become the trademark of quantum theory since they were formulated by Bohr and Heisenberg. This review covers various generalizations and extensions of the uncertainty relations in quantum theory that involve the R\'enyi and the Shannon entropies. The advantages of these entropic uncertainty relations are pointed out and their more direct connection to the observed phenomena is emphasized. Several remaining open problems are mentioned.

[1] On asymptotic structure in compressed sensing by Bogdan Roman, Anders Hansen, Ben Adcock

This paper demonstrates how new principles of compressed sensing, namely asymptotic incoherence, asymptotic sparsity and multilevel sampling, can be utilised to better understand underlying phenomena in practical compressed sensing and improve results in real-world applications. The contribution of the paper is fourfold:
First, it explains how the sampling strategy depends not only on the signal sparsity but also on its structure, and shows how to design effective sampling strategies utilising this.
Second, it demonstrates that the optimal sampling strategy and the efficiency of compressed sensing also depends on the resolution of the problem, and shows how this phenomenon markedly affects compressed sensing results and how to exploit it.
Third, as the new framework also fits analog (infinite dimensional) models that govern many inverse problems in practice, the paper describes how it can be used to yield substantial improvements.
Fourth, by using multilevel sampling, which exploits the structure of the signal, the paper explains how one can outperform random Gaussian/Bernoulli sampling even when the classical l1 recovery algorithm is replaced by modified algorithms which aim to exploit structure such as model based or Bayesian compressed sensing or approximate message passaging. This final observation raises the question whether universality is desirable even when such matrices are applicable.
Examples of practical applications investigated in this paper include Magnetic Resonance Imaging (MRI), Electron Microscopy (EM), Compressive Imaging (CI) and Fluorescence Microscopy (FM). For the latter, a new compressed sensing approach is also presented.

Join the CompressiveSensing subreddit or the Google+ Community and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.


BR said...

Hi Igor,
To add some complementary info to your nice post:
you may be interested in our study of the uncertainty principle in signal processing (and the references we cite therein). One is a review of some versions of the uncertainty principles:
"A survey of uncertainty principles and some signal processing applications"
We explain the relation between l^p-norms and Shannon entropy via the Renyi entropies (already noticed in the paper you cite) and give some other insights. You are right, these quantities give a family of uncertainty principles which are different from the Heisenberg one in that they do not take into account any localization of a function around a particular value (its mean position). Indeed, if you change the labelling of points in the function domain, the norm do not change (l^p norm or entropy will stay the same), but the variance of the function will change.
We also wrote a paper on uncertainty in 2 different frames, a generalization of the ones using 2 different bases:
"Refined support and entropic uncertainty inequalities"
The bound in the uncertainty inequality depends on the coherence between the two frames. the same coherence which is also involved in compressive sensing. We also propose a slightly different definition for the coherence.
I hope this will help understanding better the uncertainty principle.
We also prepare a study of some uncertainty principles applying to signal defined on graphs for the following weeks! I send you a mail when it is ready.
Thanks for sharing your views on the uncertainty principle.

thomas said...

Rényi's information dimension is also used in this very recent paper on a universal approach to compressed sensing: which also cites [2].

BR said...

A few additional facts I want to mention on the uncertainty principle, that I was happy to learn.

If we look at the uncertainty principle between the signal and its Fourier transform. The minimizer of the Heisenberg uncertainty principle (the function which makes the uncertainty equal to the bound) is a Gaussian function, and it is the same for the entropic uncertainty principle in infinite dimension (continuous case). In the finite dimensional case, the minimizer of the l^p-norm family of uncertainty principle (in particular l^0-norm) is a « picket fence signal » , i.e. regularly spaced Kronecker deltas. A striking result is that the Fourier transform of a Gaussian is a Gaussian and the (discrete) Fourier transform of a picket fence is a picket fence. Also an important thing to remark is the difference between continuous and discrete domains. One minimizer is well localized around a mean value whereas the other is completely delocalized, with a few peaks of energy but evenly distributed in the signal domain. If you make a function which approximate well a gaussian in the discrete case, you will have a small uncertainty, but the picket fence beats it. In the continuous case, the picket fence can not exists as it is not square integrable.

Continuing with the continuous vs discrete case, it is difficult to find equivalent of the Heisenberg and its variance measure in the discrete case. You have to adapt the definition to each domain (if it is possible). Even in the continuous case, take functions living on a circle: if you choose a point of the circle to be your zero coordinate and you define your domain to be [-pi,pi] with periodic boundary conditions. If you translate a nice well localized function to a point around pi. A part of the function will appear around -pi because we are on a circle. If you compute the standard variance as defined for (-infinity,infinity), it will give a huge value, but on the circle the function is still the same, with the same concentration.
I did not find any general formula for the variance in the discrete case.
Hence, we have to be careful when comparing uncertainty in the continuous and the discrete case.

The uncertainty principle exists between two representations which are the expressions of the function in two different bases of a Hilbert space. But it exists also for two frames (at least in the finite dimensional case), meaning that you have an uncertainty principle each time you use two linear transformations which are invertible (even if you have to use a pseudo-inverse). That means, the uncertainty principle is related also to information. If you loose some information by looking at your function in some representation you don’t have any uncertainty principle (the bound is zero).