Thursday, March 07, 2013

Nuit Blanche Reader's Reviews: SPASH, R-Functions, Structure in Biology, Randomized Kaczmarz and SDD, AMP and GAMP

In the spirit of Reddit, Google+, LinkedIn as Peer Review Systems, here are various recent interactions of note (the previous ones are here):

Following up on SPASH: Sparse Shape Reconstruction, Alireza Aghasi responded with: 

Dear Igor,
Thanks for your note and interest in the paper.
In fact at some point in my PhD when I was working on a parametric level set representation, I was inspired by some works of Vadim Shapiro to use R-functions for such purpose. My main concern at that point was ending up with a cost function that has many local minima points as opposed to a simple basis expansion expression that does not by itself add additional complexity to the cost function. In composition of two R-functions, let say f_1 and f_2 where each represent a shape by their sign change, we have for instance f_1 & f2 = f_1+f_2-sqrt(f_1^2+f_2^2) and f_1 | f2 = f_1+f_2+sqrt(f_1^2+f_2^2). This type of composition when we have many shapes (e.g. in order of 30000) would make the problem highly capable of getting trapped into a local minima.
The idea behind using Knolls was to keep the problem as simple as possible while still being able to explore the shape composition idea. Also in composing Knolls the "sharpness" of the boundaries is preserved while composition of R-functions keeps smoothening the boundaries at the c-level set.
I hope this explains the intuition behind the method. I would always be more than happy to read your further comments and suggestions.
Thanks Ali ! I'll have to think about what those R-functions could actually bring to the table.

As pointed out yesterdayMohammed AlQuraishi and I had a small discussion on analysis operator and their potential use in structural biology in the comment section on his blog entry entitled: CASP10, and the Future of Structure in Biology

On Andrew Davison's feed, Andrew talks about a new paper solving graph laplacians, I noted that a new algorithm for computing graph laplacians was shown to be a randomized instance of the Kaczmarz algorithm (see section 9, A Simple, Combinatorial Algorithm for Solving SDD Systems in Nearly-Linear Time by Jonathan A. Kelner, Lorenzo Orecchia, Aaron Sidford, Zeyuan Allen Zhu) and wondered how the recent advances made there (Joel Tropp and Deanna Needell, see Paved with Good Intentions: Analysis of a Randomized Block Kaczmarz Method) could make that algorithm even faster ?

I want to clarify that there are some problems with the paragraph from [4] that you quoted below.  In particular, the line 
The AMP was generalized for general signal models in [5], [6] and called G-AMP. The algorithm used in [3], [7] is equivalent to G-AMP.
is incorrect.  The "G-AMP" algorithm, as introduced by Rangan in arXiv:1010.5141, is a considerable generalization of the "Bayesian AMP" algorithm proposed earlier by Donoho Maleki and Montanari at ITW 2010 (or [5] above).  G-AMP is distinct from Bayesian AMP because G-AMP handles nonlinear observations such as those in logistic regression, phase-retrieval, and reconstruction from quantized samples.  Also, G-AMP assumes less strict conditions on the homogeneity of the measurement matrix than Donoho Maleki and Montanari's algorithms.  Donoho Maleki and Montanari deserve credit for the Bayesian AMP algorithm, while Rangan deserves credit for the G-AMP algorithm, and the two should not be confused.
Lenka Zdeborova: and Phil Schniter had a discussion  on email afterwards that they kindly allowed me to make public, here is what Lenka eventually replied: 

Thanks for your comment, ...
There is one point I did not understand or do not agree. When you write: "Also, G-AMP assumes less strict conditions on the homogeneity of the measurement matrix than Donoho Maleki and Montanari's algorithms." What do you mean? I thought G-AMP is making exactly the same assumptions on the matrix as AMP was.


Phil  explained further:
Hi Lenka,
In the derivation of the original AMP, the matrix "A" was assumed to have all columns with equal 2-norm (equal to 1, in particular) and all rows with equal 2-norm (equal to sqrt(n/m), in particular). Furthermore, the variances of the coefficients in the vector estimate of "x" where assumed to be identical over the vector, and same for the variances in the estimate of "z=Ax".
In GAMP, while it is assume that the matrix is dense, the rows of the matrix A can have different 2-norms, as can the columns. Furthermore, in the vector estimate of "x", the variances can differ over the vector, and similar for "z=Ax". The cost of this generality are two additional matrix multiplies per iteration in GAMP, relative to AMP, where the matrix involved is the element-wise square of "A".
That said, GAMP has a "uniform variance" option where the variances are constrained to be uniform over the vectors, in which case there are normalizations in the algorithm that involve the squared Frobenius norm of A. This is similar, but still a bit more general than the original AMP, which only handles a particular scaling of A.
Empirically, what we find is that the uniform variance mode is adequate for "nice" matrices, as occur when one generates them large and i.i.d sub-Gaussian. But with not-so-nice matrices (i.e., smaller and/or not i.i.d and/or not sub-Gaussian), the uniform variance assumption can sometimes hurt.
Hope that clarifies the issue. The details are in

Lenka then responded with:
Dear Phil,
thanks for your explanations, now it is clear to me what you meant. Indeed, I agree with you that the form where variances are not assumed to be uniform, column and row norms of the matrix can be generic, etc. has its definite advantages (in many cases better convergence properties etc).
On the other hand when one derives AMP from BP one has to go trough that more general form of equations at some point, and so I am pretty sure Donoho, Maleki, Montanari were well aware of that form. And it was just their choice to present the restricted case - my guess would be they did this because it was the first step towards the state evolution equations which they then analyzed rigorously. For this reason I personally do not think about this as an advantage of G-AMP with respect to AMP.
Thanks again for the clarification.

and Phil finished that exchange with:
I totally agree GAMP's (slight) relaxation of AMP's matrix constraints is not intellectually significant. But, from a user's standpoint, I think it's worth knowing about. For example, I've heard several people complain that AMP requires the matrix to be normalized (in Frobenius norm) and the answer is: no, not really. But, still, it's not clear exactly how to do allow general matrices without going through the entire derivation, which is too much for many people to do on their own. In this sense, I think it is useful for them to know that the necessary generalizations have been derived, explained, and coded in GAMP.
In any case, I think we all agree that the significant generalization proved by GAMP is to take AMP from the linear-AWGN observation model to the generalized-linear observation model.
Thanks Phil and Lenka for making this exchange public !

Image Credit: NASA/JPL/Space Science Institute, W00079810.jpg was taken on March 06, 2013 and received on Earth March 06, 2013. The camera was pointing toward SATURN-RINGS at approximately 876,715 miles (1,410,936 kilometers) away, and the image was taken using the CL1 and CL2 filters.

Join the CompressiveSensing subreddit or the Google+ Community and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

No comments: