Nuit Blanche

Thursday, November 08, 2012

Eigenvalues of a matrix in the streaming model

Following up on this morning entry on a better embedding for least squares and SVD, here is a result for eigenvalues: Evidently all these entries are listed under the RandNLA tag: I can see a whole bunch of application for something like this. Here it is: Eigenvalues of a matrix in the streaming model by Alexandr Andoni and Huy L. Nguyen. The abstract reads:

We study the question of estimating the eigenvalues of a matrix in the streaming model, addressing a question posed in [Mut05]. We show that the eigenvalue \heavy hitters" of a matrix can be computed in a single pass. In particular, we show that the -heavy hitters (in the `1 or `2 norms) can be estimated in space proportional to 1= 2. Such a dependence on is optimal. We also show how the same techniques may give an estimate of the residual error tail of a rank-k approximation of the matrix (in the Frobenius norm), in space proportional to k 2 All our algorithms are linear and hence can support arbitrary updates to the matrix in the stream. In fact, what we show can be seen as a form of a bi-linear dimensionality reduction: if we multiply an input matrix with projection matrices on both sides, the resulting matrix preserves the top eigenvalues and the residual Frobenius norm.

Image Credit: NASA/JPL/Space Science Institute,

W00076689.jpg was taken on November 06, 2012 and received on Earth November 06, 2012. The camera was pointing toward SATURN-RINGS at approximately 1,292,973 miles (2,080,838 kilometers) away, and the image was taken using the CL1 and CL2 filters.

Join our Reddit Experiment, Join the CompressiveSensing subreddit and post there !

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings

If you recall David Woodruff's video presentation at MMDS (or his slides at the recent Workshop on "Randomized Numerical Linear Algebra (RandNLA): Theory and Practice") one could use a relatively sparse projection to perform least squares and low rank computations. Well, it looks like Jelani Nelson and Huy L. Nguyen have a slightly better embeddings

OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings by Jelani Nelson and Huy L. Nguyen. The abstract reads:

An "oblivious subspace embedding (OSE)" given some parameters eps,d is a distribution D over matrices B in R^{m x n} such that for any linear subspace W in R^n with dim(W) = d it holds that Pr_{B ~ D}(forall x in W ||B x||_2 in (1 +/- eps)||x||_2) > 2/3 We show an OSE exists with m = O(d^2/eps^2) and where every B in the support of D has exactly s=1 non-zero entries per column. This improves previously best known bound in [Clarkson-Woodruff, arXiv:1207.6365]. Our quadratic dependence on d is optimal for any OSE with s=1 [Nelson-Nguyen, 2012]. We also give two OSE's, which we call Oblivious Sparse Norm-Approximating Projections (OSNAPs), that both allow the parameter settings m = \~O(d/eps^2) and s = polylog(d)/eps, or m = O(d^{1+gamma}/eps^2) and s=O(1/eps) for any constant gamma>0. This m is nearly optimal since m >= d is required simply to no non-zero vector of W lands in the kernel of B. These are the first constructions with m=o(d^2) to have s=o(d). In fact, our OSNAPs are nothing more than the sparse Johnson-Lindenstrauss matrices of [Kane-Nelson, SODA 2012]. Our analyses all yield OSE's that are sampled using either O(1)-wise or O(log d)-wise independent hash functions, which provides some efficiency advantages over previous work for turnstile streaming applications. Our main result is essentially a Bai-Yin type theorem in random matrix theory and is likely to be of independent interest: i.e. we show that for any U in R^{n x d} with orthonormal columns and random sparse B, all singular values of BU lie in [1-eps, 1+eps] with good probability.

Plugging OSNAPs into known algorithms for numerical linear algebra problems such as approximate least squares regression, low rank approximation, and approximating leverage scores implies faster algorithms for all these problems.

The description of two OSNAP mebddings are featured in page 5.

Join our Reddit Experiment, Join the CompressiveSensing subreddit and post there !

Wednesday, November 07, 2012

From Bits to Images: Inversion of Local Binary Descriptors - implementation -

This is a follow-up to this previous entry: Do Androids Recall Dreams of Electric Sheeps ? Just off the press, wow (for some context and a word from one of the co-author, Emmanuel d'Angelo, see below).

From Bits to Images: Inversion of Local Binary Descriptors by Emmanuel d'Angelo, Laurent Jacques, Alexandre Alahi, Pierre Vandergheynst. The abstract reads:

Local Binary Descriptors are becoming more and more popular for image matching tasks, especially when going mobile. While they are extensively studied in this context, their ability to carry enough information in order to infer the original image is seldom addressed.

In this work, we leverage an inverse problem approach to show that it is possible to directly reconstruct the image content from Local Binary Descriptors. This process relies on very broad assumptions besides the knowledge of the pattern of the descriptor at hand. This generalizes previous results that required either a prior learning database or non-binarized features.

Furthermore, our reconstruction scheme reveals differences in the way different Local Binary Descriptors capture and encode image information. Hence, the potential applications of our work are multiple, ranging from privacy issues caused by eavesdropping image keypoints streamed by mobile devices to the design of better descriptors through the visualization and the analysis of their geometric content.

The attendant implementation is here.

- Update - It looks like the blog entry was faster than the email sent by Emmanuel to me, here it is:

Dear Igor,

I'm pretty sure you remember our preliminary work on image reconstruction from local quantized descriptors featured on your blog: http://nuit-blanche.blogspot.ch/2012/06/do-androids-recall-dreams-of-electric.html

I told you on Twitter at the time that I would release the code only once the job is done, i.e. when we have an algorithm that works for 1-bit quantized descriptors (our previous algorithm featured in the blog post was for floating-point feature vectors only). Well, it is time :-)

We have successfully adapted the Binary Iterative Hard Thresholding of Jacques et al. to invert binarized image local descriptors and reconstruct images. The implications of this seem huge: from smart cameras streaming descriptors instead of images to privacy breaches in cloud-based pattern recognition.

The pre-print is here: http://arxiv.org/abs/1211.1265

And as promised the code is on github, so everybody is welcome to clone / fork / play with it: https://github.com/sansuiso/LBDReconstruction

Right now, I'm about to leave for the ICPR'12 conference in Tsukuba (Japan), where I will present the preliminary version of this work (the one with the real on-quantized descriptors). So, if anybody interested is also attending the conference, I'm sure we'll have time to discuss about it !

Have a nice day !

Sincerely,

Emmanuel

Thanks Emmanuel !

Join our Reddit Experiment, Join the CompressiveSensing subreddit and post there !

Tuesday, November 06, 2012

Sketched SVD: Recovering Spectral Features from Compressive Measurements - implementation - and Randomized Matrix Computations

Fitting into the generic RandNLA movement ( I just created a tag for it), here are two papers. The first one is so simple I even hesitate to even consider it as a defacto implementation but it definitely as a similar flavor as to the previously mentioned Invariance of Principal Components under Low-Dimensional Random Projection of the Data.

It is Sketched SVD: Recovering Spectral Features from Compressive Measurements by Anna C. Gilbert, Jae Young Park, Michael B. Wakin. The abstract reads:

We consider a streaming data model in which n sensors observe individual streams of data, presented in a turnstile model. Our goal is to analyze the singular value decomposition (SVD) of the matrix of data defined implicitly by the stream of updates. Each column i of the data matrix is given by the stream of updates seen at sensor i. Our approach is to sketch each column of the matrix, forming a "sketch matrix" Y, and then to compute the SVD of the sketch matrix. We show that the singular values and right singular vectors of Y are close to those of X, with small relative error. We also believe that this bound is of independent interest in non-streaming and non-distributed data collection settings.

Assuming that the data matrix X is of size Nxn, then with m linear measurements of each column of X, we obtain a smaller matrix Y with dimensions mxn. If m = O(k \epsilon^{-2} (log(1/\epsilon) + log(1/\delta)), where k denotes the rank of X, then with probability at least 1-\delta, the singular values \sigma'_j of Y satisfy the following relative error result

(1-\epsilon)^(1/2)<= \sigma'_j/\sigma_j <= (1 + \epsilon)^(1/2) as compared to the singular values \sigma_j of the original matrix X. Furthermore, the right singular vectors v'_j of Y satisfy

||v_j-v_j'||_2 <= min(sqrt{2}, (\epsilon\sqrt{1+\epsilon})/(\sqrt{1-\epsilon}) max_{i\neq j} (\sqrt{2}\sigma_i\sigma_j)/(min_{c\in[-1,1]}(|\sigma^2_i-\sigma^2_j(1+c\epsilon)|))) as compared to the right singular vectors v_j of X. We apply this result to obtain a streaming graph algorithm to approximate the eigenvalues and eigenvectors of the graph Laplacian in the case where the graph has low rank (many connected components).

Randomized Matrix Computations by Victor Y. Pan, Guoliang Qian, Ai-Long Zheng. The abstract reads:

Random matrices tend to be well conditioned, and we employ this well known property to advance matrix computations. We prove that our algorithms employing Gaussian random matrices are efficient, but in our tests the algorithms have consistently remained as powerful where we used sparse and structured random matrices, defined by much fewer random parameters. We numerically stabilize Gaussian elimination with no pivoting as well as block Gaussian elimination, precondition an ill conditioned linear system of equations, compute numerical rank of a matrix without orthogonalization and pivoting, approximate the singular spaces of an ill conditioned matrix associated with its largest and smallest singular values, and approximate this matrix with low-rank matrices, with applications to its 2-by-2 block triangulation and to tensor decomposition. Some of our results and techniques can be of independent interest, e.g., our estimates for the condition numbers of random Toeplitz and circulant matrices and our variations of the Sherman--Morrison--Woodbury formula.

Join our Reddit Experiment, Join the CompressiveSensing subreddit and post there !

Monday, November 05, 2012

Bayesian Blind Deconvolution With General Sparse Image Priors - implementation -

Here is nice algorithm implementation sent to me by Derin to start the week:

Bayesian Blind Deconvolution With General Sparse Image Priors by S. Derin Babacan, Rafael Molina, Minh N. Do, Aggelos K. Katsaggelos. The abstract reads:

We present a general method for blind image deconvolution using Bayesian inference with super-Gaussian sparse image priors. We consider a large family of priors suitable for modeling natural images, and develop the general procedure for estimating the unknown image and the blur. Our formulation includes a number of existing modeling and inference methods as special cases while providing additional flexibility in image modeling and algorithm design. We also present an analysis of the proposed inference compared to other methods and discuss its advantages. Theoretical and experimental results demonstrate that the proposed formulation is very effective, efficient, and flexible.

The supplementary technical note and implementation of the algorithm are available here.

Thanks Derin !

Join our Reddit Experiment, Join the CompressiveSensing subreddit and post there !

Sunday, November 04, 2012

Sunday Morning Insights: Thinking in Exponential Times

We are not accustomed to Moore's law it seems. Here is the latest example: a safe DKIM now requires 2048 bit keys. You have to look at this incident in a different way really. The current infrastructure of the internet is vulnerable because people whose job and livelihood depends on mastering it, cannot think that Moore's law is having an effect on their day-to-day operations. But there is more impressive in terms of thinking in exponential times. We know that sequencing costs are decreasing at a faster pace than Moore's law.

This week came reports of a second person who, while not being an active genetics researcher, is also looking into exome sequencing to get to the bottom of a family members' disease. The first instance we heard of a similar undertaking was just a few month ago when Matt wrote about searching for his son's killer also using exome sequencing. If you recall the cost of exome sequencing then was about $4,000 to $5,000. Last year, 23andme had a program where you could get a peek at your exome for $999. And they are not using Nanopore technology ... yet.

Why am I writing all this ? There are indeed ample signs that these technologies are indeed maturing too fast with regards to our thought processes but the most important element to remind ourselves is that they change the playing field. What you once found through chance may be becoming a low hanging fruit with the use of simple explanatory modeling.

As Matt said, " And, there are certainly unknown unknowns beyond that."

Join our Reddit Experiment, Join the CompressiveSensing subreddit and post there !

Saturday, November 03, 2012

DIY Tin Can Radar Systems Forum

Anytime, there is a sensor you can hack and maybe even put in a CS mode, I am a taker. I received the following from Greg Charvat, the person behind the Tin Can Radar system

Hi Igor,

Thought your readers might find this interesting.

After receiving many many e-mail questions about small radar systems i've setup a forum:

http://glcharvat.com/tincan/

To answer questions in the open so that everyone benefits. The forum is for those who are building small radar sensors and need help or who want to share their work (photos, blog posts, or papers) with the community.

Hope everything is well with you!

Greg

Thanks Greg !

Join our Reddit Experiment, Join the CompressiveSensing subreddit and post there !

Friday, November 02, 2012

Reddit Experiment: A month later.

Our Reddit Experiment started a month ago. We have 43 readers (or subscribers of the subreddit). Not much engagement as far as posting or commenting but some of the readers are beginning to provide some Karma points to some of the entries. This is a good start as it provides a non sequential view of what we have seen on the blog so far. But entries are not restricted to the blog. The subject area considered ok include compressive sensing in a broad sense as well as anything pertaining to advanced matrix factorization. We welcome your contributions. On a tablet, we like to watch it with scrolldit.

Join our Reddit Experiment, Join the CompressiveSensing subreddit and post there !

Compressive Phase Retrieval via Generalized Approximate Message Passing - implementation -

Following yesterday's entry about CPRL, Phil Schniter sent me the following:

Hi Igor,
Since you seem to be blogging about phase-retrieval lately, your readers may be interested in our GAMP approach from Allerton 2012: http://www2.ece.ohio-state.edu/~schniter/pdf/all12_phase.pdf
Note that tables III-IV suggest that PR-GAMP has a much better phase transition than CPRL. we used the CVX implementation of CPRL that was available at the time we wrote the paper. Because it was slow, our experiments with CPRL were very limited. I plan to try the new ADMM implementation soon.
As for the theory of sparse phase-retrieval, your readers may be interest in the recent work http://arxiv.org/abs/1209.4785.
cheers, Phil

Thanks Phil

The paper is: Compressive Phase Retrieval via Generalized Approximate Message Passing by Philip Schniter and Sundeep Rangan. The abstarct reads:

In this paper, we propose a novel approach to compressive phase retrieval based on loopy belief propagation and, in particular, on the generalized approximate message passing (GAMP) algorithm. Numerical results show that the proposed PR-GAMP algorithm has excellent phase-transition behavior, noise robustness, and runtime. In particular, for successful recovery of synthetic Bernoulli-circular-Gaussian signals, PRGAMP requires ≈ 4 times the number of measurements as a phase-oracle version of GAMP and, at moderate to large SNR, the NMSE of PR-GAMP is only ≈ 3 dB worse than that of phase-oracle GAMP. A comparison to the recently proposed convex-relation approach known as “CPRL” reveals PR-GAMP’s superior phase transition and orders-of-magnitude faster runtimes, especially as the problem dimensions increase. When applied to the recovery of a 65k-pixel grayscale image from 32k randomly masked magnitude measurements, numerical results show a median PR-GAMP runtime of only 13.4 seconds.

Phil also tells me that PR-GAMP is part of the GAMP Matlab package.

The other theory paper I missed is: Sparse Signal Recovery from Quadratic Measurements via Convex Programming by Xiaodong Li, Vladislav Voroninski. The abstract reads:

In this paper we consider a system of quadratic equations |z_j, x|^2 = b_j, j = 1, ..., m, where x in R^n is unknown while normal random vectors z_j in R_n and quadratic measurements b_j in R are known. The system is assumed to be underdetermined, i.e., m \lt n. We prove that if there exists a sparse solution x, i.e., at most k components of x are non-zero, then by solving a convex optimization program, we can solve for x up to a multiplicative constant with high probability, provided that k \le O((m/log n)^(1/2)). On the other hand, we prove that k <= O(log n (m)^(1/2)) is necessary for a class of naive convex relaxations to be exact.

Join our Reddit Experiment, Join the CompressiveSensing subreddit and post there !

Thursday, November 01, 2012

CPRL – An Extension of Compressive Sensing to the Phase Retrieval Problem - implementation -

We covered it last year ( Compressive Phase Retrieval Implementation ) but it seems the new algorithm is now using ADMM. As mentioned in the last review, it looks like ADMM (see here for Matlab scripts for ADMM from S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein) is becoming indeed a mainstay.

CPRL – An Extension of Compressive Sensing to the Phase Retrieval Problem by Henrik Ohlsson, Allen Y. Yang, Roy Dong, S. Shankar Sastry.. The abstract reads:

While compressive sensing (CS) has been one of the most vibrant research ﬁelds in the past few years, most development only applies to linear models. This limits its application in many areas where CS could make a difference. This paper presents a novel extension of CS to the phase retrieval problem, where intensity measurements of a linear system are used to recover a complex sparse signal. We propose a novel solution using a lifting technique – CPRL, which relaxes the NP-hard problem to a nonsmooth semideﬁnite program. Our analysis shows that CPRL inherits many desirable properties from CS, such as guarantees for exact recovery. We further provide scalable numerical solvers to accelerate its implementation.