Tuesday, November 06, 2012

Sketched SVD: Recovering Spectral Features from Compressive Measurements - implementation - and Randomized Matrix Computations

Fitting into the generic RandNLA movement (I just created a tag for it), here are two papers. The first one is so simple that I hesitate to even consider it a de facto implementation, but it definitely has a flavor similar to the previously mentioned Invariance of Principal Components under Low-Dimensional Random Projection of the Data.

We consider a streaming data model in which n sensors observe individual streams of data, presented in a turnstile model. Our goal is to analyze the singular value decomposition (SVD) of the matrix of data defined implicitly by the stream of updates. Each column i of the data matrix is given by the stream of updates seen at sensor i. Our approach is to sketch each column of the matrix, forming a "sketch matrix" Y, and then to compute the SVD of the sketch matrix. We show that the singular values and right singular vectors of Y are close to those of X, with small relative error. We also believe that this bound is of independent interest in non-streaming and non-distributed data collection settings. 
Assuming that the data matrix X is of size N x n, then with m linear measurements of each column of X, we obtain a smaller matrix Y with dimensions m x n. If m = O(k \epsilon^{-2} (\log(1/\epsilon) + \log(1/\delta))), where k denotes the rank of X, then with probability at least 1-\delta, the singular values \sigma'_j of Y satisfy the following relative error result
(1-\epsilon)^{1/2} <= \sigma'_j/\sigma_j <= (1+\epsilon)^{1/2}
as compared to the singular values \sigma_j of the original matrix X. Furthermore, the right singular vectors v'_j of Y satisfy
||v_j - v'_j||_2 <= \min( \sqrt{2}, (\epsilon\sqrt{1+\epsilon}/\sqrt{1-\epsilon}) \max_{i \neq j} (\sqrt{2}\sigma_i\sigma_j) / (\min_{c \in [-1,1]} |\sigma_i^2 - \sigma_j^2(1+c\epsilon)|) )
as compared to the right singular vectors v_j of X. We apply this result to obtain a streaming graph algorithm to approximate the eigenvalues and eigenvectors of the graph Laplacian in the case where the graph has low rank (many connected components).
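
To get a feel for the sketching step described in the abstract, here is a minimal NumPy sketch. It is not the authors' code: the Gaussian measurement matrix Phi (scaled by 1/sqrt(m)), the helper name sketch_columns, and the sizes N, n, k, m are all assumptions picked for illustration. It builds a rank-k matrix X, forms Y = Phi X, and compares the top singular values of Y with those of X.

```python
import numpy as np

def sketch_columns(X, m, rng):
    """Sketch every column of X with the same m x N Gaussian matrix (an assumed choice)."""
    N = X.shape[0]
    Phi = rng.standard_normal((m, N)) / np.sqrt(m)  # scaling keeps norms roughly unchanged
    return Phi @ X

rng = np.random.default_rng(0)
N, n, k, m = 2000, 50, 5, 500  # illustrative sizes, not taken from the paper

# Rank-k data matrix X = U diag(sigma) V^T with known singular values.
U, _ = np.linalg.qr(rng.standard_normal((N, k)))
V, _ = np.linalg.qr(rng.standard_normal((n, k)))
sigma = np.array([10.0, 8.0, 6.0, 4.0, 2.0])
X = (U * sigma) @ V.T

# Sketch matrix Y (m x n) and its SVD.
Y = sketch_columns(X, m, rng)
_, s_sketch, Vt_sketch = np.linalg.svd(Y, full_matrices=False)

# Ratios sigma'_j / sigma_j, which the abstract bounds in terms of epsilon.
ratios = s_sketch[:k] / sigma

# |<v_j, v'_j>| for each j; the absolute value removes the sign ambiguity of the SVD.
alignment = np.abs(np.sum(V * Vt_sketch[:k].T, axis=0))

print("singular value ratios:", ratios)
print("right singular vector alignment:", alignment)
```

Only the m x n matrix Y goes through the SVD here, which is the point of the approach: the N x n matrix X is built in this toy only so there is something to compare against.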

Randomized Matrix Computations by Victor Y. Pan, Guoliang Qian, Ai-Long Zheng. The abstract reads:
Random matrices tend to be well conditioned, and we employ this well known property to advance matrix computations. We prove that our algorithms employing Gaussian random matrices are efficient, but in our tests the algorithms have consistently remained as powerful where we used sparse and structured random matrices, defined by much fewer random parameters. We numerically stabilize Gaussian elimination with no pivoting as well as block Gaussian elimination, precondition an ill conditioned linear system of equations, compute numerical rank of a matrix without orthogonalization and pivoting, approximate the singular spaces of an ill conditioned matrix associated with its largest and smallest singular values, and approximate this matrix with low-rank matrices, with applications to its 2-by-2 block triangulation and to tensor decomposition. Some of our results and techniques can be of independent interest, e.g., our estimates for the condition numbers of random Toeplitz and circulant matrices and our variations of the Sherman--Morrison--Woodbury formula.
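
The abstract does not spell out the algorithms, so the following is only a toy illustration of the general idea and not the authors' method. It assumes the simplest possible preprocessing: multiply the system A x = b on the left by a Gaussian random matrix G and then run Gaussian elimination with no pivoting on (G A) x = G b. The helper solve_no_pivoting and the cyclic-shift test matrix are hypothetical choices made for this example.

```python
import numpy as np

def solve_no_pivoting(A, b):
    """Solve A x = b by Gaussian elimination with no row exchanges."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = A.shape[0]
    for j in range(n):
        if A[j, j] == 0.0:
            raise ZeroDivisionError("zero pivot encountered")
        # Eliminate the entries below the pivot in column j.
        factors = A[j + 1:, j] / A[j, j]
        A[j + 1:, j:] -= np.outer(factors, A[j, j:])
        b[j + 1:] -= factors * b[j]
    # Back substitution on the resulting upper triangular system.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

rng = np.random.default_rng(1)
n = 6
A = np.roll(np.eye(n), 1, axis=0)  # cyclic shift: nonsingular, but A[0, 0] == 0
b = np.arange(1.0, n + 1.0)

# Elimination with no pivoting breaks down on A itself.
try:
    solve_no_pivoting(A, b)
    plain_failed = False
except ZeroDivisionError:
    plain_failed = True

# Assumed preprocessing: a Gaussian multiplier G gives an equivalent system
# (G A) x = G b whose leading blocks are nonsingular with probability 1.
G = rng.standard_normal((n, n))
x = solve_no_pivoting(G @ A, G @ b)
residual = np.linalg.norm(A @ x - b)

print("plain elimination failed:", plain_failed)
print("residual after Gaussian preprocessing:", residual)
```

A dense Gaussian multiplier costs a full matrix product, which is presumably why the abstract stresses that sparse and structured random matrices worked just as well in the authors' tests.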
