Uh, an embarrasingly parallel way of sketching very large matrices in the streaming model, a second time,... wow!
Frequent Directions : Simple and Deterministic Matrix Sketching by Mina Ghashami, Edo Liberty, Jeff M. Phillips, David P. Woodruff
We describe a new algorithm called Frequent Directions for deterministic matrix sketching in the row-updates model. The algorithm is presented an arbitrary input matrix A∈Rn×d one row at a time. It performed O(d×ℓ) operations per row and maintains a sketch matrix B∈Rℓ×d such that for any k<ℓ
∥ATA−BTB∥2≤∥A−Ak∥2F/(ℓ−k) and ∥A−πBk(A)∥2F≤(1+kℓ−k)∥A−Ak∥2F .
Here, Ak stands for the minimizer of ∥A−Ak∥F over all rank k matrices (similarly Bk) and πBk(A) is the rank k matrix resulting from projecting A on the row span of Bk. We show both of these bounds are the best possible for the space allowed. The summary is mergeable, and hence trivially parallelizable. Moreover, Frequent Directions outperforms exemplar implementations of existing streaming algorithms in the space-error tradeoff.
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.