Uh, an embarrasingly parallel way of sketching very large matrices in the streaming model, a second time,... wow!
Frequent Directions : Simple and Deterministic Matrix Sketching by Mina Ghashami, Edo Liberty, Jeff M. Phillips, David P. Woodruff
We describe a new algorithm called Frequent Directions for deterministic matrix sketching in the row-updates model. The algorithm is presented an arbitrary input matrix A∈Rn×d one row at a time. It performed O(d×ℓ) operations per row and maintains a sketch matrix B∈Rℓ×d such that for any k<ℓ
∥ATA−BTB∥2≤∥A−Ak∥2F/(ℓ−k) and ∥A−πBk(A)∥2F≤(1+kℓ−k)∥A−Ak∥2F .
Here, Ak stands for the minimizer of ∥A−Ak∥F over all rank k matrices (similarly Bk) and πBk(A) is the rank k matrix resulting from projecting A on the row span of Bk. We show both of these bounds are the best possible for the space allowed. The summary is mergeable, and hence trivially parallelizable. Moreover, Frequent Directions outperforms exemplar implementations of existing streaming algorithms in the space-error tradeoff.
No comments:
Post a Comment