Tuesday, September 26, 2017

Lazy stochastic principal component analysis / Near Optimal Sketching of Low-Rank Tensor Regression

Random projections at work today for evaluating PCA and Tensor Regression !

Stochastic principal component analysis (SPCA) has become a popular dimensionality reduction strategy for large, high-dimensional datasets. We derive a simplified algorithm, called Lazy SPCA, which has reduced computational complexity and is better suited for large-scale distributed computation. We prove that SPCA and Lazy SPCA find the same approximations to the principal subspace, and that the pairwise distances between samples in the lower-dimensional space is invariant to whether SPCA is executed lazily or not. Empirical studies find downstream predictive performance to be identical for both methods, and superior to random projections, across a range of predictive models (linear regression, logistic lasso, and random forests). In our largest experiment with 4.6 million samples, Lazy SPCA reduced 43.7 hours of computation to 9.9 hours. Overall, Lazy SPCA relies exclusively on matrix multiplications, besides an operation on a small square matrix whose size depends only on the target dimensionality.

Near Optimal Sketching of Low-Rank Tensor Regression by Jarvis Haupt, Xingguo Li, David P. Woodruff
We study the least squares regression problem
where SD,R is the set of Θ for which Θ=Rr=1θ(r)1θ(r)D for vectors θ(r)dRpd for all r[R] and d[D], and  denotes the outer product of vectors. That is, Θ is a low-dimensional, low-rank tensor. This is motivated by the fact that the number of parameters in Θ is only RDd=1pd, which is significantly smaller than the Dd=1pd number of parameters in ordinary least squares regression. We consider the above CP decomposition model of tensors Θ, as well as the Tucker decomposition. For both models we show how to apply data dimensionality reduction techniques based on {\it sparse} random projections ΦRm×n, with mn, to reduce the problem to a much smaller problem minΘΦAΘΦb2, for which if Θ is a near-optimum to the smaller problem, then it is also a near optimum to the original problem. We obtain significantly smaller dimension and sparsity in Φ than is possible for ordinary least squares regression, and we also provide a number of numerical simulations supporting our theory.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

No comments: