Following up on Tuesday's related development (53 pages of good stuff), here is some more on the use of randomized projections for preconditioning. Proposition 13 is interesting:

Weighted SGD for ℓp Regression with Randomized Preconditioning by Jiyan Yang, Yin-Lam Chow, Christopher Ré, Michael W. Mahoney
In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. SGD methods are easy to implement and applicable to a wide range of convex optimization problems. In contrast, RLA algorithms provide much stronger performance guarantees but are applicable to a narrower class of problems. In this paper, we aim to bridge the gap between these two methods in solving overdetermined linear regression problems---e.g., ℓ2 and ℓ1 regression problems. We propose a hybrid algorithm that uses RLA techniques for preconditioning and constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system. We prove that this algorithm inherits faster convergence rates that only depend on the lower dimension of the linear system, while maintaining low computation complexity. The effectiveness of such algorithms is illustrated numerically, and the results are consistent with our theoretical findings. Finally, we also provide lower bounds on the coreset complexity for more general regression problems, indicating that still new ideas will be needed to extend similar RLA preconditioning ideas to weighted SGD algorithms for more general regression problems.
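To make the recipe in the abstract concrete for the ℓ2 case, here is a minimal sketch and not the authors' implementation. It assumes a dense Gaussian sketch for the preconditioner, sampling probabilities proportional to the squared row norms of the preconditioned matrix, and a constant step size of 1/‖U‖_F² (with that choice the update coincides with randomized Kaczmarz). The function name, the `sketch_rows` parameter and the iteration count are hypothetical choices made for illustration; the paper's own sketches, step sizes and constants are not reproduced here.

```python
import numpy as np

def preconditioned_weighted_sgd(A, b, sketch_rows=None, n_iters=5000, seed=0):
    """Illustrative l2 regression solver: sketch, precondition, weighted SGD."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    s = sketch_rows if sketch_rows is not None else 4 * d

    # 1) RLA step: a Gaussian sketch of A, then QR of the small s x d matrix.
    S = rng.standard_normal((s, n)) / np.sqrt(s)
    _, R = np.linalg.qr(S @ A)

    # 2) Preconditioned system U = A R^{-1}, which should be well-conditioned.
    U = np.linalg.solve(R.T, A.T).T

    # 3) Importance sampling distribution from the squared row norms of U.
    row_norms_sq = np.sum(U ** 2, axis=1)
    frob_sq = row_norms_sq.sum()
    p = row_norms_sq / frob_sq

    # 4) SGD on 0.5 * ||U y - b||^2 with weighted sampling.
    #    The stochastic gradient (u_i.y - b_i) u_i / p_i is unbiased;
    #    the constant step 1 / ||U||_F^2 is an assumption of this sketch.
    step = 1.0 / frob_sq
    y = np.zeros(d)
    for i in rng.choice(n, size=n_iters, p=p):
        residual_i = U[i] @ y - b[i]
        y -= step * (residual_i / p[i]) * U[i]

    # 5) Map back to the original variables: x = R^{-1} y.
    return np.linalg.solve(R, y)
```

On a consistent, badly scaled system (for instance a 500 x 5 Gaussian matrix whose columns are scaled by factors from 1 to 1000), this sketch recovers the true coefficients to high accuracy in a few thousand iterations, since the number of iterations needed is governed by the conditioning of U rather than of A. On noisy data, a constant step only reaches a neighborhood of the least-squares solution, so a decaying step size or iterate averaging would be needed.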