Ever wondered what Big Data LASSO performance you can get on Amazon EC2 with an ADMM algorithm? Zhimin Peng, Ming Yan and Wotao Yin have an answer that fits squarely in the "faster than a blink of an eye" category.
Abstract—This paper proposes parallel and distributed algorithms for solving very large-scale sparse optimization problems on computer clusters and clouds. Modern datasets usually have a large number of features or training samples, and they are usually stored in a distributed manner. Motivated by the need of solving sparse optimization problems with large datasets, we propose two approaches including (i) distributed implementations of prox-linear algorithms and (ii) GRock, a parallel greedy coordinate-block descent method. Different separability properties of the objective terms in the problem enable different data-distributed schemes along with their corresponding algorithm implementations. We also establish the convergence of GRock and explain why it often performs exceptionally well for sparse optimization. Numerical results on a computer cluster and Amazon EC2 demonstrate the efficiency and elasticity of our algorithms.
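To make the first ingredient concrete, here is a minimal single-machine sketch of a prox-linear (proximal-gradient) iteration for the LASSO problem min_x 0.5||Ax - b||^2 + lam||x||_1: a gradient step on the smooth least-squares term followed by soft-thresholding. The paper's contribution is the data-distributed implementation of such iterations; the function names, step size, and regularization level below are illustrative assumptions, not taken from the paper or its code.

```python
# A minimal, single-machine sketch (assumption: Python/NumPy, not the authors' code)
# of prox-linear / proximal-gradient iterations for the LASSO problem
#     min_x  0.5 * ||A x - b||^2 + lam * ||x||_1,
# i.e. a gradient step on the smooth least-squares term followed by the l1
# proximal step (soft-thresholding). The paper distributes these iterations
# over data stored across machines, which is not shown here.
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (component-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_linear_lasso(A, b, lam, n_iters=500):
    """ISTA-style prox-linear iterations with a fixed 1/L step size."""
    m, n = A.shape
    x = np.zeros(n)
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    t = 1.0 / L
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)           # gradient of 0.5 * ||A x - b||^2
        x = soft_threshold(x - t * grad, t * lam)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 1000))
    x_true = np.zeros(1000)
    x_true[:10] = rng.standard_normal(10)
    b = A @ x_true + 0.01 * rng.standard_normal(200)
    lam = 0.1 * np.max(np.abs(A.T @ b))    # illustrative regularization level
    x_hat = prox_linear_lasso(A, b, lam)
    print("nonzero entries in the solution:", np.count_nonzero(np.abs(x_hat) > 1e-3))
```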
Parallel and Distributed Sparse Optimization
Background
Modern datasets usually have a large number of features or training samples, and they are usually stored in a distributed manner. Motivated by the need of solving sparse optimization problems with large datasets, we propose two approaches including (i) distributed implementations of prox-linear algorithms and (ii) GRock, a parallel greedy coordinate descent method.
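GRock is described as a parallel greedy coordinate(-block) descent method. The serial sketch below illustrates the greedy idea it builds on, for the same LASSO objective: score the best single-coordinate update for every coordinate, then apply only the most profitable one. GRock's block selection, its parallel update of several blocks at once, and the convergence analysis mentioned in the abstract are not reproduced here; the selection rule and names below are illustrative assumptions.

```python
# A serial sketch (illustrative assumption, not the GRock algorithm itself) of
# greedy coordinate descent on the LASSO objective: compute the best
# single-coordinate update for every coordinate, then apply only the one with
# the largest potential change. GRock extends this greedy rule to coordinate
# blocks and updates several greedily chosen blocks in parallel across machines.
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def greedy_cd_lasso(A, b, lam, n_iters=2000, tol=1e-10):
    m, n = A.shape
    col_norm2 = np.sum(A * A, axis=0)       # ||a_j||^2 for every column a_j
    x = np.zeros(n)
    r = b.copy()                            # residual b - A x
    for _ in range(n_iters):
        # Exact one-dimensional minimizer for each coordinate, all others fixed.
        x_best = soft_threshold(x + (A.T @ r) / col_norm2, lam / col_norm2)
        d = x_best - x
        j = int(np.argmax(np.abs(d)))       # greedy: largest potential change
        if abs(d[j]) < tol:
            break
        r -= d[j] * A[:, j]                 # maintain the residual incrementally
        x[j] = x_best[j]
    return x
```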
Liked this entry? Subscribe to Nuit Blanche's feed; there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle, and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on LinkedIn.