Ever wondered what Big Data LASSO performance you can get on EC2 with an ADMM-style algorithm? Zhimin Peng, Ming Yan and Wotao Yin have an answer that falls squarely in the "faster than a blink of an eye" category.
Abstract—This paper proposes parallel and distributed algorithms for solving very large-scale sparse optimization problems on computer clusters and clouds. Modern datasets usually have a large number of features or training samples, and they are usually stored in a distributed manner. Motivated by the need of solving sparse optimization problems with large datasets, we propose two approaches including (i) distributed implementations of prox-linear algorithms and (ii) GRock, a parallel greedy coordinate-block descent method. Different separability properties of the objective terms in the problem enable different data-distributed schemes along with their corresponding algorithm implementations. We also establish the convergence of GRock and explain why it often performs exceptionally well for sparse optimization. Numerical results on a computer cluster and Amazon EC2 demonstrate the efficiency and elasticity of our algorithms.
Parallel and Distributed Sparse Optimization
Background
Modern datasets usually have a large number of features or training samples, and they are usually stored in a distributed manner. Motivated by the need to solve sparse optimization problems with large datasets, the authors propose two approaches: (i) distributed implementations of prox-linear algorithms and (ii) GRock, a parallel greedy coordinate descent method.
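To get a feel for the greedy coordinate selection idea behind GRock, here is a minimal single-machine sketch in NumPy for the LASSO problem min_x 0.5||Ax - b||^2 + lam||x||_1. This is my own illustration, not the authors' distributed code: the names greedy_cd_lasso and soft_threshold, the parameter P (number of coordinates updated per round), and the serial loop standing in for the cross-node parallel selection are all assumptions made for the example. The soft-thresholding step is also the basic ingredient of the prox-linear approach mentioned above, since it is the proximal operator of the l1 norm.

```python
import numpy as np

def soft_threshold(v, t):
    """Soft-thresholding, the proximal operator of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def greedy_cd_lasso(A, b, lam, n_iters=200, P=4):
    """Toy greedy coordinate descent for min_x 0.5*||Ax-b||^2 + lam*||x||_1.

    Each round, every coordinate computes its 1-D optimal step from the
    current residual, and only the P coordinates with the largest proposed
    change are actually updated (a serial stand-in for the parallel,
    per-node selection a GRock-style method performs).
    Assumes the columns of A are nonzero.
    """
    m, n = A.shape
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)      # ||A_j||^2 for each column
    r = b - A @ x                      # residual, kept consistent with x
    for _ in range(n_iters):
        grad = A.T @ r                 # A^T (b - A x)
        # 1-D exact minimizer for each coordinate, holding the others fixed
        x_new = soft_threshold(col_sq * x + grad, lam) / col_sq
        d = x_new - x
        # greedy selection: update only the P largest proposed steps
        idx = np.argsort(-np.abs(d))[:P]
        for j in idx:
            r -= A[:, j] * d[j]
            x[j] = x_new[j]
    return x
```

The greedy rule spends each round's work on the coordinates that matter most, which is why this style of method tends to do well when the solution is sparse; the paper's analysis addresses how many coordinates can safely be updated in parallel, since making P too large relative to the correlation structure of A can break convergence.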
Links