Friday, October 14, 2016

MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization - implementation -

Ramki just sent me the following:
Dear Igor,

We have recently open-sourced our distributed NMF library, which implements different NMF algorithms such as Multiplicative Update (MU), Hierarchical Alternating Least Squares (HALS), and active-set-based ANLS/BPP on MPI and OpenMP. Recent benchmarks of these algorithms on Oak Ridge Leadership Computing Facility (OLCF) supercomputers are presented in the paper. Kindly look at the paper for the topic modeling results on all 12 million Stack Overflow questions and for graph clustering experiments on the biggest sparse matrix, of size 118M, from the Florida Sparse Matrix Collection.

Could you please post this information in your communities: the Matrix Factorization Jungle, the compressed sensing Google+ group, and other interested communities? Let us know if you are looking for more information.

Thanks Ramki! Here is a presentation and the preprint Ramki mentions: MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization by Ramakrishnan Kannan, Grey Ballard, and Haesun Park
Non-negative matrix factorization (NMF) is the problem of determining two non-negative low-rank factors W and H, for the given input matrix A, such that A ≈ WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets.
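To make the problem statement concrete, here is a minimal single-machine sketch of the Multiplicative Update rule, one of the algorithms named in Ramki's email. It is not taken from the MPI-FAUN code base: the function names, the random initialization, the small `eps` guard in the denominators, and the toy matrix are all assumptions made for illustration, and it uses plain Python lists where a real implementation would rely on optimized linear algebra.

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    Y_cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in Y_cols] for row in X]


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def frobenius_error(A, W, H):
    """Frobenius norm of A - WH."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5


def nmf_multiplicative_update(A, k, iterations=200, seed=0, eps=1e-12):
    """Approximate a non-negative m x n matrix A by W (m x k) times H (k x n)."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Strictly positive random start (an illustrative choice, not a recommendation).
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T A) / (W^T W H), elementwise
        Wt = transpose(W)
        num = matmul(Wt, A)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (A H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num = matmul(A, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


if __name__ == "__main__":
    # Toy non-negative matrix of rank 2 (hypothetical data).
    A = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [3.0, 2.0, 5.0]]
    W, H = nmf_multiplicative_update(A, k=2)
    print("residual:", frobenius_error(A, W, H))
```

Because every update multiplies a non-negative entry by a ratio of non-negative quantities, W and H stay non-negative throughout without any explicit projection step.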
The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms that iteratively solve alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices whose sizes range from a few hundred million to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.
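As a rough illustration of why such updates parallelize, the sketch below simulates, in a single process, how the two matrix products needed to update W (namely H Hᵀ and A Hᵀ) can be assembled from column blocks held by different processors. This is an assumption-laden toy and not the MPI-FAUN implementation: the 1D column partition, the helper names, and the `simulated_allreduce` function (a plain Python sum standing in for an MPI all-reduce) are all hypothetical, and the communication-optimal data layout analyzed in the paper is not reproduced here.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    Y_cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in Y_cols] for row in X]


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def add(X, Y):
    """Elementwise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def column_blocks(X, p):
    """Split the columns of X into p contiguous blocks (requires p <= number of columns)."""
    n = len(X[0])
    bounds = [(i * n) // p for i in range(p + 1)]
    return [[row[bounds[i]:bounds[i + 1]] for row in X] for i in range(p)]


def simulated_allreduce(partials):
    """Sum the per-processor partial results; stands in for an MPI all-reduce."""
    total = partials[0]
    for part in partials[1:]:
        total = add(total, part)
    return total


def distributed_products(A, H, p):
    """Compute H H^T and A H^T from p column blocks of A (m x n) and H (k x n)."""
    A_blocks = column_blocks(A, p)
    H_blocks = column_blocks(H, p)
    # Each simulated processor i only touches its own blocks A_i and H_i.
    gram_parts = [matmul(Hi, transpose(Hi)) for Hi in H_blocks]
    cross_parts = [matmul(Ai, transpose(Hi)) for Ai, Hi in zip(A_blocks, H_blocks)]
    # Summing the local contributions recovers the global products.
    return simulated_allreduce(gram_parts), simulated_allreduce(cross_parts)


if __name__ == "__main__":
    A = [[1, 2, 3, 4], [0, 1, 0, 2], [5, 0, 1, 1]]  # hypothetical 3 x 4 data
    H = [[1, 0, 2, 1], [3, 1, 0, 2]]                # hypothetical 2 x 4 factor
    HHt, AHt = distributed_products(A, H, p=2)
    print(HHt == matmul(H, transpose(H)), AHt == matmul(A, transpose(H)))
```

The identity behind the sketch is that H Hᵀ and A Hᵀ are sums over column blocks, so each processor can work on its local data and only the small k-dimensional partial results need to be combined. The symmetric products Wᵀ W and Wᵀ A for the H update follow the same pattern with row blocks.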
