Nuit Blanche: Matrix Factorization with Binary Components

Tuesday, January 28, 2014

Matrix Factorization with Binary Components

So the Arora et al paper has really provoked a tsunami in the NMF world probably because like Suresh says, "One can think of the NMF as an l_1 variant of the SVD”. Another attractiveness of NMF was its easy, albeit very slow implementation. Let us say that currently we don't much implementation made available for the new crop of solvers. It's just a question of time I am sure ....

Matrix factorization with Binary Components by Martin Slawski, Matthias Hein, Pavlo Lutsik

Motivated by an application in computational biology, we consider low-rank matrix factorization with {0,1}-constraints on one of the factors and optionally convex constraints on the second one. In addition to the non-convexity shared with other matrix factorization schemes, our problem is further complicated by a combinatorial constraint set of size 2m⋅r, where mis the dimension of the data points and r the rank of the factorization. Despite apparent intractability, we provide - in the line of recent work on non-negative matrix factorization by Arora et al. (2012) - an algorithm that provably recovers the underlying factorization in the exact case with O(mr2r+mnr+r2n) operations for n datapoints. To obtain this result, we use theory around the Littlewood-Offord lemma from combinatorics.

New developments in nonnegative matrix factorization by Suresh Venkatasubramanian

Computing a Nonnegative Matrix Factorization -- Provably by Sanjeev Arora, Rong Ge, Ravi Kannan, Ankur Moitra

In the Nonnegative Matrix Factorization (NMF) problem we are given an n×m nonnegative matrix M and an integer r superior to 0. Our goal is to expressM as AW where A and W are nonnegative matrices of size n×r and r×m respectively. In some applications, it makes sense to ask instead for the product AW to approximate M -- i.e. (approximately) minimize$\norm{M - AW}_F$ where $\norm{}_F$ denotes the Frobenius norm; we refer to this as Approximate NMF. This problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF has become enormously popular in machine learning, where A and W are computed using a variety of local search heuristics. Vavasis proved that this problem is NP-complete. We initiate a study of when this problem is solvable in polynomial time:

1. We give a polynomial-time algorithm for exact and approximate NMF for every constant r. Indeed NMF is most interesting in applications precisely when r is small.

2. We complement this with a hardness result, that if exact NMF can be solved in time (nm)o(r), 3-SAT has a sub-exponential time algorithm. This rules out substantial improvements to the above algorithm.

3. We give an algorithm that runs in time polynomial in n, m and r under the separablity condition identified by Donoho and Stodden in 2003. The algorithm may be practical since it is simple and noise tolerant (under benign assumptions). Separability is believed to hold in many practical settings.

To the best of our knowledge, this last result is the first example of a polynomial-time algorithm that provably works under a non-trivial condition on the input and we believe that this will be an interesting and important direction for future work.