Nuit Blanche: Matrix Factorization This Week.

Friday, September 02, 2011

Matrix Factorization This Week.

During the Google Summer of Code 2011, two students integrated some matrix factorization algorithms as packages:

Marinka Zitnik implemented several matrix factorization techniques within the MF package of Orange
Vlad Niculae implemented also some Dictionary learning algorithms within scikits.learn,

Maybe, there ought to be a similar push next year for the more advanced matrix factorization algorithms. Python and R would seem to be good target languages.

Without further due, here are few papers, poster and codes that showed up on radar screen with some surprises (I'll talk about them in a different entry).

Solving Principal Component Pursuit in Linear Time via $l_1$ Filtering by Risheng Liu, Zhouchen Lin, Siming Wei, Zhixun Su. The abstract reads:

Recently, recovering a low rank matrix from sparsely corrupted data, which is known as Robust Principal Component Analysis (RPCA), has attracted tremendous interest and also found many applications in computer vision society. The solution to RPCA can be exactly found by Principal Component Pursuit (PCP), i.e., minimizing a combination of nuclear norm and $l_1$ norm. The existing methods for solving PCP all require singular value decompositions (SVD) of the data matrix, resulting in a high computational complexity, hence preventing the applications of RPCA to very large scale problems. In this paper, we propose a novel algorithm, called $l_1$ filtering, for \emph{exactly} solving PCP with an $O(r^2(m+n))$ complexity, where $m\times n$ is the size of data matrix and $r$ is the rank of the matrix to recover, which is supposed to be much smaller than $m$ and $n$. Moreover, $l_1$ filtering is \emph{highly parallelizable}. It is the first algorithm that can \emph{exactly} solve a nuclear norm minimization problem in \emph{linear time} (with respect to the data size). With $l_1$ filtering, applications of RPCA to very large scale problems becomes possible. Numerical experiments testify to the great advantage of $l_1$ filtering in speed over state of the art algorithms for solving PCP.

Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization by Nicolas Gillis, François Glineur. The abstract reads:

Nonnegative matrix factorization (NMF) is a data analysis technique used in a great variety of applications such as text mining, image processing, hyperspectral data analysis, computational biology, and clustering. In this paper, we consider two well-known algorithms designed to solve NMF problems, namely the multiplicative updates of Lee and Seung and the hierarchical alternating least squares of Cichocki et al. We propose a simple way to significantly accelerate their convergence, based on a careful analysis of the computational cost needed at each iteration. This acceleration technique can also be applied to other algorithms, which we illustrate on the projected gradient method of Lin. The efficiency of the accelerated algorithms is empirically demonstrated on image and text datasets, and compares favorably with a state-of-the-art alternating nonnegative least squares algorithm. Finally, we provide a theoretical argument based on the properties of NMF and its solutions that explains in particular the very good performance of HALS and its accelerated version observed in our numerical experiments.

The code to reproduce the data of this paper can be found here and has been added to the Matrix Factorization Jungle.

Two proposals for robust PCA using semidefinite programming by Michael McCoy and Joel Tropp. The abstract reads:

The performance of principal component analysis suffers badly in the presence of outliers. This paper proposes two novel approaches for robust principal component analysis based on semidenite programming. The rst method, maximum mean absolute deviation rounding, seeks directions of large spread in the data while damping the eect of outliers. The second method produces a low-leverage decomposition of the data that attempts to form a low-rank model for the data by separating out corrupted observations. This paper also presents ecient computational methods for solving these semidenite programs. Numerical experiments conrm the value of these new techniques

The code for these proposals for Robust PCA are here and have been added to the Matrix Factorization Jungle. The slides are here. In the slides there are the names of Gilles Pisier and Grothendieck showing up. This is interesting.

Improved analysis of the subsampled randomized Hadamard transform by Joel Tropp. The abstract reads:

This paper presents an improved analysis of a structured dimension-reduction map called the subsampled randomized Hadamard transform. This argument demonstrates that the map preserves the Euclidean geometry of an entire subspace of vectors. The new proof is much simpler than previous approaches, and it oers|for the first time|optimal constants in the estimate on the number of dimensions required for the embedding.

Large-scale PCA with sparsity constraints by Clement J. Probel and Joel Tropp. The abstract reads:

This paper describes a new thresholding technique for constructing sparse principal components. Large-scale implementation issues are addressed, and a mathematical analysis describes situations where the algorithm is effective. In experiments, this method compares favorably with more sophisticated algorithms.

Learning Sparse Codes for Hyperspectral Images by Adam Charles, Bruno Olshausen, Christopher J. Rozell

Fast and Simple Iterative Algorithm of Lp-norm Minimization for Under-determined Speech Separation by Yasuharu Hirasawa, Naoki Yasuraoka, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno. The abstract reads:

This paper presents an efﬁcient algorithm to solve Lp-norm minimization problem for under-determined speech separation; that is, for the case that there are more sound sources than microphones. We employ an auxiliary function method in order to derive update rules under the assumption that the amplitude of each sound source follows generalized Gaussian distribution. Experiments reveal that our method solves the L1-norm minimization problem ten times faster than a general solver, and also solves Lp-norm minimization problem efﬁciently, especially when the parameter p is small; when p is not more than 0.7, it runs in real-time without loss of separation quality