True BLAS-3 Performance QRCP using Random Sampling by Jed A. Duersch, Ming Gu
The dominant contribution to communication complexity in factorizing a matrix using QR with column pivoting is due to the column-norm updates that are required to process pivot decisions. We use randomized sampling to approximate this process, which dramatically reduces communication in column selection. We also introduce a sample update formula to reduce the cost of sampling trailing matrices. Using our column selection mechanism, we observe results that are comparable to those obtained from the QRCP algorithm, but with performance near that of unpivoted QR. We also demonstrate strong parallel scalability on shared-memory multicore systems using an implementation in Fortran with OpenMP.
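To make the idea of sample-based column selection concrete, here is a minimal NumPy sketch. It is not the authors' Fortran/OpenMP implementation, and it does not use their sample update formula. The function name `select_pivots_from_sample`, the `oversample` parameter, and the choice of a Gaussian sampling matrix are assumptions made for illustration. The sketch compresses the rows of `A` into a small sample and makes the greedy largest-norm pivot decisions on that sample instead of on the full matrix.

```python
import numpy as np

def select_pivots_from_sample(A, k, oversample=8, seed=0):
    """Pick k column indices of A using a small random sample of its rows.

    Hypothetical helper for illustration: pivot decisions are made on the
    compressed sample B = Omega @ A rather than on A itself.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    ell = min(m, k + oversample)           # number of sample rows (assumed padding)
    Omega = rng.standard_normal((ell, m))  # Gaussian sampling matrix (assumption)
    B = Omega @ A                          # ell x n sample, much shorter than A

    chosen = np.zeros(n, dtype=bool)
    pivots = []
    for _ in range(k):
        # Squared column norms of the current (deflated) sample.
        norms = np.sum(B * B, axis=0)
        norms[chosen] = -1.0               # never pick the same column twice
        j = int(np.argmax(norms))
        col_norm = np.sqrt(norms[j])
        if col_norm == 0.0:
            break                          # sample is numerically exhausted
        pivots.append(j)
        chosen[j] = True
        # Remove the chosen direction from every sample column.
        q = B[:, j] / col_norm
        B -= np.outer(q, q @ B)
    return np.array(pivots, dtype=int)
```

In this sketch all pivoting work touches only the `ell x n` sample, so the full matrix is read once to form `B`. A blocked QR factorization of the selected columns can then proceed without per-column norm updates on `A`.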
This work immediately extends to produce low-rank truncated approximations of large matrices. We propose a truncated QR factorization with column pivoting that avoids the trailing matrix updates used in current implementations of BLAS-3 QR and QRCP. Provided the truncation rank is small, avoiding trailing matrix updates reduces approximation time by nearly half. By using these techniques and employing a variation on Stewart's QLP algorithm, we develop an approximate truncated SVD that runs nearly as fast as truncated QR.
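The step from a set of selected columns to an approximate truncated SVD can be sketched as follows. This is one plausible reading of a QLP-style construction, not the algorithm from the paper: the function name `truncated_svd_from_pivots` is hypothetical, and the sketch forms the projection `Q.T @ A` explicitly for clarity rather than reproducing the authors' treatment of trailing matrix updates.

```python
import numpy as np

def truncated_svd_from_pivots(A, pivots):
    """Rank-k approximate SVD of A built from k selected columns.

    Hypothetical helper for illustration. Returns U (m x k), s (k,), Vt (k x n)
    such that U @ np.diag(s) @ Vt equals the projection of A onto the span of
    the selected columns.
    """
    # Orthonormal basis for the span of the selected columns.
    Q, _ = np.linalg.qr(A[:, pivots])      # Q is m x k
    # Coefficients of A in that basis, so A is approximated by Q @ R.
    R = Q.T @ A                            # k x n
    # QLP-style second factorization: R.T = Q1 @ Lt, hence R = Lt.T @ Q1.T
    # with Lt.T lower triangular.
    Q1, Lt = np.linalg.qr(R.T)             # Q1 is n x k, Lt is k x k
    # A dense SVD of the small k x k factor finishes the job.
    Ul, s, Vlt = np.linalg.svd(Lt.T)
    U = Q @ Ul
    Vt = Vlt @ Q1.T
    return U, s, Vt
```

When `A` has exact rank `k` and the selected columns are linearly independent, the product `U @ np.diag(s) @ Vt` reproduces `A` up to rounding. For general matrices the quality of the approximation depends on how well the selected columns capture the dominant column space.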