A Randomized Algorithm for CCA by Paul Mineiro, Nikos Karampatziakis
We present RandomizedCCA, a randomized algorithm for computing canonical analysis, suitable for large datasets stored either out of core or on a distributed file system. Accurate results can be obtained in as few as two data passes, which is relevant for distributed processing frameworks in which iteration is expensive (e.g., Hadoop). The strategy also provides an excellent initializer for standard iterative solutions.
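To make the idea concrete, here is a minimal NumPy sketch of a randomized CCA in the spirit of the abstract. It is not the authors' reference implementation. The function name `randomized_cca`, the parameters `oversample`, `n_power_iter` and `reg`, and the choice of a randomized range finder followed by a small Cholesky-whitened SVD are all assumptions made for illustration. The data are held in memory here; in an out-of-core setting each product involving `X` and `Y` would be accumulated over chunks, so one power iteration plus the final step would correspond to roughly two passes.

```python
import numpy as np


def randomized_cca(X, Y, k, oversample=10, n_power_iter=1, reg=1e-6, seed=0):
    """Approximate the top-k canonical directions of X (n x dx) and Y (n x dy).

    Hypothetical illustration: all names and defaults are assumptions.
    Returns (Wx, Wy, corr), where X @ Wx and Y @ Wy are the canonical
    variates and corr holds the estimated canonical correlations.
    """
    rng = np.random.default_rng(seed)
    # Center in memory for simplicity; a streaming version would track means.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    dx, dy = X.shape[1], Y.shape[1]
    r = min(k + oversample, dx, dy)  # size of the random test subspaces

    # Random orthonormal starting bases for each view.
    Qx = np.linalg.qr(rng.standard_normal((dx, r)))[0]
    Qy = np.linalg.qr(rng.standard_normal((dy, r)))[0]

    # Range finder: each iteration multiplies by the cross-covariance once.
    for _ in range(n_power_iter):
        Zx = X.T @ (Y @ Qy)  # approximates range of X^T Y
        Zy = Y.T @ (X @ Qx)  # approximates range of Y^T X
        Qx = np.linalg.qr(Zx)[0]
        Qy = np.linalg.qr(Zy)[0]

    # Final step: solve a small r x r CCA problem inside the subspaces.
    Px, Py = X @ Qx, Y @ Qy
    Cx = Px.T @ Px + reg * np.eye(r)  # small ridge keeps Cholesky stable
    Cy = Py.T @ Py + reg * np.eye(r)
    F = Px.T @ Py
    Lx = np.linalg.cholesky(Cx)  # Cx = Lx @ Lx.T
    Ly = np.linalg.cholesky(Cy)
    # M = Lx^{-1} F Ly^{-T}, the whitened cross-covariance.
    M = np.linalg.solve(Ly, np.linalg.solve(Lx, F).T).T
    U, s, Vt = np.linalg.svd(M)

    # Map the small solution back to the original feature spaces.
    Wx = Qx @ np.linalg.solve(Lx.T, U[:, :k])
    Wy = Qy @ np.linalg.solve(Ly.T, Vt[:k].T)
    return Wx, Wy, s[:k]
```

A call such as `Wx, Wy, corr = randomized_cca(X, Y, k=2)` returns projection matrices for both views. As the abstract suggests, the result could also be used as a starting point for a standard iterative CCA solver.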
h/t Olivier Grisel