Kernel-based K-means clustering has gained popularity due to its simplicity and the power of its implicit non-linear representation of the data. A dominant concern is the memory requirement since memory scales as the square of the number of data points. We provide a new analysis of a class of approximate kernel methods that have more modest memory requirements, and propose a specific one-pass randomized kernel approximation followed by standard K-means on the transformed data. The analysis and experiments suggest the method is accurate, while requiring drastically less memory than standard kernel K-means and significantly less memory than Nystrom based approximations.
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.