Monday, April 16, 2012

Matrix Factorization in the News

In a tweet, Tim Davies wonders whether our discovery that a simple webcam and Robust PCA can act as some sort of radiation detector and multichannel analyzer would make a good app. The answer is that CCD and CMOS sensors are a very inefficient way of detecting radiation. We use these videos as toy examples because they were taken in high-radiation areas. Unless you know exactly which chip recorded the scene, the lack of calibration stops you from figuring out exactly what you are seeing. But if you are looking for a cheap gamma spectrometer, you may want to invest $250 in this one.

In a different direction, Nelson Ray at MetaMarkets explains what Robust PCA is in: Algorithmic Trendspotting and the Meaning of "Interesting".
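
For readers who want the one-formula version before following that link, here is the standard convex formulation of Robust PCA (principal component pursuit). It is the textbook statement, not something taken from the linked post. The reading in terms of webcam video is an assumption made for illustration: each column of $M$ would be one vectorized frame, $L$ the nearly static scene, and $S$ the few isolated pixels lit up by radiation hits.

```latex
% M: n_1 x n_2 data matrix (assumed here: one vectorized video frame per column)
% L: low-rank component, S: sparse component
\begin{aligned}
\min_{L,\,S}\quad & \|L\|_{*} + \lambda \,\|S\|_{1} \\
\text{subject to}\quad & M = L + S
\end{aligned}
% \|L\|_* is the nuclear norm (sum of singular values of L),
% \|S\|_1 is the entrywise l1 norm (sum of absolute values of the entries of S).
% A commonly quoted default is \lambda = 1/\sqrt{\max(n_1, n_2)}.
```

The nuclear norm stands in for rank and the entrywise $\ell_1$ norm for sparsity, which is what makes the problem convex. Without calibration, the sparse part still tells you only where and when something hit the sensor, not what it was.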

Finally, we learn that Netflix never used its $1 million algorithm due to engineering costs. The attendant Netflix blog entry, Netflix Recommendations: Beyond the 5 stars (Part 1), features the following more accurate statement:

A year into the competition, the Korbell team won the first Progress Prize with an 8.43% improvement. They reported more than 2000 hours of work in order to come up with the final combination of 107 algorithms that gave them this prize. And, they gave us the source code. We looked at the two underlying algorithms with the best performance in the ensemble: Matrix Factorization (which the community generally called SVD, Singular Value Decomposition) and Restricted Boltzmann Machines (RBM). SVD by itself provided a 0.8914 RMSE, while RBM alone provided a competitive but slightly worse 0.8990 RMSE. A linear blend of these two reduced the error to 0.88. To put these algorithms to use, we had to work to overcome some limitations, for instance that they were built to handle 100 million ratings, instead of the more than 5 billion that we have, and that they were not built to adapt as members added more ratings. But once we overcame those challenges, we put the two algorithms into production, where they are still used as part of our recommendation engine. ... If you followed the Prize competition, you might be wondering what happened with the final Grand Prize ensemble that won the $1M two years later. This is a truly impressive compilation and culmination of years of work, blending hundreds of predictive models to finally cross the finish line. We evaluated some of the new methods offline but the additional accuracy gains that we measured did not seem to justify the engineering effort needed to bring them into a production environment. Also, our focus on improving Netflix personalization had shifted to the next level by then. In the remainder of this post we will explain how and why it has shifted.
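
To make the "linear blend" in the quote concrete, here is a minimal sketch of how blending two predictors can lower RMSE. Everything in it is an assumption for illustration: the ratings are synthetic, the two "models" are just the true rating plus independent Gaussian noise, and the helper names (`rmse`, `best_blend_weight`, `blend`) are hypothetical. It is not the Korbell or Netflix code and will not reproduce the 0.8914, 0.8990 or 0.88 figures.

```python
import math
import random


def rmse(preds, truth):
    """Root mean squared error between predictions and true ratings."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(truth))


def best_blend_weight(a, b, truth):
    """Least-squares weight w for the blend w*a + (1 - w)*b.

    Minimizing sum((b + w*(a - b) - t)^2) over w gives the closed form
    w = sum((a - b)*(t - b)) / sum((a - b)^2).
    """
    num = sum((x - y) * (t - y) for x, y, t in zip(a, b, truth))
    den = sum((x - y) ** 2 for x, y in zip(a, b))
    return num / den if den else 0.5  # identical predictors: any weight works


def blend(a, b, w):
    """Linear blend of two prediction lists with weight w on the first."""
    return [w * x + (1 - w) * y for x, y in zip(a, b)]


# Synthetic stand-ins (assumption): true ratings in [1, 5] and two noisy
# predictors whose errors are independent of each other.
random.seed(0)
truth = [random.uniform(1, 5) for _ in range(2000)]
model_a = [t + random.gauss(0, 0.89) for t in truth]
model_b = [t + random.gauss(0, 0.90) for t in truth]

w = best_blend_weight(model_a, model_b, truth)
blended = blend(model_a, model_b, w)

print("RMSE model A:", rmse(model_a, truth))
print("RMSE model B:", rmse(model_b, truth))
print("blend weight:", w)
print("RMSE blend  :", rmse(blended, truth))
```

The gain from a blend comes from the two models making different mistakes. Here the errors are fully independent, so the improvement is much larger than the one reported in the quote, where SVD and RBM errors are presumably strongly correlated. The weight is also fitted on the same data it is scored on; a real evaluation would fit it on a held-out set.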

