Friday, June 27, 2014

Impression Store: Compressive Sensing-based Storage for Big Data Analytics

Here is a related use of compressive sensing: simple storage for Ads. Overall, I am very surprised that none of the references include any of the sketching contributions to compressive sensing [1,2]. In retrospect, it was wise for us to invite Muthu and Sam at the last Paris Machine Learning Meetup [3] so that these techniques don't get re-discovered every time.

For many big data analytics workloads, approximate results suffice. This begs the question, whether and how the underlying system architecture can take advantage of such relaxations, thereby lifting constraints inherent in today’s architectures. This position paper explores one of the possible directions. Impression Store is a distributed storage system with the abstraction of big data vectors. It aggregates updates internally and responds to the retrieval of top-K high-value entries. With proper extension, Impression Store supports various aggregations, top-K queries, outlier and major mode detection. While restricted in scope, such queries represent a substantial and important portion of many production workloads. In return, the system has unparalleled scalability; any node in the system can process any query, both reads and updates. The key technique we leverage is compressive sensing, a technique that substantially reduces the amount of active memory state, IO, and traffic volume needed to achieve such scalability.
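To make the "any node can process any update" claim more concrete, here is a minimal Python sketch of a count sketch, one of the linear sketching techniques from the data-stream literature [1]. This is not the recovery scheme used in the paper, only an illustration of the property such systems rely on: the summary is linear in the underlying vector, so each node can absorb updates locally and summaries can simply be added together. The class name `CountSketch`, its parameters (`depth`, `width`, `seed`) and the toy keys and values are all assumptions made for this example.

```python
import random
from statistics import median

P = 2_147_483_647  # Mersenne prime 2^31 - 1, modulus for the hash functions


class CountSketch:
    """Linear summary of a large, mostly-zero vector (hypothetical helper)."""

    def __init__(self, depth=7, width=512, seed=0):
        # Every node must use the same seed so that all nodes share
        # the same hash functions; otherwise summaries cannot be merged.
        rng = random.Random(seed)
        self.depth, self.width = depth, width
        self.bucket_params = [(rng.randrange(1, P), rng.randrange(P))
                              for _ in range(depth)]
        self.sign_params = [(rng.randrange(1, P), rng.randrange(P))
                            for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row, key):
        a, b = self.bucket_params[row]
        return ((a * key + b) % P) % self.width

    def _sign(self, row, key):
        a, b = self.sign_params[row]
        return 1 if ((a * key + b) % P) % 2 else -1

    def update(self, key, delta):
        """Add delta to entry `key` of the underlying vector."""
        for r in range(self.depth):
            self.table[r][self._bucket(r, key)] += self._sign(r, key) * delta

    def merge(self, other):
        """Linearity: merging two nodes is entry-wise addition of tables."""
        for r in range(self.depth):
            for c in range(self.width):
                self.table[r][c] += other.table[r][c]

    def estimate(self, key):
        """Median over rows of the signed bucket value for `key`."""
        return median(self._sign(r, key) * self.table[r][self._bucket(r, key)]
                      for r in range(self.depth))

    def top_k(self, k, candidates):
        """Approximate top-k: rank candidate keys by estimated value."""
        return sorted(candidates, key=self.estimate, reverse=True)[:k]


# Two nodes receive updates independently, including for the same key.
node_a, node_b = CountSketch(), CountSketch()
node_a.update(17, 300)
node_b.update(17, 200)
node_a.update(42, 300)
node_b.update(99, 200)

# Either node can answer a query once the summaries are added together.
node_a.merge(node_b)
top = node_a.top_k(3, range(200))
```

The memory used is `depth * width` counters regardless of the vector's dimension, and the estimates are approximate: collisions between keys introduce errors that the median over rows is meant to suppress, so larger `width` and `depth` trade memory for accuracy.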

[1] Data Stream Algorithms Notes from a series of lectures by S. Muthu Muthukrishnan
