...Graham Cormode had a nice survey of data sketching algorithms, which he represented as a haiku that I unfortunately failed to copy down. But basically he's interested in methods that have some properties among: sublinear storage, randomized, linear, can be added together; they include methods such as Bloom filters, minhashing (for which he described a nice variation that estimates the number of distinct items in a set with good multiplicative error), and the count-min sketch (which gives an additive approximation to the frequency of each item in a data stream). Towards the end he listed situations in which these methods are actually used "in the field": he included web log analysis and compressed sensing, and explicitly excluded sensor networks because the theoretical applications of sketching in sensor networks as huge randomly placed networks with too much data to store do not match the scientific applications of at most a dozen or two carefully placed sensors from which one wants to collect all the raw data. He also listed quite a few potential applications, but I think he could have included a lot more real applications if he had included Bloom filters in his application section as he did in his historical survey....
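To make the count-min sketch mentioned in the quote a little more concrete, here is a minimal Python sketch of the idea. It is not code from Cormode's talk or from any particular library: the class name `CountMinSketch`, the `width` and `depth` defaults, and the use of salted BLAKE2b as the per-row hash are all assumptions chosen for illustration. It shows the two properties the quote highlights: the estimate errs only on the high side (an additive overcount caused by collisions), and two sketches of the same shape can be added together cell by cell.

```python
import hashlib


class CountMinSketch:
    """Illustrative count-min sketch; parameters and hashing are assumptions."""

    def __init__(self, width=272, depth=5):
        self.width = width    # counters per row (controls the additive error)
        self.depth = depth    # number of rows (controls the failure probability)
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, obtained here by salting BLAKE2b
        # with the row number (an illustrative choice, not a requirement).
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum over rows
        # is the least contaminated estimate and never undercounts.
        return min(
            self.table[row][self._index(row, item)]
            for row in range(self.depth)
        )

    def merge(self, other):
        # Linearity: the sketch of two streams combined is the
        # cell-wise sum of the two individual sketches.
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must have the same shape")
        merged = CountMinSketch(self.width, self.depth)
        for row in range(self.depth):
            merged.table[row] = [
                a + b for a, b in zip(self.table[row], other.table[row])
            ]
        return merged


# Two hypothetical streams, sketched separately and then combined.
stream_a = ["x"] * 50 + ["y"] * 3
stream_b = ["x"] * 20 + ["z"] * 7

sketch_a, sketch_b = CountMinSketch(), CountMinSketch()
for item in stream_a:
    sketch_a.add(item)
for item in stream_b:
    sketch_b.add(item)

merged = sketch_a.merge(sketch_b)
print(merged.estimate("x"))  # at least 70, the true count of "x"
```

In the standard analysis, a width of about e/ε and a depth of about ln(1/δ) keep the overcount below ε times the total stream length with probability at least 1 − δ; the defaults above are just one plausible setting, not a recommendation.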

**Igor Carron**


## Tuesday, May 17, 2011

### Algorithms In The Field (#8f)

From David Eppstein's summary of day one of the Algorithms In The Field NSF workshop:
