Tuesday, December 27, 2011

Is there any Computer Science view on Compressive Sampling?

I answered that question on Quora. If somebody has a more intuitive answer for the computer science and engineering folks (not the TCS folks obviously), please give it on Quora or edit/upvote this one:

Let's use the concept of hash functions and see if that helps

In this instance, think of one key that is run through different hash functions yielding several hashes. The goal of a hash function is to among other things to reduce the size of the information from the initial key: each hash is smaller bit-wise compared to the initial key.

The hash functions used in Compressive Sensing are linear functionals of the key. Random numbers are used to produce these linear functionals.
It is generally difficult if not impossible to get the original key from one hash, but with  Compressive Sensing  it is possible to recover a key from the different hashes produces by several hash functions.
When you look at the details, you'll find that the number of hash functions needed to recover the initial key is small if the initial key has some regularity (it has a lot of the same character in it, for a vector representing that key we say that it is sparse, i.e it has a lot of zeros in it). 
The number of hashes (produced each with a different hash function) needed to recover the initial key (and never have a collision with another key) depends on the "regularity" of the key, not on the large dimensional space in which the key resides. 
This is why the field is called  Compressive Sensing , if you were to sense keys through different simple hash functions, then if the keys have some sorts of regularity (sparsity), the number of hash functions needed to recover these keys is proportional to the regularity of the key not the high dimensional space in which the key lives. The compression comes from the few number of hashes needed to determine uniquely a key.  Compressive Sensing also provides the machinery to recover that key from several hashes

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.


Anonymous said...

looks like Bloom filters?

Igor said...

Hello Anonymous,

from wikipedia (http://en.wikipedia.org/wiki/Bloom_filter )

"A Bloom filter, conceived by Burton Howard Bloom in 1970,[1] is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positives are possible, but false negatives are not; i.e. a query returns either "inside set (may be wrong)" or "definitely not in set". Elements can be added to the set, but not removed (though this can be addressed with a counting filter). The more elements that are added to the set, the larger the probability of false positives."

I am not sure I see the direct connection. Maybe by using many bloom filters for every key ?

Yaroslav Bulatov said...

" if the keys have some sorts of regularity (sparsity)"...that's different from the number of non-zero bits in the key, right? BTW, this reminds me of a trick that's been coming up recently -- compute smaller representation with some random hashes, and do learning/inference on that instead of original signal.Vowpal Wabbit papers like this one have background -- http://hunch.net/~jl/projects/hash_reps/hash_kernels/hashkernel.pdf

Igor said...


Yes I switched sparsity with regularity for the purpose of the explanation, even through there has been some discussions of that in the community recently ( http://nuit-blanche.blogspot.com/2011/10/low-complexity-solver.html ).

With regards to the random projections, yes, CS uses those as well (but not exclusively) and it indeed permits one to perform detection directly in the new phase space (for instance compressive sensing hardware https://sites.google.com/site/igorcarron2/compressedsensinghardware directly acquire these compressed measurements ).

However CS has been able to do a few things on top of that, that were actually crucial:

- people in CS have shown that one could recover the initial key provided a series of hash functions and their attendant hashes.

- Because of the sparsity feature of the initial key, we roughly know how many of the few hash functions and their attendant hashes are needed to reconstruct the initial key. It is an on-going debate and some recent results are showing we can go with a lower number (http://nuit-blanche.blogspot.com/2011/09/short-summary.html).

The field is still very much trying to sort some of the issues. We can do all kinds of nice things in the compressed space like in Wowpal, but in CS we are trying to figure out exactly how many of it is necessary and eventually implementable in hardware (sensors).

Does it clear it up (in the affrimative, I'll add it to Quora).