
Tuesday, July 17, 2007

How does the Rice one pixel camera work ?

I have had this question asked several times by colleagues and friends, so I decided to use my talent in Microsoft Paint to explain it.

The problem: If you have only one sensitive pixel/photodetector/radiation detector/terahertz detector and you want to take a 10 MP image, what do you do ? Let us also imagine you have access to a chip built with 10 million mirrors (a Texas Instruments DMD) and you can command each of these tiny 15 micron by 15 micron mirrors. Each can be switched to shine on your detector or not.

First option (the raster mode):
You switch all your mirrors but one away from your detector. You do that once
twice (with another pixel)

thrice (with yet another pixel)
four times (....)
five times (...)
....5 million times (...)
until you reach the last, 10 millionth pixel.
You now have ten million measurements which, put together, provide you with a 10 MP image. You then use a small processor to perform the Discrete Cosine Transform, which eventually provides you with a JPEG image.
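
The raster mode can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the 8-pixel `scene` and the function name `raster_scan` are made up for the example, and the detector is modelled as an ideal, noise-free sum of the light reflected toward it.

```python
def raster_scan(scene):
    """Measure a scene one mirror at a time (raster mode)."""
    n = len(scene)
    image = []
    for i in range(n):
        # Only mirror i shines on the detector; all others are turned away.
        mask = [1 if j == i else 0 for j in range(n)]
        # The single detector records the total light it receives.
        reading = sum(m * p for m, p in zip(mask, scene))
        image.append(reading)
    return image


# Hypothetical 8-pixel scene standing in for the 10 MP image.
scene = [0.0, 0.2, 0.9, 0.4, 0.1, 0.0, 0.7, 0.3]
image = raster_scan(scene)
```

With one measurement per mirror, the list of readings is the image itself, which is why this mode needs as many measurements as there are pixels.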

Second option (the Compressive Sensing mode):

You tell the DMD to display a set of random tilings of shining and non-shining mirrors.

You do that once with an initial random tiling: this is your first CS measurement.
Then you do this again with a second random tiling: this is your second CS measurement.

Then you do this again with a third random tiling: this is your third CS measurement,
and so on.

Compressed sensing tells you that, with very high probability, you will get the same result as the raster mode above but with far fewer CS measurements. Instead of taking 10 million raster mode measurements, you are likely to need only about 20 percent as many CS measurements (roughly 2 million), maybe even fewer.
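
As a rough sketch of the acquisition step only (not the reconstruction), each CS measurement is the detector reading for one random tiling. The names `random_mask` and `cs_measure`, the seed, and the tiny 8-pixel scene are assumptions made for this example; the 20 percent figure is the one quoted above.

```python
import random


def random_mask(num_mirrors, rng):
    """One random tiling: 1 = mirror shines on the detector, 0 = it does not."""
    return [rng.randint(0, 1) for _ in range(num_mirrors)]


def cs_measure(scene, num_measurements, seed=0):
    """Take CS measurements: one detector reading per random tiling."""
    rng = random.Random(seed)
    masks, readings = [], []
    for _ in range(num_measurements):
        mask = random_mask(len(scene), rng)
        # The detector sums the light from every mirror pointed at it.
        readings.append(sum(m * p for m, p in zip(mask, scene)))
        masks.append(mask)
    return masks, readings


# Hypothetical 8-pixel scene and three CS measurements.
scene = [0.0, 0.2, 0.9, 0.4, 0.1, 0.0, 0.7, 0.3]
masks, readings = cs_measure(scene, 3)

# Measurement budget for the 10 MP example: 20 percent of 10 million.
num_pixels = 10_000_000
num_cs_measurements = num_pixels * 20 // 100
```

Recovering the image from `masks` and `readings` requires a sparse reconstruction solver, which this sketch deliberately leaves out.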

The idea is that most natural images are sparse in bases ranging from cosines and wavelets to curvelets (this is why JPEG does a tremendous job in decreasing the size of most images). Functions that represent random tilings of reflective and non-reflective mirrors (0s and 1s) are incoherent with these bases. Hence, the second method consists in taking projections of an image of interest onto this random basis.
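
To make "sparse in a cosine basis" concrete, here is a small sketch using a plain orthonormal DCT-II on a 1-D signal. The signal is a made-up stand-in for an image row, built on purpose from two cosine patterns; real images are only approximately sparse, so this is an idealized case.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a 1-D signal."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


n = 32
# Hypothetical "image row": a mix of two cosine patterns (frequencies 3 and 7).
signal = [math.cos(math.pi * (i + 0.5) * 3 / n)
          + 0.5 * math.cos(math.pi * (i + 0.5) * 7 / n)
          for i in range(n)]

coeffs = dct(signal)
# Indices of the cosine coefficients that are not numerically zero.
significant = [k for k, c in enumerate(coeffs) if abs(c) > 1e-9]
```

The 32 pixel values collapse to two cosine coefficients (indices 3 and 7), which is the kind of structure a CS reconstruction exploits.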


The advantages of the second option are:
  1. the overall sensor requires very low power because there is no CPU/GPU/FPGA trying to do compression like JPEG.
  2. the sensor is dedicated to acquiring information. Information processing can be done somewhere else (think: on some other planet).
  3. compared to raw images, the information to be transmitted is very much compressed, albeit not optimally.
  4. you can spend all your money designing the very best sensitive pixel you want; it may even act as a spectrometer (looking at many different energy bands), a radiation detector and so forth.
Terry Tao provides a much clearer explanation albeit with no images. All information about this system can be found at Rice.

7 comments:

Aleks said...

How would one be able to use this concept for compression: you would have to communicate the locations of those pixels too, which would take a lot of bits?

Second, why sample randomly, why not sample intelligently?

Igor said...

Aleks,

I guess your question is:
- The number of measurements is lower
- but you also need to transmit all the "masks" that produced the random measurements.
-> the total amount of information is much higher, because you have to send both the "masks" and the measurements.

Random needs to be understood in the following sense:
- at the lab, you use your favorite random number generator and produce these masks ahead of time.
- you install those "masks" as tables/memory in the sensor
- when the sensor is in the field, it recalls these masks from its memory.
- when transmitting the compressed measurement somewhere, you can add a bit or two to let the receiver know which of the random masks was used.

So in effect, it is random, but it is not generated on the fly.
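
A minimal sketch of that scheme, assuming the receiver holds a copy of the same table: the function name `make_mask_table`, the seed value, the table size and the `(mask_index, reading)` packet layout are all invented for illustration.

```python
import random


def make_mask_table(num_masks, num_mirrors, seed):
    """Generate the masks once, at the lab, from a seeded generator."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(num_mirrors)]
            for _ in range(num_masks)]


SEED = 2007  # hypothetical seed chosen at the lab

# The same table is installed in the sensor and kept by the receiver.
sensor_table = make_mask_table(4, 8, SEED)
receiver_table = make_mask_table(4, 8, SEED)

# In the field: the sensor recalls a stored mask and takes one reading.
scene = [0.0, 0.2, 0.9, 0.4, 0.1, 0.0, 0.7, 0.3]  # hypothetical scene
mask_index = 2
reading = sum(m * p for m, p in zip(sensor_table[mask_index], scene))

# Only the small index and the reading are transmitted, never the mask.
packet = (mask_index, reading)

# The receiver looks the mask up in its own copy of the table.
mask_used = receiver_table[packet[0]]
```

Because both sides build the table from the same seed, the mask never has to travel with the measurement.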

As for sampling intelligently, some people are doing it, and Bayesian compressive sensing would fall in that category. However, in these types of systems, doing any computation uses a lot of power, so the thinking is that if you want to have those sensors running for years, you do not want them to draw on the power side of things.

And then there is the kicker, as mentioned by Richard Baraniuk and his team. Since these random measurements are pretty much universal, the information you retrieve from them today may not be the same as what you will retrieve from them ten years down the road, when we will have much better reconstruction techniques. If you put intelligence in it today, you are most likely decreasing your ability to get anything interesting out of these data in the future.

Igor.

Andrew Gelman said...

Thanks.

abhishek said...

Do we need a longer exposure time for these single pixel cameras? If 't' is the exposure time needed for a traditional camera, then for a single pixel camera do we need 'm*t' exposure time, where 'm' is the number of distinct random masks?

Igor said...

Abhishek,

Yes, as far as I can tell you do need m*t exposure time.

Cheers,


Igor.
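
As a back-of-the-envelope illustration of the m*t estimate, assuming one exposure of length t per mask and no switching overhead: the 1 ms exposure below is a made-up number, while the pixel count and the 20 percent figure come from the post.

```python
def total_exposure(num_measurements, t):
    """Total acquisition time when each measurement needs exposure t."""
    return num_measurements * t


t = 0.001  # hypothetical exposure per measurement, in seconds
num_pixels = 10_000_000

# Raster mode: one measurement per pixel.
raster_time = total_exposure(num_pixels, t)

# CS mode: about 20 percent as many measurements.
cs_time = total_exposure(num_pixels * 20 // 100, t)
```

Under these assumptions the CS mode takes about a fifth of the raster mode's time, while both remain far slower than a traditional camera exposing all of its pixels at once for a single t.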

abhishek said...

Thanks for the answer. I have one more question.

As I understand, the random mask represents a projection matrix 'U' consisting of 0's and 1's, and image data 'x' (from spatial domain) is projected onto these bases.

Do we need the data to be sparse in the original space (i.e. 'x' should be sparse in the current representation), or can it be sparse in any basis?

Another question is: suppose 'x' is not sparse in the current representation. Do we need to design the random projection matrices so that they bring out the sparsity in the data, or can it be any random matrix of 0's and 1's, which are of course incoherent with cosine bases?

thanks a lot,
abhishek.

austin said...

I know this is a bit old, but I wanted to point out that a larger single pixel with back thinning, and an electron multiplying register could do well to improve single exposures, thereby reducing the acquisition time. - I have some descriptions of EM cameras at www.austinblanco.com/blog