Sunday, March 07, 2010

Surely You Must Be Joking Mr. Screenwriter

An anonymous commenter left the following in response to the Why Compressed Sensing is NOT a CSI "Enhance" technology ... yet ! entry:

Funny coincidence... your entry is titled "Why Compressed Sensing is NOT a CSI "Enhance" technology ... yet ! " but in the video you posted, there is a clip from the show Numb3rs where they use compressed sensing to solve a crime. Specifically, the clip (around 1:20) refers to the "Pradeep Sen method" and later picks up on that at 1:28. The full text from the show is:

"Maybe we can use the 'Pradeep Sen' method to see into the windows. He has taken illumination algorithms to the next level and I can use them to enhance this photograph."

The work they are referring to is Sen et al.'s dual photography (Siggraph05) and later work on compressive dual photography where they use compressed sensing to do single pixel imaging and novel illumination. So they ARE using some REAL CS algorithms to solve a virtual crime in shows like CSI!

Who knows, maybe Romberg told them about it, since you said he was an advisor to that show...

I saw the episode on TV, however, and the whole thing was ridiculously unfeasible, but at least they ARE citing current work in compressed sensing... :)


In fairness, Justin Romberg may or may not have been the source of this reference; the most likely explanation is that Siggraph was held in Tinseltown in 2005, when this paper was presented. Also, being a science adviser and actually being heard are two different things... especially in la-la land.

I then went to Pradeep Sen's website (the old one) and, sure enough, the video of this passage was mentioned and can even be downloaded. I just put it on YouTube, as this discussion has to fall under fair use (if not, somebody needs to tell me what fair use is). Here it is for your enjoyment:




Anyway, the script is what the commenter said, but I am dumbfounded by the relationship between what we see and what the actor says. Back in 2005, when this episode came up, Pradeep and others had indeed produced their wonderful work on "Dual Photography", but one wonders how it relates to the shot on the screen. Let us look at the abstract:

We present a novel photographic technique called dual photography, which exploits Helmholtz reciprocity to interchange the lights and cameras in a scene. With a video projector providing structured illumination, reciprocity permits us to generate pictures from the viewpoint of the projector, even though no camera was present at that location. The technique is completely image-based, requiring no knowledge of scene geometry or surface properties, and by its nature automatically includes all transport paths, including shadows, inter-reflections and caustics. In its simplest form, the technique can be used to take photographs without a camera; we demonstrate this by capturing a photograph using a projector and a photo-resistor. If the photo-resistor is replaced by a camera, we can produce a 4D dataset that allows for relighting with 2D incident illumination. Using an array of cameras we can produce a 6D slice of the 8D reflectance field that allows for relighting with arbitrary light fields. Since an array of cameras can operate in parallel without interference, whereas an array of light sources cannot, dual photography is fundamentally a more efficient way to capture such a 6D dataset than a system based on multiple projectors and one camera. As an example, we show how dual photography can be used to capture and relight scenes.
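In linear-algebra terms, the abstract boils down to a light transport matrix T that maps projector pixels to camera pixels; Helmholtz reciprocity says the picture seen from the projector's viewpoint is governed by the transpose of T. Here is a minimal numpy sketch of that relation (the toy transport matrix and the sizes are assumptions made purely for illustration, not data from the paper):

```python
import numpy as np

# Toy sizes: m projector pixels, n camera pixels (made up for illustration).
m, n = 64, 48
rng = np.random.default_rng(0)

# A made-up light transport matrix T (n x m): entry T[j, i] is the amount of
# light reaching camera pixel j when projector pixel i alone is turned on.
T = rng.random((n, m)) * (rng.random((n, m)) < 0.05)

# Primal photography: project a pattern p, record what the camera sees.
p = rng.random(m)
c = T @ p

# Dual photography (Helmholtz reciprocity): the picture "taken" from the
# projector's viewpoint, lit from the camera side, is governed by T transposed.
virtual_light = rng.random(n)
c_dual = T.T @ virtual_light

# The brute-force way to measure T with structured illumination: turn on one
# projector pixel at a time; each camera image is one column of T.
T_measured = np.column_stack([T @ np.eye(m)[:, i] for i in range(m)])
assert np.allclose(T_measured, T)
```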

First, the paper is about structured illumination and, eventually, an adaptive scheme that reduces the number of measurements needed to figure out the illumination in the scene (which, by the way, is a "simple" linear transport problem). In Figure 4, one can see the elements of an illumination-based one-pixel camera that, unlike others, could figure out the three-dimensional structure of the scene. In other words, it is a more complex equivalent of the MMA17 black-box discovery exercise. What could make it a compressed sensing approach is sampling the illumination from a random distribution, a la the single-pixel camera at Rice. No fear though: Pradeep and Soheil Darabi recently went on and did just that in "Compressive Rendering: A Rendering Application of Compressed Sensing" and "Compressive Dual Photography", which I will cover this coming week.

My point is that not only does Numb3rs not get it even close to right (there is only one illumination shot; where are the 599 others needed to produce a 6-D solution?), but the paper being cited is not even a compressed sensing paper. Please recall that the single-pixel camera at Rice can be used in raster mode, in which case it becomes a cute but inefficient ordinary camera. The big thing about the Rice camera is the number of illuminations, not the fact that it has only one pixel, a point that seems to have been lost on many in the news.
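To make the contrast with a raster scan concrete, here is a small sketch of the Rice-style measurement model under the usual assumptions: the scene is sparse, each measurement is the single-pixel reading under one random ±1 illumination pattern, and a greedy solver (orthogonal matching pursuit, standing in here for the l1 solvers used in the literature) recovers the scene from far fewer patterns than pixels. All sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

n, k = 256, 8            # scene with n "pixels", only k of them non-zero (sparse)
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

# Rice-style single-pixel measurements: each measurement is the photodiode
# reading under one random +/-1 illumination pattern, with m << n patterns.
m = 64
Phi = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)
y = Phi @ x

def omp(Phi, y, k):
    """Orthogonal matching pursuit: greedily pick k columns explaining y."""
    support, residual = [], y.copy()
    for _ in range(k):
        support.append(int(np.argmax(np.abs(Phi.T @ residual))))
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat

x_hat = omp(Phi, y, k)
print("relative reconstruction error:",
      np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```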


Coming back to the video that started this discussion, here is why it is insidious to let misunderstandings go unchecked.


One can read the following recent comment by smidirini:

I used to laugh heartily at this kind of thing in movies and on TV. Now I read this article that says it is sorta kinda possible for some images.

See Wired magazine online, Feb 2010, "Fill in the Blanks: Using Math to Turn Lo-Res Datasets Into Hi-Res Samples."


Another commenter (PopotheBadNewsBear) responded on a different thread with:

No. It actually cannot. You can't bring pixels out of a picture where they don't exist.

You clearly missed the point of this video...


Amen Brother/Sister!

2 comments:

Unknown said...

Perhaps it would help if someone could explain just what exactly was being illustrated in the "Obama" series in the Wired article. It's hard to look at that and not come to the conclusion that pixels are being brought out where they don't exist.

In your side-by-side series (figure 10 from the previous post) the top row clearly shows that CS techniques do nothing useful for a standard image from which 90% of the pixels have been removed. The bottom row is obviously something different as the original data from the random lens camera doesn't resemble much of anything but can produce a close match for the picture even with 90% of the data gone.

The "Obama" series from the Wired article sure looks a lot more like your top series than your bottom one. The leftmost "Obama" looks like a normally captured picture with most of the pixels missing. It is clearly not the raw data captured by a random lens camera or single pixel camera. So is this series just a bad attempt to illustrate the concept or is there something else being illustrated that I just didn't understand?

BTW, thanks for clearing this up before I went off and wasted a month or two trying to write my own "Enhance...Enhance...Enhance" software.

Igor said...

Owen,

You said:

"...In your side-by-side series (figure 10 from the previous post) the top row clearly shows that CS techniques do nothing useful for a standard image from which 90% of the pixels have been removed. The bottom row is obviously something different as the original data from the random lens camera doesn't resemble much of anything but can produce a close match for the picture even with 90% of the data gone..."


The bottom row is the result of looking at that arm but with the random lens. It does not look like anything, and it is not supposed to. Once you have that "image", you generally have to decode it to get back the original image. What the examples show is that the information needed to reconstruct the original image is spread redundantly throughout that "image", so that even if you remove 90% of it, you can still reconstruct the original picture. However, the "normal" picture does not have much redundancy built into it, so as soon as you lose some information (pixels) from that image, you cannot reconstruct it because it is lost forever.
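To illustrate that point with a toy sketch (the sizes, the sparsity assumption, and the Gaussian matrix standing in for the random lens are all assumptions for illustration, not the actual random lens imager): deleting 90% of an ordinary picture's pixels destroys those values outright, while deleting 90% of random-lens-style measurements still leaves enough globally mixed information to reconstruct the scene.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(2)

# A toy "scene": n pixels, sparse in the pixel basis for simplicity.
n, k = 256, 8
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = 1.0

# Case 1: an ordinary picture with 90% of its pixels deleted.
# The deleted values are simply gone; no algorithm can bring them back.
keep_pixels = rng.random(n) > 0.9
destroyed = np.count_nonzero(x) - np.count_nonzero(x * keep_pixels)
print("non-zero pixels destroyed in the ordinary picture:", destroyed)

# Case 2: a random-lens-style "image": every measurement mixes ALL pixels,
# so throwing away 90% of the measurements only reduces their count.
A = rng.standard_normal((1000, n))
y = A @ x
keep = rng.random(1000) > 0.9
solver = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
solver.fit(A[keep], y[keep])
print("reconstruction error after losing 90% of measurements:",
      np.linalg.norm(solver.coef_ - x))
```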

You also said:
"The "Obama" series from the Wired article sure looks a lot more like your top series than your bottom one. The leftmost "Obama" looks like a normally captured picture with most of the pixels missing. It is clearly not the raw data captured by a random lens camera or single pixel camera. So is this series just a bad attempt to illustrate the concept or is there something else being illustrated that I just didn't understand?"

I think the former is a better summary than the latter :-)

Hope this helps,

Igor.
