Wednesday, January 12, 2011

CS: "How come your browser can't read JPEG-2000?", Q&As and some papers

The title of this entry comes from an exchange I had on Reddit recently in a thread entitled "Recent major results in signal processing research?" (see also here). Judging from the number of points I am getting, the views I am expressing are not popular. Oh well.

I am now on Quora. If you want an invite, let me know. I have noticed two questions listed there that have some bearing on subjects mentioned on Nuit Blanche. The first one, which I am responding to, is whether compressive sensing has peaked; I am still writing the draft. The other is about calibration:
Has anybody tried to make synthetic 3D movie datasets for training object recognition algorithms? I plan to experiment with a scriptable 3D and physics engine such as Panda3D to generate movies of 3D scenes, ranging from simple geometric moving objects (e.g. cubes, spheres, ...) to real-life-looking scenes (e.g. city streets with buildings, animated cars, and pedestrians), so as to prepare semantically annotated datasets for computer vision tasks and experiment with curriculum learning for deep semi-supervised learning algorithms. Is there any existing implementation of this approach?
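The main appeal of the approach in the question is that the script rendering the scene also knows the ground truth, so annotations come for free. A toy NumPy sketch of that idea (all names hypothetical; a real pipeline would render textured 3D scenes with Panda3D rather than flat frames):

```python
import numpy as np

def render_sequence(n_frames=10, size=32, obj=6, seed=0):
    """Render a toy 'movie' of a bright square drifting across a dark frame.

    Because the generator places the object itself, every frame comes with an
    exact bounding-box annotation at zero labeling cost.
    """
    rng = np.random.default_rng(seed)
    x, y = rng.integers(0, size - obj, 2)  # initial top-left corner
    dx, dy = 1, 1                          # constant drift per frame
    frames, labels = [], []
    for _ in range(n_frames):
        frame = np.zeros((size, size), dtype=np.float32)
        frame[y:y + obj, x:x + obj] = 1.0
        frames.append(frame)
        labels.append((x, y, obj, obj))    # (x, y, width, height) ground truth
        # Advance the object, clamping it inside the frame.
        x = min(max(x + dx, 0), size - obj)
        y = min(max(y + dy, 0), size - obj)
    return np.stack(frames), labels

frames, labels = render_sequence()
print(frames.shape, labels[0])
```

Curriculum learning would then amount to scheduling the generator's parameters, starting with large, slow, high-contrast objects and gradually moving toward cluttered scenes.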

Michael Elad made a presentation at the SMALL workshop entitled: Exploiting Statistical Dependencies in Sparse Representations (PDF). The introduction to the presentation reads:
In the commonly used sparse representation modeling, the atoms are assumed to be independent of each other when forming the signal. In this talk we shall introduce a statistical model called the Boltzmann Machine (BM) that enables such dependencies to be taken into account. Adopting a Bayesian point of view, we first treat the pursuit problem - given a signal, and assuming that the model parameters and the dictionary are known, find its sparse representation. We derive the exact MAP estimation, and show that just like in the independent case, this leads to an exponential search problem. We derive two algorithms for its evaluation: a greedy approximation approach for the general case, and an exact estimation that corresponds to a unitary dictionary and banded interaction matrix. We also consider the estimation of the model parameters, learning these parameters directly from training data. We show that given the signals' representations, this problem can be posed as a convex optimization task by using the Maximum Pseudo-Likelihood (MPL).
Compressed sensing is applied to multiview image sets, and the high degree of correlation between views is exploited to enhance recovery performance over straightforward independent view recovery. This gain in performance is obtained by recovering the difference between a set of acquired measurements and the projection of a prediction of the signal they represent. The recovered difference is then added back to the prediction, and the prediction and recovery procedure is repeated iteratively for each of the views in the multiview image set. The recovered multiview image set is then used as an initialization to repeat the entire process again, forming a multistage refinement. Experimental results reveal substantial performance gains from the multistage reconstruction.
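The recover-the-difference idea in this abstract can be sketched in a few lines of NumPy. Here a "prediction" of a view differs from the truth by a sparse error, so only that difference needs to be recovered from the measurement mismatch. This is a single stage of the scheme, with a plain orthogonal matching pursuit standing in for whatever recovery algorithm the paper actually uses:

```python
import numpy as np

rng = np.random.default_rng(0)

def omp(A, y, k):
    """Greedy orthogonal matching pursuit: recover a k-sparse x from y = A x."""
    residual = y.copy()
    support = []
    x = np.zeros(A.shape[1])
    for _ in range(k):
        idx = int(np.argmax(np.abs(A.T @ residual)))  # most correlated atom
        if idx not in support:
            support.append(idx)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x[:] = 0.0
        x[support] = coef
        residual = y - A @ x
    return x

n, m, k = 128, 64, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)   # random measurement matrix

x_true = rng.standard_normal(n)                # the view to recover
diff = np.zeros(n)
diff[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x_pred = x_true - diff      # prediction (e.g. from a neighboring view) off by a sparse error

y = A @ x_true              # acquired measurements of the true view
y_diff = y - A @ x_pred     # mismatch = measurements of the sparse prediction error
diff_hat = omp(A, y_diff, k)   # recover only the difference, not the full signal
x_hat = x_pred + diff_hat      # add the recovered difference back

print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

The payoff is that the prediction error is far sparser than the view itself, so the same number of measurements recovers it much more accurately; the multistage refinement simply feeds the recovered views back in as better predictions.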

Compressive Sampling Techniques for Integrating Video Acquisition and Coding by M. P. Krishnan and K. R. Rao. The abstract reads:
The Shannon/Nyquist sampling theorem specifies that for lossless capture of a signal, one must sample at a rate at least twice the highest frequency present in the signal. In many applications, the Nyquist rate is so high that too many samples result, making compression a necessity prior to storage or transmission. But if the signal has a sparse representation in some domain, the novel idea of compressive sampling [9] can be used to exploit that sparsity and thus significantly reduce the sampling rate of the signal. In this paper a simulation of an efficient system for integrating video coding/compression with acquisition through compressive sampling is developed in detail. The simulation is implemented for the Y-components of video frames, and a qualitative comparison is made with traditionally acquired frames.

Also behind a firewall: A novel and efficient mixed-signal compressed sensing for wide-band cognitive radio

Credit: maZm blog.


fake said...

What's the advantage of a synthetic video library vs. taking a Kinect out into the world and sampling reality?


Igor said...


Good catch.

My first answer is that you have more control over what you are using in the calibration process. In particular, with the Kinect setup, one is bound to introduce additional information through some type of labeling, which involves slow and sometimes faulty information transfer.



romeo said...

I would like to be invited to Quora; would you share an invite? Thanks.

Igor said...

Sure, but you need to send me an email first.