Sparse coding is not new to our readers, since it falls under dictionary learning, a topic of interest in the reconstruction stage of compressive sensing. Thanks to Bob, I was reminded of the following video by Andrew Ng on unsupervised learning (one way of performing dictionary learning). As Andrew shows, some of these techniques are getting better and better.
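For readers who want to see what dictionary learning looks like in practice, here is a minimal sketch in Python/numpy. It is a toy alternating scheme (greedy orthogonal matching pursuit for the sparse coding step, a MOD-style least-squares dictionary update), not the specific algorithms from Andrew's talk; all function names and parameters are my own illustration.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: approximate x with k atoms of D."""
    residual = x.copy()
    idx, coef = [], np.zeros(0)
    for _ in range(k):
        corr = np.abs(D.T @ residual)
        corr[idx] = 0.0                    # do not reselect an atom
        idx.append(int(np.argmax(corr)))
        # Re-fit coefficients on all selected atoms (the "orthogonal" step)
        coef, *_ = np.linalg.lstsq(D[:, idx], x, rcond=None)
        residual = x - D[:, idx] @ coef
    code = np.zeros(D.shape[1])
    code[idx] = coef
    return code

def dictionary_learning(X, n_atoms, k, n_iter=20, seed=0):
    """Alternate sparse coding (OMP) and a least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        # Sparse coding step: one k-sparse code per column (signal) of X
        A = np.column_stack([omp(D, x, k) for x in X.T])
        # MOD dictionary update: D = X A^+, then renormalize the atoms
        D = X @ np.linalg.pinv(A)
        D /= np.linalg.norm(D, axis=0) + 1e-12
    return D, A
```

On synthetic data generated from a random dictionary with k-sparse codes, a handful of iterations typically drives the reconstruction error `||X - D @ A||` well below the norm of `X`; real image-patch pipelines use larger, more careful solvers (e.g. K-SVD or online methods), but the alternating structure is the same.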
By necessity, people are using the same databases. But if we take the argument further, some of these databases depend on the very parameters of the cameras that took the shots, and one could argue that the calibration issue is hidden behind some well-thought-out benchmark databases. And while the need for decomposing images into features is clear, we often know little about the features that were actually learned (an issue Andrew points to at the very end of his video/slides). Given all this, I wonder whether exotic camera systems such as multiple spherical catadioptric cameras (as featured in Axial-Cones: Modeling Spherical Catadioptric Cameras for Wide-Angle Light Field Rendering by Yuichi Taguchi, Amit Agrawal, Ashok Veeraraghavan, Srikumar Ramalingam and Ramesh Raskar) or the random lens imager, which produce highly redundant information about the scene of interest, could provide better classification rates than current techniques?
If we take this argument further, what would then be a good way of comparing these exotic camera systems for classification purposes?...
For those of you interested: at the end of Andrew Ng's presentation, there is a link to the corresponding course and its handouts at Stanford. It features two videos.