Friday, December 18, 2015

Hamming's Time: Making Hyperspectral Imaging Mainstream

Friday afternoon is Hamming's time. Today I decided to compete in the Best Camera Application contest of XIMEA, a maker of small hyperspectral cameras. Here is my entry:


Challenging task: Make hyperspectral imaging mainstream

Idea: Create a large database of hyperspectral imagery for use in Machine/Deep Learning Competitions



Background

Machine Learning is the field concerned with creating, training and using algorithms dedicated to making sense of data. These algorithms use training data (images, videos) to improve at tasks such as detection and classification. In recent years, we have witnessed spectacular growth in this field thanks to the joint availability of large datasets originating from the internet and the attendant efforts to curate and label those images and videos.

Numerous labeled datasets, such as CIFAR [1] and Imagenet [2], routinely permit algorithms of increasing complexity to be developed and to compete in state-of-the-art classification contests. For instance, the rise of deep learning algorithms came from breaking all the state-of-the-art classification results: the winning entry was submitted to the “ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry” [3]. More recent results of this heated competition were shown at the NIPS conference last week, where teams at Microsoft Research produced breakthroughs in classification with an astounding 152-layer neural network [4]. This intense competition between highly capable teams at universities and large internet companies is only possible because large amounts of training data are being made available.
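To make the top-5 test error rate quoted above concrete: a prediction counts as correct when the true label is among the five classes the model scores highest. Here is a minimal sketch in plain Python. The function name top_k_error and the toy scores are made up for illustration and are not taken from [3].

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k highest-scoring classes."""
    misses = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(sample_scores)),
                        key=lambda c: sample_scores[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Toy example (hypothetical values): 4 samples, 6 classes.
scores = [
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],
    [0.03, 0.07, 0.15, 0.20, 0.25, 0.30],
]
labels = [5, 0, 4, 0]

top5 = top_k_error(scores, labels, k=5)  # 0.5: samples 1 and 4 miss the top 5
top1 = top_k_error(scores, labels, k=1)  # 0.75: only sample 2 is ranked first
print(top5, top1)
```

The top-5 figure is always at most the top-1 figure, which is why competitions with many fine-grained classes tend to report it.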

Image or even video processing for hyperspectral imagery cannot follow the development path that image processing took over the past 40 years. The underlying reason is that this development was performed at considerable expense by companies and governments alike and eventually yielded standards such as JPEG, GIF, JPEG 2000, MPEG, etc. Because such funding is no longer available, we need to find other ways of improving and working with new imaging modalities.
Technically, since hyperspectral imagery is still a niche market, most analysis performed in this field runs the risk of being seen as an outgrowth of normal imagery, i.e. substandard tools such as JPEG or labor-intensive computer vision tools are being used to classify and use this imagery without much thought given to the additional structure of the spectral information. More sophisticated tools such as advanced matrix factorization (NMF, PCA, Sparse PCA, Dictionary learning, ...) in turn focus on the spectral information but seldom use the spatial information. Both approaches suffer from not investigating more fully the inherent robust structure of this imagery.
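One way to see why factorization methods seldom use the spatial information: they are typically applied to the cube unfolded into a pixels x bands matrix, and the band covariance that PCA diagonalizes does not change when the pixels are shuffled. The sketch below illustrates this in plain Python under that assumption. The helper names (unfold, band_covariance) and the toy cube are hypothetical and not taken from any particular library.

```python
import random


def unfold(cube):
    """Flatten a height x width x bands cube into a list of per-pixel spectra."""
    return [spectrum for row in cube for spectrum in row]


def band_covariance(spectra):
    """Sample covariance between spectral bands (the matrix PCA diagonalizes)."""
    n = len(spectra)
    bands = len(spectra[0])
    means = [sum(s[b] for s in spectra) / n for b in range(bands)]
    return [
        [
            sum((s[i] - means[i]) * (s[j] - means[j]) for s in spectra) / (n - 1)
            for j in range(bands)
        ]
        for i in range(bands)
    ]


# Toy 2 x 3 image with 4 spectral bands (made-up values).
rng = random.Random(0)
cube = [[[rng.random() for _ in range(4)] for _ in range(3)] for _ in range(2)]

spectra = unfold(cube)
shuffled = spectra[:]
rng.shuffle(shuffled)  # destroy the spatial arrangement of the pixels

cov_original = band_covariance(spectra)
cov_shuffled = band_covariance(shuffled)

# Largest entry-wise difference between the two covariance matrices.
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(cov_original, cov_shuffled)
    for a, b in zip(row_a, row_b)
)
print(max_diff < 1e-12)  # only floating-point rounding separates them
```

Any method that only sees this unfolded matrix gives the same spectral components for a scene and for a scrambled version of it, which is the structure a richer dataset would let learning algorithms exploit.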

For hyperspectral imagery to become mainstream, algorithms for its compression and day-to-day use have to take advantage of the current, very active and highly competitive development of Machine Learning algorithms. In short, creating large and rich hyperspectral imagery datasets beyond what is currently available ([5-8]) is central for this technology to grow out of its niche markets and become central in our everyday lives.



The proposal

In order to make hyperspectral imagery mainstream, I propose to use a XIMEA camera to shoot imagery and video of different objects and locations, and to label the resulting datasets.

The datasets will then be made available on the internet for use by parties interested in running classification competitions based on them (Kaggle, academic competitions, ...).

As a co-organizer of the meetup, I also intend to enlist some of the folks in the Paris Machine Learning meetup group (with close to 3000 members, it is one of the largest Machine Learning meetups in the world [9]) to help enrich this dataset.

The dataset should be available from servers, probably colocated at a university or some non-profit organization (to be identified). A report presenting the dataset should eventually be academically citable.



References
[2] Imagenet dataset, http://www.image-net.org/
[3] ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
[8] Parraga CA, Brelstaff G, Troscianko T, Moorhead IR, Journal of the Optical Society of America 15 (3): 563-569, 1998 or G. Brelstaff, A. Párraga, T. Troscianko and D. Carr, SPIE. Vol. 2587. Geog. Inf. Sys. Photogram. and Geolog./Geophys. Remote Sensing, 150-159, 1995



5 comments:

Karthikeyan N R said...

Good idea Igor! I have been fascinated by hyperspectral cameras and was wondering what if we can get a "google earth" with hyperspectral images.

Harrison Knoll said...

This is a great idea! We here at Aerial Agriculture have been collecting hyperspectral data and will be following your progress. Let us know if there is anything you need! ~Harrison

Alex St. John said...

Agreed, hyperspectral space is the place for next-level analysis!

Also from Aerial Agriculture here, and am interested to follow up on this kind of project and continue building.

Alex

Unknown said...

Great idea - Movidius would be very supportive

Harrison Knoll said...

@movidius reach out to me at Harrison@aerial.ag
