Sunday, August 24, 2014

Is Deep Learning the Final Frontier and the End of Signal Processing ?

[Update: In response to this panel discussion, Yoshua Bengio provided his view on Deep Learning ]

There was a panel discussion at the Technion in June asking the provocative question: "Is Deep Learning the Final Frontier and the End of Signal Processing?"

The panel was made up of the following smarts: Guillermo Sapiro (Moderator), Raphy Coifman, Nati Srebro, Stephane Mallat, Peyman Milanfar and Amnon Shashua.

I have attempted to summarize some of what was said (please let me know if I stated something inaccurate). I have also added my understanding of some of these issues at the bottom of this summary.

Raphy Coifman talked about how randomization has been successful in Compressive Sensing and how we eventually devised ways to deconstruct it. We should be doing the same for deep learning.

Stephane Mallat stated that there was something deep about memory (about its growth). A long time ago, much of the emphasis was on logic, not on memory. Systems with large memory change that paradigm; there is something conceptually deep underneath.

Peyman Milanfar, who is interested in performance bounds, made the point that one needs to understand these deep networks better (how much data is needed, etc.). Currently, no deep insight has been gained from deep learning examples: more data does not translate into better results... or insight.

Amnon Shashua stated that CNNs are fascinating because it is like going back to the stone age (convolution, etc.), except that it now works. This is shocking. Someone without domain expertise can get good results. We just replaced the nonlinearity and the regularization but everything is the same as in the 1980s.

Guillermo wondered about fields where images are generally not abundant (for example, the costly brain image acquisition by NIH): do deep networks allow for smart interpolation?

Raphy asked about learning physical models from examples: can deep learning allow one to learn more than a specific situation?

Peyman: How do we go from shallow to deep learning ?

Stephane: The reason these networks are deep is the factorization of an operator into smaller ones, just like the FFT or FWT.
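To make the FFT analogy concrete, here is a minimal sketch (my own illustration, not something shown at the panel) of the radix-2 factorization of the DFT matrix into a stack of sparse "layers". The function name `dft_layers` and the choice of n = 8 are assumptions made for the example.

```python
from functools import reduce

import numpy as np


def dft_layers(n):
    """Return sparse matrices whose left-to-right product is the n-point DFT matrix.

    Hypothetical helper for illustration; n must be a power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    if n == 1:
        return []
    half = n // 2
    # Even/odd permutation: (perm @ x) lists even-indexed samples first, then odd ones.
    order = np.concatenate([np.arange(0, n, 2), np.arange(1, n, 2)])
    perm = np.zeros((n, n))
    perm[np.arange(n), order] = 1.0
    # Butterfly stage: combines the two half-size transforms with twiddle factors.
    twiddle = np.diag(np.exp(-2j * np.pi * np.arange(half) / n))
    eye = np.eye(half)
    butterfly = np.block([[eye, twiddle], [eye, -twiddle]])
    # The half-size DFT is applied to both halves, hence the Kronecker product with I_2.
    inner = [np.kron(np.eye(2), layer) for layer in dft_layers(half)]
    return [butterfly] + inner + [perm]


n = 8
layers = dft_layers(n)
dense = reduce(np.matmul, layers)  # multiplying the layers gives back the dense operator
```

Each layer has at most 2n nonzero entries, whereas the dense DFT matrix has n^2, which is the sense in which depth buys a cheaper operator here.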

Mihail: Is there an equivalent in generative methods for deep learning? Deep generative methods?

Raphy: Instead of being generative, it is probably a good way to build a memory of what we know.

Stephane: It is not just a storage of what we know; however, we also don't know the functional space that is approximated by these deep networks.

Other questions asked by people in the room yielded the following remarks:

Amnon: Since 2011, the unsupervised work (autoencoders) is no longer what makes deep networks successful.

Guillermo: Another, even more successful, algorithm is Random Forest. Both algorithms (random forest and deep learning) are vying to displace each other (instead of aiming at signal processing).

Closing remarks

Peyman: What are the performance bounds for a small amount of data? One of the interesting questions is "can a deep network do with one example?"

Nati: It is probably transformative

Raphy: can it help with one exemplar learning ?

Stephane: beautiful example of Engineering being at the forefront of mathematics, beautiful opportunities ahead.

The discussion was a very insightful summary of the lingering questions. Here are some remarks:

With regard to understanding the black box that is a deep network, let us mention that there is some ongoing work in that respect (see Sparse Matrix Factorization: Simple rules for growing neural nets and Provable Bounds for Learning Some Deep Representations).

With the linear encoding systems used in compressive sensing, we have gotten a deeper understanding of them through sharp phase transitions. Can a similar path be taken for understanding deep networks (see Sunday Morning Insight: Sharp Phase Transitions in Machine Learning? and Sunday Morning Insight: Randomization is not a dirty word)?

When Stephane Mallat mentioned that the FFT was already a deep network, it made me think that maybe we should view a whole slew of algorithm families as deeper architectures and yes, that includes iterative solvers (see Sunday Morning Insight: Faster Than a Blink of an Eye). The generative model question made me think of what we recently called Physics Driven Sensor Design. Obviously, Matrix Factorization could be part of that mix (see Sunday Morning Insight: Matrix Factorizations and the Grammar of Life) but I would not be surprised if Random Forests made progress in that area, since they tend to provide some insight and some explanatory power.
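As an illustration of the iterative solver remark, here is a minimal sketch, under my own assumptions, of ISTA (a standard solver for the l1-regularized least squares problem common in compressive sensing) written so that each iteration looks like a network layer: a fixed linear map followed by a soft-thresholding nonlinearity. The names `soft_threshold` and `unrolled_ista`, and the parameters `lam` and `n_layers`, are hypothetical and chosen for the example.

```python
import numpy as np


def soft_threshold(v, theta):
    """Elementwise shrinkage; plays the role of the layer nonlinearity."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)


def unrolled_ista(A, y, lam, n_layers):
    """Run n_layers ISTA iterations for min_x 0.5*||A x - y||^2 + lam*||x||_1.

    Each iteration has the form x <- soft_threshold(W @ x + B @ y, theta),
    that is, one "layer" with fixed (not learned) weights W and B.
    """
    lipschitz = np.linalg.norm(A, 2) ** 2          # squared spectral norm of A
    W = np.eye(A.shape[1]) - A.T @ A / lipschitz   # weights applied to the current estimate
    B = A.T / lipschitz                            # weights applied to the measurements
    theta = lam / lipschitz                        # threshold of the nonlinearity
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(W @ x + B @ y, theta)
    return x
```

In this sketch the weights are dictated by the measurement matrix A. Replacing W, B and theta by learned parameters is what turns the solver into a trained network of the same shape.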

While the panel featured some of the most important people in the fields of signal processing and machine learning, I really felt that a hardware person was missing. Signal processing has always been at the tail end of the hardware pipeline and the issue of too little data in some fields is directly related to the cost of hardware. However, because there is a somewhat continuous spectrum between Machine Learning and traditional signal processing (see A Quick Panorama of Sensing from Direct Imaging to Machine Learning), I think it is becoming clearer that we are in fact entering the era of Zero Knowledge Sensor Design / Data Driven Sensor Design (see the recent Learning to be a Depth Camera for Close-Range Human Capture and Interaction), especially since memory is no longer an issue (being a child of Moore's Law). Time will tell.
