Deeper kernels can now borrow tricks from deep learning such as dropout regularization. From the paper and its attendant slides:
Avoiding pathologies in very deep networks by David Duvenaud, Oren Rippel, Ryan Adams, Zoubin Ghahramani
Choosing appropriate architectures and regularization strategies for deep networks is crucial to good predictive performance. To shed light on this problem, we analyze the analogous problem of constructing useful priors on compositions of functions. Specifically, we study the deep Gaussian process, a type of infinitely-wide, deep neural network. We show that in standard architectures, the representational capacity of the network tends to capture fewer degrees of freedom as the number of layers increases, retaining only a single degree of freedom in the limit. We propose an alternate network architecture which does not suffer from this pathology. We also examine deep covariance functions, obtained by composing infinitely many feature transforms. Lastly, we characterize the class of models obtained by performing dropout on Gaussian processes. The attendant code is here.
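The deep GP construction the abstract refers to is, at its core, repeated composition of functions, each drawn from a Gaussian process. Below is a minimal sketch (not the authors' attendant code) that samples such a composition layer by layer for a one-dimensional input, assuming a squared-exponential kernel and a grid-based approximation of each GP draw; plotting `h` against `x` after each layer is an easy way to see how the warped function changes with depth.

```python
# Minimal sketch of a deep GP as a composition of GP-distributed functions.
# Assumptions (not from the paper's code): 1-D input, squared-exponential
# kernel, each layer approximated by interpolating one GP sample on a grid.
import numpy as np

def se_kernel(x, y, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance matrix between two 1-D input arrays."""
    d = x[:, None] - y[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def sample_gp(x, lengthscale=1.0, jitter=1e-8, rng=None):
    """Draw one function sample f(x) from GP(0, k) evaluated on the grid x."""
    rng = np.random.default_rng() if rng is None else rng
    K = se_kernel(x, x, lengthscale) + jitter * np.eye(len(x))
    return np.linalg.cholesky(K) @ rng.standard_normal(len(x))

rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 200)
h = x.copy()
for layer in range(5):
    # Each layer warps the previous layer's output through a fresh GP draw,
    # approximated by interpolating a sample taken on a fixed grid.
    grid = np.linspace(h.min(), h.max(), 200)
    f = sample_gp(grid, rng=rng)
    h = np.interp(h, grid, f)
    print(f"layer {layer + 1}: output std = {h.std():.3f}")
```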
Potentially interesting reading:
- How Close is Compressive Sensing to Random Features with Random Kitchen Sinks?
- The Summer of the Deeper Kernels
- Structure Discovery in Nonparametric Regression through Compositional Kernel Search (implementation)