Nuit Blanche: MetaLearning

Tuesday, May 02, 2017

Designing Neural Network Architectures using Reinforcement Learning

At ICLR, I noted the figures below, which tell the story of the need for certain operations in neural networks as a function of their depth location:

Designing Neural Network Architectures using Reinforcement Learning by Bowen Baker, Otkrist Gupta, Nikhil Naik, Ramesh Raskar

At present, designing convolutional neural network (CNN) architectures requires both human expertise and labor. New architectures are handcrafted by careful experimentation or modified from a handful of existing networks. We introduce MetaQNN, a meta-modeling algorithm based on reinforcement learning to automatically generate high-performing CNN architectures for a given learning task. The learning agent is trained to sequentially choose CNN layers using Q-learning with an ϵ-greedy exploration strategy and experience replay. The agent explores a large but finite space of possible architectures and iteratively discovers designs with improved performance on the learning task. On image classification benchmarks, the agent-designed networks (consisting of only standard convolution, pooling, and fully-connected layers) beat existing networks designed with the same layer types and are competitive against the state-of-the-art methods that use more complex layer types. We also outperform existing meta-modeling approaches for network design on image classification tasks.
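
To make the abstract's recipe concrete, here is a minimal sketch of tabular Q-learning with ϵ-greedy exploration and experience replay choosing layers one at a time. It is not the authors' MetaQNN code: the layer vocabulary, the depth limit, the state encoding (last layer, depth) and especially `toy_accuracy`, which stands in for training a network and measuring validation accuracy, are all assumptions made for illustration.

```python
import random

LAYERS = ["conv", "pool", "fc", "stop"]  # toy action set (assumption)
MAX_DEPTH = 4                            # toy depth limit (assumption)


def toy_accuracy(arch):
    """Hypothetical stand-in for the validation accuracy of a trained network."""
    score = 0.1 * arch.count("conv") + 0.05 * arch.count("pool")
    if arch and arch[-1] == "fc":
        score += 0.3
    return score


def sample_architecture(q, epsilon, rng):
    """Roll out one architecture, acting epsilon-greedily on the Q-table."""
    arch, state, transitions = [], ("start", 0), []
    while True:
        if len(arch) == MAX_DEPTH:
            action = "stop"  # force termination at the depth limit
        elif rng.random() < epsilon:
            action = rng.choice(LAYERS)
        else:
            action = max(LAYERS, key=lambda a: q.get((state, a), 0.0))
        if action == "stop":
            transitions.append((state, action, None))
            return arch, transitions
        arch.append(action)
        next_state = (action, len(arch))
        transitions.append((state, action, next_state))
        state = next_state


def q_update(q, transitions, reward, alpha=0.1, gamma=1.0):
    """Q-learning update; the reward is only received on the terminal step."""
    for state, action, next_state in reversed(transitions):
        if next_state is None:
            target = reward
        else:
            target = gamma * max(q.get((next_state, a), 0.0) for a in LAYERS)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (target - old)


def search(episodes=300, seed=0):
    rng = random.Random(seed)
    q, replay, best = {}, [], ([], -1.0)
    for ep in range(episodes):
        epsilon = max(0.1, 1.0 - ep / episodes)  # anneal exploration
        arch, transitions = sample_architecture(q, epsilon, rng)
        reward = toy_accuracy(arch)
        replay.append((transitions, reward))
        # Experience replay: re-use a few stored rollouts for extra updates.
        for past in rng.sample(replay, min(len(replay), 8)):
            q_update(q, *past)
        if reward > best[1]:
            best = (arch, reward)
    return best


best_arch, best_score = search()
print(best_arch, best_score)
```

In a real search the expensive step is the reward: each sampled architecture has to be trained before its accuracy can be fed back, which is why replaying stored rollouts is attractive.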

Models found by MetaQNN are located here: https://bowenbaker.github.io/metaqnn/

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there!

Liked this entry? Subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on LinkedIn.

Monday, May 01, 2017

Online Natural Gradient as a Kalman Filter

So Kalman filters can help set the hyperparameters of natural gradient descent, including in recurrent learning.

Online Natural Gradient as a Kalman Filter by Yann Ollivier

We establish a full relationship between Kalman filtering and Amari's natural gradient in statistical learning. Namely, using an online natural gradient descent on data log-likelihood to evaluate the parameter of a probabilistic model from a series of observations, is exactly equivalent to using an extended Kalman filter to estimate the parameter (assumed to have constant dynamics).
In the i.i.d. case, this relation is a consequence of the "information filter" phrasing of the extended Kalman filter. In the recurrent (state space, non-i.i.d.) case, we prove that the joint Kalman filter over states and parameters is a natural gradient on top of real-time recurrent learning (RTRL), a classical algorithm to train recurrent models.
This exact algebraic correspondence provides relevant settings for natural gradient hyperparameters such as learning rates or initialization and regularization of the Fisher information matrix.
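
The simplest instance of this correspondence can be checked numerically. The sketch below assumes a scalar parameter observed as y = theta + Gaussian noise of known variance `sigma2`, with a Gaussian prior of variance `p0`. Under those assumptions (mine, chosen to keep the algebra short), a Kalman filter with constant dynamics and an online natural gradient step with learning rate 1/(t + n0), where n0 = sigma2/p0 counts the prior as pseudo-observations, produce the same estimate. The function names are illustrative, not taken from the paper.

```python
import random


def kalman_static(ys, theta0, p0, sigma2):
    """Scalar Kalman filter for a constant parameter observed as y = theta + noise."""
    theta, p = theta0, p0
    for y in ys:
        gain = p / (p + sigma2)        # Kalman gain
        theta = theta + gain * (y - theta)
        p = (1.0 - gain) * p           # posterior variance
    return theta


def online_natural_gradient(ys, theta0, p0, sigma2):
    """Online natural gradient ascent on the Gaussian log-likelihood."""
    theta = theta0
    fisher = 1.0 / sigma2              # Fisher information of one observation
    n0 = sigma2 / p0                   # prior expressed as pseudo-observations
    for t, y in enumerate(ys, start=1):
        grad = (y - theta) / sigma2    # d/dtheta of log p(y | theta)
        lr = 1.0 / (t + n0)            # learning rate implied by the filter
        theta = theta + lr * grad / fisher
    return theta


rng = random.Random(0)
observations = [2.0 + rng.gauss(0.0, 1.0) for _ in range(50)]
kf = kalman_static(observations, theta0=0.0, p0=4.0, sigma2=1.0)
ng = online_natural_gradient(observations, theta0=0.0, p0=4.0, sigma2=1.0)
print(kf, ng)
```

The match follows from the information form of the filter: the inverse posterior variance after t observations is 1/p0 + t/sigma2, so the gain equals 1/(t + sigma2/p0), which is exactly the step applied by the natural gradient update.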

Saturday, April 29, 2017

Saturday Morning Videos: #ICLR2017 videos

Here are the videos of this year's ICLR2017 meeting.

Monday April 24, 2017

Morning Session – Session Chair: Dhruv Batra

Opening remarks, video starts at 12:15

9.00 - 9.40 Invited talk 1: Eero Simoncelli Elucidating and testing hierarchical sensory models through synthesis, Video starts at 26:00

9.40 - 10.00 Contributed talk 1: End-to-end Optimized Image Compression Video starts at 1:08:00

10.00 - 10.20 Contributed talk 2: Amortised MAP Inference for Image Super-resolution Video starts at 1:26:30

10.30 - 12.30 Poster Session 1 (Conference Papers, Workshop Papers)

h/t Libby for catching this one. From this paper.

Afternoon Session – Session Chair: Joan Bruna (sponsored by Baidu)

14.30 - 15.10 Invited talk 2: Benjamin Recht What can Deep Learning learn from linear regression Video starts at 18:30

15.10 - 15.30 Contributed Talk 3: Understanding deep learning requires rethinking generalization - BEST PAPER AWARD, Video starts at 53:30

15.30 - 15.50 Contributed Talk 4: Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima Video starts at 1:12:30

15.50 - 16.10 Contributed Talk 5: Towards Principled Methods for Training Generative Adversarial Networks Video starts at 1:30:00

16.30 - 18.30 Poster Session 2 (Conference Papers, Workshop Papers)

Tuesday April 25, 2017

Morning Session – Session Chair: Tara Sainath (sponsored by Google)

9.00 - 9.40 Invited talk 1: Chloé-Agathe Azencott High dimensional feature selection in precision medicine Video starts at 13:24
9.40 - 10.00 Contributed talk 1: Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data - BEST PAPER AWARD, Video starts at 53:10
10.00 - 10.20 Contributed talk 2: Learning Graphical State Transitions Video starts at 1:14:50
10.30 - 12.30 Poster Session 1 (Conference Papers, Workshop Papers)

Afternoon Session – Session Chair: Raia Hadsell (sponsored by Amazon)

14.00 - 16.00 Poster Session 2 (Conference Papers, Workshop Papers)
16.15 - 17.00 Invited talk 2: Riccardo Zecchina Video starts at 7:05
17.00 - 17.20 Contributed Talk 3: Learning to Act by Predicting the Future Video starts at 53:50
17.20 - 17.40 Contributed Talk 4: Reinforcement Learning with Unsupervised Auxiliary Tasks Video starts at 1:15:30
17.40 - 18.00 Contributed Talk 5: Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic Video starts at 1:37:10

18.00 - 18.30 Group photo at the RCT Stadium

Wednesday April 26, 2017

Morning Session – Session Chair: Slav Petrov
9.00 - 9.40 Invited talk 1: Regina Barzilay Moving beyond supervised realm Video starts at 3:15 + last 8 minutes of this presentation on this video
9.40 - 10.00 Contributed talk 1: Learning End-to-End Goal-Oriented Dialog Video starts at 27:34
10.00 - 10.20 Contributed talk 2: Multi-Agent Cooperation and the Emergence of (Natural) Language Video starts at 6:00
10.30 - 12.30 Poster Session 1 (Conference Papers, Workshop Papers)

Afternoon Session – Session Chair: Navdeep Jaitly
14.30 - 15.10 Invited talk 2: Alex Graves, New Direction for Recurrent Neural Networks, Video starts at 4:10

15.10 - 15.30 Contributed Talk 3: Making Neural Programming Architectures Generalize via Recursion - BEST PAPER AWARD, Video starts at 50:12

15.30 - 15.50 Contributed Talk 4: Neural Architecture Search with Reinforcement Learning Video starts at 1:08:44

15.50 - 16.10 Contributed Talk 5: Optimization as a Model for Few-Shot Learning Video starts at 1:26:30

16.30 - 18.30 Poster Session 2 (Conference Papers, Workshop Papers)

Tuesday, April 25, 2017

#ICLR2017 Tuesday Afternoon Program

ICLR 2017 continues this afternoon in Toulon. There will be a blog post for each half day that features direct links to papers from the Open Review section. The meeting will be featured live on Facebook here: https://www.facebook.com/iclr.cc/ . If you want to say hi, I am around, and we're hiring.

14.00 - 16.00 Poster Session 2 (Conference Papers, Workshop Papers)
16.00 - 16.15 Coffee Break
16.15 - 17.00 Invited talk 2: Riccardo Zecchina
17.00 - 17.20 Contributed Talk 3: Learning to Act by Predicting the Future
17.20 - 17.40 Contributed Talk 4: Reinforcement Learning with Unsupervised Auxiliary Tasks
17.40 - 18.00 Contributed Talk 5: Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
18.00 - 18.10 Group photo at the Stade Félix Mayol
19.00 - 24.00 Gala dinner offered by ICLR

C1: Sigma Delta Quantized Networks (code)
C2: Paleo: A Performance Model for Deep Neural Networks
C3: DeepCoder: Learning to Write Programs
C4: Topology and Geometry of Deep Rectified Network Optimization Landscapes
C5: Incremental Network Quantization: Towards Lossless CNNs with Low-precision Weights
C6: Learning to Perform Physics Experiments via Deep Reinforcement Learning
C7: Decomposing Motion and Content for Natural Video Sequence Prediction
C8: Calibrating Energy-based Generative Adversarial Networks
C9: Pruning Convolutional Neural Networks for Resource Efficient Inference
C10: Incorporating long-range consistency in CNN-based texture generation (code)
C11: Lossy Image Compression with Compressive Autoencoders
C12: LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation
C13: Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data
C14: Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data
C15: Mollifying Networks
C16: beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework
C17: Categorical Reparameterization with Gumbel-Softmax
C18: Online Bayesian Transfer Learning for Sequential Data Modeling
C19: Latent Sequence Decompositions
C20: Density estimation using Real NVP
C21: Recurrent Batch Normalization
C22: SGDR: Stochastic Gradient Descent with Restarts
C23: Variable Computation in Recurrent Neural Networks
C24: Deep Variational Information Bottleneck
C25: SampleRNN: An Unconditional End-to-End Neural Audio Generation Model
C26: TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency
C27: Frustratingly Short Attention Spans in Neural Language Modeling
C28: Offline Bilingual Word Vectors, Orthogonal Transformations and the Inverted Softmax
C29: Learning a Natural Language Interface with Neural Programmer
C30: Designing Neural Network Architectures using Reinforcement Learning
C31: Metacontrol for Adaptive Imagination-Based Optimization (spaceship dataset)
C32: Recurrent Environment Simulators
C33: EPOpt: Learning Robust Neural Network Policies Using Model Ensembles

W1: Lifelong Perceptual Programming By Example
W2: Neu0
W3: Dance Dance Convolution
W4: Bit-Pragmatic Deep Neural Network Computing
W5: On Improving the Numerical Stability of Winograd Convolutions
W6: Fast Generation for Convolutional Autoregressive Models
W7: The Preimage of Rectifier Network Activities
W8: Training Triplet Networks with GAN
W9: On Robust Concepts and Small Neural Nets
W10: Pl@ntNet app in the era of deep learning
W11: Exponential Machines
W12: Online Multi-Task Learning Using Biased Sampling
W13: Online Structure Learning for Sum-Product Networks with Gaussian Leaves
W14: A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Samples
W15: Compositional Kernel Machines
W16: Loss is its own Reward: Self-Supervision for Reinforcement Learning
W17: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
W18: Precise Recovery of Latent Vectors from Generative Adversarial Networks
W19: Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization (code)

Friday, April 21, 2017

Random Feature Expansions for Deep Gaussian Processes / AutoGP: Exploring the Capabilities and Limitations of Gaussian Process Models - implementation -

[I will be at ICLR next week, let's grab some coffee if you are there]

Random Feature Expansions for Deep Gaussian Processes by Kurt Cutajar, Edwin V. Bonilla, Pietro Michiardi, Maurizio Filippone

The composition of multiple Gaussian Processes as a Deep Gaussian Process (DGP) enables a deep probabilistic nonparametric approach to flexibly tackle complex machine learning problems with sound quantification of uncertainty. Existing inference approaches for DGP models have limited scalability and are notoriously cumbersome to construct. In this work, we introduce a novel formulation of DGPs based on random feature expansions that we train using stochastic variational inference. This yields a practical learning framework which significantly advances the state-of-the-art in inference for DGPs, and enables accurate quantification of uncertainty. We extensively showcase the scalability and performance of our proposal on several datasets with up to 8 million observations, and various DGP architectures with up to 30 hidden layers.
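
As a rough illustration of the building block named in the abstract, the sketch below approximates an RBF kernel with random Fourier features, so that a kernel evaluation becomes a dot product between finite feature vectors. It covers a single layer only and is not the authors' implementation (which is linked just below); the feature count, lengthscale and test points are assumptions chosen for the example.

```python
import math
import random


def make_rff(input_dim, num_features, lengthscale, rng):
    """Draw the random projection for an RBF kernel with the given lengthscale."""
    weights = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(input_dim)]
               for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return weights, phases


def features(x, weights, phases):
    """Map x to sqrt(2/D) * cos(Wx + b), whose dot products estimate the kernel."""
    scale = math.sqrt(2.0 / len(weights))
    return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, phases)]


def rbf(x, y, lengthscale):
    """Exact RBF kernel exp(-||x - y||^2 / (2 * lengthscale^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


rng = random.Random(0)
weights, phases = make_rff(input_dim=2, num_features=4000, lengthscale=1.0, rng=rng)
x, y = [0.3, -0.5], [1.0, 0.2]
approx = sum(a * b for a, b in zip(features(x, weights, phases),
                                   features(y, weights, phases)))
exact = rbf(x, y, 1.0)
print(approx, exact)
```

Because the feature map is an ordinary parametric layer, stacking several of them gives a network-like model that can be trained with stochastic optimization, which is the scalability argument the abstract makes.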

A python / TensorFlow implementation can be found here: https://github.com/mauriziofilippone/deep_gp_random_features

AutoGP: Exploring the Capabilities and Limitations of Gaussian Process Models by Karl Krauth, Edwin V. Bonilla, Kurt Cutajar, Maurizio Filippone

We investigate the capabilities and limitations of Gaussian process models by jointly exploring three complementary directions: (i) scalable and statistically efficient inference; (ii) flexible kernels; and (iii) objective functions for hyperparameter learning alternative to the marginal likelihood. Our approach outperforms all previously reported GP methods on the standard MNIST dataset; performs comparatively to previous kernel-based methods using the RECTANGLES-IMAGE dataset; and breaks the 1% error-rate barrier in GP models using the MNIST8M dataset, showing along the way the scalability of our method at unprecedented scale for GP models (8 million observations) in classification problems. Overall, our approach represents a significant breakthrough in kernel methods and GP models, bridging the gap between deep learning approaches and kernel machines.

and here is a recent presentation by one of the authors: "Practical and Scalable Inference for Deep Gaussian Processes"

Saturday, March 25, 2017

Saturday Morning Video: #NIPS2016 Symposium, Recurrent Neural Networks and Other Machines that Learn Algorithms

From the page of the minisymposium:

Program
Full session videos are available here: Session 1, Session 2, Session 3.
We provide individual videos and slides below. You can also watch this Playlist.

2:00 - 2:20 Jürgen Schmidhuber
Introduction to Recurrent Neural Networks and Other Machines that Learn Algorithms
Slides Video

2:20 - 2:40 Paul Werbos
Deep Learning in Recurrent Networks: From Basics To New Data on the Brain
Slides Video

2:40 - 3:00 Li Deng
Three Cool Topics on RNN
Slides Video

3:00 - 3:20 Risto Miikkulainen
Scaling Up Deep Learning through Neuroevolution
Slides Video

3:20 - 3:40 Jason Weston
New Tasks and Architectures for Language Understanding and Dialogue with Memory
Slides Video

3:40 - 4:00 Oriol Vinyals
Recurrent Nets Frontiers
Slides Unavailable Video

4:00 - 4:30 Coffee Break

4:30 - 4:50 Mike Mozer
Neural Hawkes Process Memories
Slides Video

4:50 - 5:10 Ilya Sutskever
Meta Learning in the Universe
Slides Video

5:10 - 5:30 Marcus Hutter
Asymptotically fastest solver of all well-defined problems
Slides Video

(unfortunately cannot come - J. Schmidhuber will stand in for him)

5:30 - 5:50 Nando de Freitas
Learning to Learn, to Program, to Explore and to Seek Knowledge
Slides Video

5:50 - 6:10 Alex Graves
Differentiable Neural Computer
Slides Video

6:30 - 7:30 Light dinner break/Posters

7:30 - 7:50 Nal Kalchbrenner
Generative Modeling as Sequence Learning
Slides Video

7:50 - 9:00 Panel Discussion
Topic: The future of machines that learn algorithms
Panelists: Ilya Sutskever, Jürgen Schmidhuber, Li Deng, Paul Werbos, Risto Miikkulainen, Sepp Hochreiter
Moderator: Alex Graves
Video

Monday, March 20, 2017

Learning in the Machine: Random Backpropagation and the Learning Channel

Carlos Perez's blog entry on Medium entitled Deep Learning: The Unreasonable Effectiveness of Randomness just led me to the following paper I had not read before (probably because it came out during NIPS). I also added the latest version of Arild Nøkland's earlier paper on a similar idea that was itself published at NIPS (and featured on Nuit Blanche).

Learning in the Machine: Random Backpropagation and the Learning Channel by Pierre Baldi, Peter Sadowski, Zhiqin Lu

Random backpropagation (RBP) is a variant of the backpropagation algorithm for training neural networks, where the transposes of the forward matrices are replaced by fixed random matrices in the calculation of the weight updates. It is remarkable both because of its effectiveness, in spite of using random matrices to communicate error information, and because it completely removes the taxing requirement of maintaining symmetric weights in a physical neural system. To better understand random backpropagation, we first connect it to the notions of local learning and the learning channel. Through this connection, we derive several alternatives to RBP, including skipped RBP (SRBP), adaptive RBP (ARBP), sparse RBP, and their combinations (e.g. ASRBP) and analyze their computational complexity. We then study their behavior through simulations using the MNIST and CIFAR-10 benchmark datasets. These simulations show that most of these variants work robustly, almost as well as backpropagation, and that multiplication by the derivatives of the activation functions is important. As a follow-up, we also study the low-end of the number of bits required to communicate error information over the learning channel. We then provide partial intuitive explanations for some of the remarkable properties of RBP and its variations. Finally, we prove several mathematical results, including the convergence to fixed points of linear chains of arbitrary length, the convergence to fixed points of linear autoencoders with decorrelated data, the long-term existence of solutions for linear systems with a single hidden layer, and the convergence to fixed points of non-linear chains, when the derivative of the activation functions is included.
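
The core substitution is small enough to show on a toy problem. The sketch below trains a one-hidden-layer tanh network on a made-up regression target: the output layer gets its exact gradient, while the hidden layer receives the error through a fixed random `feedback` vector instead of the transpose of the output weights, and that signal is still multiplied by the activation derivative. Network size, data, learning rate and initialization ranges are assumptions; this is not the authors' experimental setup.

```python
import math
import random


def train_random_backprop(epochs=200, hidden=8, lr=0.05, seed=0):
    rng = random.Random(seed)
    # Toy regression data: y = 0.5*x1 - 0.3*x2 (hypothetical target).
    data = []
    for _ in range(20):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 0.5 * x[0] - 0.3 * x[1]))
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]
    # Fixed random feedback weights, used in place of the transpose of w2.
    feedback = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]

    def forward(x):
        h = [math.tanh(row[0] * x[0] + row[1] * x[1]) for row in w1]
        return h, sum(w * v for w, v in zip(w2, h))

    def mean_loss():
        return sum((forward(x)[1] - y) ** 2 for x, y in data) / len(data)

    initial = mean_loss()
    for _ in range(epochs):
        for x, y in data:
            h, out = forward(x)
            err = out - y
            for j in range(hidden):
                # Error reaches the hidden unit through the random feedback
                # weight and is multiplied by the tanh derivative.
                delta = feedback[j] * err * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]  # exact gradient direction for the output layer
                w1[j][0] -= lr * delta * x[0]
                w1[j][1] -= lr * delta * x[1]
    return initial, mean_loss()


initial_loss, final_loss = train_random_backprop()
print(initial_loss, final_loss)
```

With a single hidden layer, random backpropagation and the direct feedback variant described in the next abstract coincide; they only differ once there are several hidden layers to route the error to.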

Direct Feedback Alignment Provides Learning in Deep Neural Networks by Arild Nøkland

Artificial neural networks are most commonly trained with the back-propagation algorithm, where the gradient for learning is provided by back-propagating the error, layer by layer, from the output layer to the hidden layers. A recently discovered method called feedback-alignment shows that the weights used for propagating the error backward don't have to be symmetric with the weights used for propagating the activation forward. In fact, random feedback weights work equally well, because the network learns how to make the feedback useful. In this work, the feedback alignment principle is used for training hidden layers more independently from the rest of the network, and from a zero initial condition. The error is propagated through fixed random feedback connections directly from the output layer to each hidden layer. This simple method is able to achieve zero training error even in convolutional networks and very deep networks, completely without error back-propagation. The method is a step towards biologically plausible machine learning because the error signal is almost local, and no symmetric or reciprocal weights are required. Experiments show that the test performance on MNIST and CIFAR is almost as good as that obtained with back-propagation for fully connected networks. If combined with dropout, the method achieves 1.45% error on the permutation invariant MNIST task.

Wednesday, January 25, 2017

Continuing To Learn the Structure of Learning

Following up on the Learning the structure of learning blog entry, here are a few papers mentioned in the open review of Neural Architecture Search with Reinforcement Learning, which required a total of 12,800 different architecture trials.

Learning to reinforcement learn by Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick

In recent years deep reinforcement learning (RL) systems have attained superhuman performance in a number of challenging task domains. However, a major limitation of such applications is their demand for massive amounts of training data. A critical present objective is thus to develop deep RL methods that can adapt rapidly to new tasks. In the present work we introduce a novel approach to this challenge, which we refer to as deep meta-reinforcement learning. Previous work has shown that recurrent networks can support meta-learning in a fully supervised context. We extend this approach to the RL setting. What emerges is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure. This second, learned RL algorithm can differ from the original one in arbitrary ways. Importantly, because it is learned, it is configured to exploit structure in the training domain. We unpack these points in a series of seven proof-of-concept experiments, each of which examines a key aspect of deep meta-RL. We consider prospects for extending and scaling up the approach, and also point out some potentially important implications for neuroscience.

Learning to Learn for Global Optimization of Black Box Functions by Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Nando de Freitas

We present a learning to learn approach for training recurrent neural networks to perform black-box global optimization. In the meta-learning phase we use a large set of smooth target functions to learn a recurrent neural network (RNN) optimizer, which is either a long-short term memory network or a differentiable neural computer. After learning, the RNN can be applied to learn policies in reinforcement learning, as well as other black-box learning tasks, including continuous correlated bandits and experimental design. We compare this approach to Bayesian optimization, with emphasis on the issues of computation speed, horizon length, and exploration-exploitation trade-offs.

Towards Automatically-Tuned Neural Networks by Hector Mendoza, Aaron Klein, Matthias Feurer, Jost Tobias Springenberg, Frank Hutter

Recent advances in AutoML have led to automated tools that can compete with machine learning experts on supervised learning tasks. However, current AutoML tools do not yet support modern neural networks effectively.
In this work, we present a first version of Auto-Net, which provides automatically-tuned feed-forward neural networks without any human intervention. We report results on datasets from the recent AutoML challenge showing that ensembling Auto-Net with Auto-sklearn often performs better than either alone, and report the first results on winning a competition dataset against human experts with automatically-tuned neural networks.

Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures by James Bergstra, Daniel Yamins, David Cox

Many computer vision algorithms depend on configuration settings that are typically hand-tuned in the course of evaluating the algorithm for a particular data set. While such parameter tuning is often presented as being incidental to the algorithm, correctly setting these parameter choices is frequently critical to realizing a method’s full potential. Compounding matters, these parameters often must be re-tuned when the algorithm is applied to a new problem domain, and the tuning process itself often depends on personal experience and intuition in ways that are hard to quantify or describe. Since the performance of a given technique depends on both the fundamental quality of the algorithm and the details of its tuning, it is sometimes difficult to know whether a given technique is genuinely better, or simply better tuned. In this work, we propose a meta-modeling approach to support automated hyperparameter optimization, with the goal of providing practical tools that replace hand-tuning with a reproducible and unbiased optimization process. Our approach is to expose the underlying expression graph of how a performance metric (e.g. classification accuracy on validation examples) is computed from hyperparameters that govern not only how individual processing steps are applied, but even which processing steps are included. A hyperparameter optimization algorithm transforms this graph into a program for optimizing that performance metric. Our approach yields state of the art results on three disparate computer vision problems: a face-matching verification task (LFW), a face identification task (PubFig83) and an object recognition task (CIFAR-10), using a single broad class of feed-forward vision architectures.
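
The idea of hyperparameters that decide which processing steps exist at all can be sketched with a conditional search space. In the example below, plain random search is swapped in for the paper's optimization algorithm to keep the code short, and the parameter names, their ranges and `toy_metric` (a stand-in for training a model and scoring it on validation data) are all hypothetical.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from a conditional (tree-structured) space."""
    config = {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform
        "num_layers": rng.randint(1, 3),
    }
    # The number of per-layer choices depends on an earlier choice.
    config["filters"] = [rng.choice([16, 32, 64])
                         for _ in range(config["num_layers"])]
    # Whether a step is included at all is itself a hyperparameter; its own
    # parameter (epsilon) only exists when the step is switched on.
    if rng.random() < 0.5:
        config["normalization"] = {"epsilon": 10 ** rng.uniform(-6, -2)}
    else:
        config["normalization"] = None
    return config


def toy_metric(config):
    """Hypothetical validation score; a real one would train and evaluate a model."""
    score = -abs(math.log10(config["learning_rate"]) + 2.5)
    score += 0.1 * config["num_layers"]
    if config["normalization"] is not None:
        score += 0.2
    return score


def random_search(trials=50, seed=0):
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = sample_config(rng)
        score = toy_metric(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search()
print(best_config, best_score)
```

Fixing the seed and recording every sampled configuration is what makes the tuning process reproducible, which is the practical point the abstract argues for.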

Convolutional Neural Fabrics by Shreyas Saxena, Jakob Verbeek

Despite the success of CNNs, selecting the optimal architecture for a given task remains an open problem. Instead of aiming to select a single optimal architecture, we propose a "fabric" that embeds an exponentially large number of architectures. The fabric consists of a 3D trellis that connects response maps at different layers, scales, and channels with a sparse homogeneous local connectivity pattern. The only hyper-parameters of a fabric are the number of channels and layers. While individual architectures can be recovered as paths, the fabric can in addition ensemble all embedded architectures together, sharing their weights where their paths overlap. Parameters can be learned using standard methods based on back-propagation, at a cost that scales linearly in the fabric size. We present benchmark results competitive with the state of the art for image classification on MNIST and CIFAR10, and for semantic segmentation on the Part Labels dataset.

Image Credit: NASA/JPL-Caltech/Univ. of Arizona, Earth and the Moon as seen from HiRiSe

Friday, January 20, 2017

Learning the structure of learning

If anything, there has been a flurry of effort in learning the structure of new learning architectures. Here is an ICLR2017 paper on the subject of meta learning and posters of the recent NIPS symposium on the topic.

Neural Architecture Search with Reinforcement Learning, Barret Zoph, Quoc Le (Open Review is here)

Abstract: Neural networks are powerful and flexible models that work well for many difficult learning tasks in image, speech and natural language understanding. Despite their success, neural networks are still hard to design. In this paper, we use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set. On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.65, which is 0.09 percent better and 1.05x faster than the previous state-of-the-art model that used a similar architectural scheme. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. The cell can also be transferred to the character language modeling task on PTB and achieves a state-of-the-art perplexity of 1.214.
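
The training signal described in the abstract is a policy gradient: sample an architecture, score it, and raise the probability of the choices that scored above a baseline. The sketch below shows that REINFORCE update on a deliberately simplified controller, with independent softmax logits per decision instead of the paper's recurrent network. The two decisions, their options and `toy_reward` (a stand-in for validation accuracy) are assumptions for illustration only.

```python
import math
import random

# Toy decision list (assumption): one categorical choice per entry.
CHOICES = [
    ("filters", [16, 32, 64]),
    ("kernel", [1, 3, 5]),
]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(arch):
    """Hypothetical accuracy: prefers more filters and a kernel size of 3."""
    return 0.5 * (arch["filters"] / 64) + 0.5 * (1.0 if arch["kernel"] == 3 else 0.0)


def train_controller(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = {name: [0.0] * len(options) for name, options in CHOICES}
    baseline = 0.0  # moving average of rewards, used to reduce variance
    for _ in range(steps):
        picks = {}
        for name, options in CHOICES:
            probs = softmax(logits[name])
            picks[name] = rng.choices(range(len(options)), weights=probs)[0]
        arch = {name: options[picks[name]] for name, options in CHOICES}
        reward = toy_reward(arch)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for name, options in CHOICES:
            probs = softmax(logits[name])
            for i in range(len(options)):
                # Gradient of log softmax w.r.t. logit i: indicator - probability.
                indicator = 1.0 if i == picks[name] else 0.0
                logits[name][i] += lr * advantage * (indicator - probs[i])
    return {name: softmax(logits[name]) for name, _ in CHOICES}


policy = train_controller()
print(policy)
```

In the real setting each reward requires training a child network, so the number of sampled architectures dominates the cost of the whole search.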

At NIPS, we had the Symposium on Recurrent Neural Networks and Other Machines that Learn Algorithms

Jürgen Schmidhuber, Introduction to Recurrent Neural Networks and Other Machines that Learn Algorithms
Paul Werbos, Deep Learning in Recurrent Networks: From Basics To New Data on the Brain
Li Deng, Three Cool Topics on RNN
Risto Miikkulainen, Scaling Up Deep Learning through Neuroevolution
Jason Weston, New Tasks and Architectures for Language Understanding and Dialogue with Memory
Oriol Vinyals, Recurrent Nets Frontiers
Mike Mozer, Neural Hawkes Process Memories
Ilya Sutskever, Using a slow RL algorithm to learn a fast RL algorithm using recurrent neural networks (Arxiv)
Marcus Hutter, Asymptotically fastest solver of all well-defined problems
Nando de Freitas, Learning to Learn, to Program, to Explore and to Seek Knowledge (Video)
Alex Graves, Differentiable Neural Computer
Nal Kalchbrenner, Generative Modeling as Sequence Learning
Panel Discussion Topic: The future of machines that learn algorithms, Panelists: Ilya Sutskever, Jürgen Schmidhuber, Li Deng, Paul Werbos, Risto Miikkulainen, Sepp Hochreiter, Moderator: Alex Graves

Posters of the recent NIPS2016 workshop

Real-time interactive sequence generation and control with Recurrent Neural Network ensembles by Memo Akten and Mick Grierson
A Neural Forth Abstract Machine by Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky and Sebastian Riedel
Log-Linear RNNs : Towards Recurrent Neural Networks with Flexible Prior Knowledge by Marc Dymetman and Chunyang Xiao
Similarity-based LSTMs for Time Series Representation Learning in the Presence of Structured Covariates by Madalina Fiterau, Jason Fries, Eni Halilaj, Nopphon Siranart, Suvrat Bhooshan and Christopher Ré
Neural Machine Translation with Characters and Hierarchical Encoding by Alexander Rosenberg Johansen, Jonas Meinertz Hansen, Elias Khazen Obeid, Casper Kaae Sønderby and Ole Winther
Supervised learning with information penalties by Artemy Kolchinsky and David H. Wolpert
Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations by David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Aaron Courville and Christopher Pal
Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision by Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus and Ni Lao
Recurrent Highway Networks by Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník and Jürgen Schmidhuber
