We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to tradeoff exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning.
Page Views on Nuit Blanche since July 2010
Nuit Blanche community
@NuitBlog || Facebook || Reddit
Compressive Sensing on LinkedIn
Advanced Matrix Factorization on Linkedin ||
Thursday, June 29, 2017
Learning to Learn without Gradient Descent by Gradient Descent
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment