Sebastien gave four lectures on Bandit Convex Optimization for the Gaspard Monge Program in Optimization. Two of them are on Sebastien's YouTube channel. Here is the abstract:
The multi-armed bandit and its variants have been around for more than 80 years, with applications ranging from medical trials in the 1930s to ad placement in the 2010s. In this mini-course I will focus on a groundbreaking model introduced in the 1990s which gets rid of the unrealistic i.i.d. assumption that is standard in statistics and learning theory. This paradigm shift leads to exciting new mathematical and algorithmic challenges. I will focus the lectures on the foundational results of this burgeoning field, as well as their connections with classical problems in mathematics such as the geometry of martingales and high-dimensional phenomena.
- Lecture 1: Introduction to regret. Game theoretic viewpoint (duality, Bayesian version of the game) and derivation of the minimax regret via geometry of martingales (brief recall of type/cotype and entropic proof for ell_1).
- Lecture 2: Introduction to the mirror descent algorithm. Connections with competitive analysis in online computation will also be discussed.
- Lecture 3: Bandit Linear Optimization. Two proofs of optimal regret: one via low-rank decomposition in the information theoretic argument, and the other via mirror descent with self-concordant barriers.
- Lecture 4: Bandit Convex Optimization. 1. Kernel methods for online learning, Bernoulli-convolution-based kernel. 2. Gaussian approximation of Bernoulli convolutions, and restart-type strategies.
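To give a flavor of the mirror descent algorithm mentioned in Lecture 2, here is a minimal sketch of its best-known instance: mirror descent with the negative-entropy mirror map on the probability simplex (the exponentiated-gradient / multiplicative-weights update), run on full-information online linear losses. This is an illustrative toy, not the bandit setting of the lectures; the function name, step size `eta`, and random loss sequence are all assumptions for the example.

```python
import numpy as np

def exponentiated_gradient(losses, eta=0.1):
    """Mirror descent with the negative-entropy mirror map on the simplex.

    `losses` is a (T, n) array of per-round linear loss vectors in [0, 1].
    Returns the algorithm's cumulative loss and its final iterate.
    """
    n = losses.shape[1]
    x = np.full(n, 1.0 / n)            # start at the uniform distribution
    total_loss = 0.0
    for loss_t in losses:              # one full-information round per row
        total_loss += float(loss_t @ x)
        x = x * np.exp(-eta * loss_t)  # multiplicative-weights step
        x /= x.sum()                   # normalize back onto the simplex
    return total_loss, x

# Toy run: T = 200 rounds, n = 5 actions, i.i.d. uniform losses.
rng = np.random.default_rng(0)
losses = rng.random((200, 5))
alg_loss, x = exponentiated_gradient(losses)
best_fixed = losses.sum(axis=0).min()  # cumulative loss of the best fixed action
regret = alg_loss - best_fixed
```

The quantity `regret` is exactly the regret notion introduced in Lecture 1: the gap between the algorithm's cumulative loss and that of the best fixed action in hindsight. For losses in [0, 1] this update enjoys the standard bound regret <= log(n)/eta + eta*T/2, roughly 26 for these parameters.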
Bandit Convex Optimization, PGMO Lecture 1 (slides)
Liked this entry? Subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on LinkedIn.