Nuit Blanche: 09/01/2016

Friday, September 30, 2016

CfP: ESANN2017 Special sessions, 26 - 28 April 2017

Matthieu just sent me the following:

Dear Igor,

I hope you are doing well. I am writing you because I would really appreciate if you could share some information which should interest your blog readers.

Gilles Delmaire, Gilles Roussel, and myself are organizing a special session on Environmental Signal Processing for the 25th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN https://www.elen.ucl.ac.be/esann/) but it would be fairer to promote all the special sessions (https://www.elen.ucl.ac.be/esann/index.php?pg=specsess), as all of them may interest the Nuit Blanche readers. I think it will really be worth attending ESANN in April 2017, in the magnificent city of Brugge!
All the special session organizers sent submission invitations to potential participants but, at least in our case, if we have forgotten a Nuit Blanche reader who wishes to submit a paper to our session, we would appreciate being contacted first.
Thank you in advance for your help.
Best regards,
Matthieu
-- Matthieu PUIGT, Ph.D.

Thanks Matthieu! According to the main page, prospective authors are invited to submit their contributions before 19 November 2016.

The following special sessions will be organized at ESANN2017:

Environmental signal processing: new trends and applications
Organized by Gilles Delmaire, Matthieu Puigt, Gilles Roussel (Université du Littoral Côte d'Opale, France)
Biomedical data analysis in translational research: integration of expert knowledge and interpretable models
Organized by Gyan Bhanot (Rutgers University, New Jersey, USA), Michael Biehl (University of Groningen, The Netherlands), Thomas Villmann (Univ. of Applied Sciences Mittweida, Germany), Dietlind Zühlke (Seven Principles, Germany)
Processing, Mining and Visualizing Massive Urban Data
Organized by Etienne Côme (UPE-Ifsttar, France), Pierre Borgnat (ENS Lyon, France), Latifa Oukhellou (UPE-Ifsttar, France)
Randomized Machine Learning approaches: analysis and developments
Organized by Claudio Gallicchio (University of Pisa, Italy), José D. Martín-Guerrero (University of Valencia, Spain), Alessio Micheli (University of Pisa, Italy), Emilio Soria (University of Valencia, Spain)
Deep and kernel methods: best of two worlds
Organized by Lluís A. Belanche, Marta R. Costa-jussà (Universitat Politècnica de Catalunya, Barcelona, Spain)
Algorithmic Challenges in Big Data Analytics
Organized by Veronica Bolon-Canedo, Amparo Alonso-Betanzos (University of A Coruña, Spain), Beatriz Remeseiro (University of Barcelona, Spain), David Martinez-Rego (University College London, UK), Konstantinos Sechidis (University of Manchester, UK)
Machine Learning in Biomorphic Robots
Organized by Nigel Crook, Matthias Rolf, Tjeerd olde Scheper (Oxford Brookes Univ., UK)

Photo Copyright ESA/Rosetta/MPS for OSIRIS Team MPS/UPD/LAM/IAA/SSO/INTA/UPM/DASP/IDA

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

L1-PCA, Online Sparse PCA and Discretization and Minimization of the L1 Norm on Manifolds

Coming back to some of the themes around Matrix Factorizations, the L1 norm and phase transitions:

Iteratively Reweighted Least Squares Algorithms for L1-Norm Principal Component Analysis by Young Woong Park, Diego Klabjan

Principal component analysis (PCA) is often used to reduce the dimension of data by selecting a few orthonormal vectors that explain most of the variance structure of the data. L1 PCA uses the L1 norm to measure error, whereas the conventional PCA uses the L2 norm. For the L1 PCA problem minimizing the fitting error of the reconstructed data, we propose an exact reweighted and an approximate algorithm based on iteratively reweighted least squares. We provide convergence analyses, and compare their performance against benchmark algorithms in the literature. The computational experiment shows that the proposed algorithms consistently perform best.

Online Learning for Sparse PCA in High Dimensions: Exact Dynamics and Phase Transitions by Chuang Wang, Yue M. Lu

We study the dynamics of an online algorithm for learning a sparse leading eigenvector from samples generated from a spiked covariance model. This algorithm combines the classical Oja's method for online PCA with an element-wise nonlinearity at each iteration to promote sparsity. In the high-dimensional limit, the joint empirical measure of the underlying sparse eigenvector and its estimate provided by the algorithm is shown to converge weakly to a deterministic, measure-valued process. This scaling limit is characterized as the unique solution of a nonlinear PDE, and it provides exact information regarding the asymptotic performance of the algorithm. For example, performance metrics such as the cosine similarity and the misclassification rate in sparse support recovery can be obtained by examining the limiting dynamics. A steady-state analysis of the nonlinear PDE also reveals an interesting phase transition phenomenon. Although our analysis is asymptotic in nature, numerical simulations show that the theoretical predictions are accurate for moderate signal dimensions.

Consistent Discretization and Minimization of the L1 Norm on Manifolds by Alex Bronstein, Yoni Choukroun, Ron Kimmel, Matan Sela

The L1 norm has been tremendously popular in signal and image processing in the past two decades due to its sparsity-promoting properties. More recently, its generalization to non-Euclidean domains has been found useful in shape analysis applications. For example, in conjunction with the minimization of the Dirichlet energy, it was shown to produce a compactly supported quasi-harmonic orthonormal basis, dubbed as compressed manifold modes. The continuous L1 norm on the manifold is often replaced by the vector l1 norm applied to sampled functions. We show that such an approach is incorrect in the sense that it does not consistently discretize the continuous norm and warn against its sensitivity to the specific sampling. We propose two alternative discretizations resulting in an iteratively reweighted l2 norm. We demonstrate the proposed strategy on the compressed modes problem, which reduces to a sequence of simple eigendecomposition problems not requiring non-convex optimization on Stiefel manifolds and producing more stable and accurate results.
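
The sensitivity to sampling that this abstract warns about can be seen in a toy setting. The sketch below is my own illustration, not one of the two discretizations proposed in the paper: it replaces the manifold by the interval [0, 1], takes f(x) = x - 0.5 (whose continuous L1 norm is 0.25), and compares the plain vector l1 norm with a version where each sample is weighted by the length of its cell. All names are hypothetical.

```python
import numpy as np


def naive_l1(f_values):
    """Plain vector l1 norm, normalized by the number of samples."""
    return np.sum(np.abs(f_values)) / f_values.size


def weighted_l1(x, f_values):
    """Length-weighted l1 norm: each sample is weighted by the size of its cell."""
    spacing = np.diff(x)
    weights = np.zeros_like(x)
    weights[:-1] += 0.5 * spacing   # half of the interval to the right
    weights[1:] += 0.5 * spacing    # half of the interval to the left
    return np.sum(weights * np.abs(f_values))


def f(x):
    return x - 0.5   # the continuous L1 norm of f on [0, 1] is 0.25


x_uniform = np.linspace(0.0, 1.0, 101)
x_clustered = x_uniform ** 2        # same number of samples, denser near 0

naive_uniform = naive_l1(f(x_uniform))
naive_clustered = naive_l1(f(x_clustered))
weighted_uniform = weighted_l1(x_uniform, f(x_uniform))
weighted_clustered = weighted_l1(x_clustered, f(x_clustered))
```

With the weights included, both samplings approximate the same integral; without them, the clustered sampling over-counts the region near 0 and the value drifts away from 0.25.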

Thursday, September 29, 2016

A Randomized Tensor Singular Value Decomposition based on the t-product

A Randomized Tensor Singular Value Decomposition based on the t-product by Jiani Zhang, Arvind K. Saibaba, Misha Kilmer, Shuchin Aeron

The tensor Singular Value Decomposition (t-SVD) for third order tensors that was proposed by Kilmer and Martin has been applied successfully in many fields, such as computed tomography, facial recognition, and video completion. In this paper, we propose a method that extends a well-known randomized matrix method to the t-SVD. This method can produce a factorization with similar properties to the t-SVD, but is more computationally efficient on very large datasets. We present details of the algorithm, theoretical results, and provide numerical results that show the promise of our approach for compressing and analyzing datasets. We also present an improved analysis of the randomized subspace iteration for matrices, which may be of independent interest to the scientific community.
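
For readers unfamiliar with the matrix building block, here is a minimal sketch of a randomized SVD with subspace iteration for an ordinary matrix. It is the generic Gaussian range-finder scheme, not the tensor algorithm of the paper; the oversampling value, the number of iterations and the function name are assumptions chosen for illustration.

```python
import numpy as np


def randomized_svd(a, rank, oversample=10, power_iters=2, seed=0):
    """Approximate truncated SVD via a Gaussian range finder with subspace iteration."""
    rng = np.random.default_rng(seed)
    m, n = a.shape
    k = min(rank + oversample, n)
    q, _ = np.linalg.qr(a @ rng.standard_normal((n, k)))   # sketch the range of a
    for _ in range(power_iters):
        # Subspace iteration, re-orthonormalizing at each half step for stability.
        z, _ = np.linalg.qr(a.T @ q)
        q, _ = np.linalg.qr(a @ z)
    b = q.T @ a                                             # small (k x n) matrix
    u_small, s, vt = np.linalg.svd(b, full_matrices=False)
    u = q @ u_small
    return u[:, :rank], s[:rank], vt[:rank]


# Hypothetical test matrix of exact rank 5.
rng = np.random.default_rng(42)
low_rank = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 80))
u, s, vt = randomized_svd(low_rank, rank=5)
rel_error = np.linalg.norm(low_rank - (u * s) @ vt) / np.linalg.norm(low_rank)
```

The expensive SVD is only ever applied to the small matrix b, which is where the savings on very large datasets come from.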

Coherence Pursuit: Fast, Simple, and Robust Principal Component Analysis

Interesting that such a simple algorithm could do the job:

Coherence Pursuit: Fast, Simple, and Robust Principal Component Analysis by Mostafa Rahmani, George Atia

This paper presents a remarkably simple, yet powerful, algorithm for robust Principal Component Analysis (PCA). In the proposed approach, an outlier is set apart from an inlier by comparing their coherence with the rest of the data points. As inliers lie on a low dimensional subspace, they are likely to have strong mutual coherence provided there are enough inliers. By contrast, outliers do not typically admit low dimensional structures, wherefore an outlier is unlikely to bear strong resemblance with a large number of data points. The mutual coherences are computed by forming the Gram matrix of normalized data points. Subsequently, the subspace is recovered from the span of a small subset of the data points that exhibit strong coherence with the rest of the data. As coherence pursuit only involves one simple matrix multiplication, it is significantly faster than the state-of-the-art robust PCA algorithms. We provide a mathematical analysis of the proposed algorithm under a random model for the distribution of the inliers and outliers. It is shown that the proposed method can recover the correct subspace even if the data is predominantly outliers. To the best of our knowledge, this is the first provable robust PCA algorithm that is simultaneously non-iterative, can tolerate a large number of outliers and is robust to linearly dependent outliers.
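
The recipe in this abstract is short enough to sketch. The code below follows the steps as described (normalize, form the Gram matrix, score each point by its coherence with the rest, keep the most coherent points, take their span), but it is not the authors' implementation: scoring with the sum of squared coherences, the number of selected points and all names are assumptions made for illustration.

```python
import numpy as np


def coherence_pursuit(data, rank, num_selected):
    """Estimate an orthonormal basis of the inlier subspace.

    data: array of shape (dim, num_points), one data point per column.
    """
    x = data / np.linalg.norm(data, axis=0, keepdims=True)  # normalize columns
    gram = x.T @ x                                           # mutual coherences
    np.fill_diagonal(gram, 0.0)                              # ignore self-coherence
    scores = np.sum(gram ** 2, axis=1)                       # coherence with the rest
    selected = np.argsort(scores)[-num_selected:]            # most coherent points
    u, _, _ = np.linalg.svd(x[:, selected], full_matrices=False)
    return u[:, :rank], scores


# Hypothetical test case: 100 inliers in a 3-dimensional subspace of R^50,
# plus 100 unstructured Gaussian outliers.
rng = np.random.default_rng(0)
dim, rank, num_inliers, num_outliers = 50, 3, 100, 100
basis, _ = np.linalg.qr(rng.standard_normal((dim, rank)))
inliers = basis @ rng.standard_normal((rank, num_inliers))
outliers = rng.standard_normal((dim, num_outliers))
data = np.hstack([inliers, outliers])

estimate, scores = coherence_pursuit(data, rank=rank, num_selected=10)
subspace_error = np.linalg.norm(basis @ basis.T - estimate @ estimate.T)
```

The only heavy operation is the single product x.T @ x, which matches the "one simple matrix multiplication" remark above; there is no iteration and no parameter to tune beyond how many points to keep.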

Wednesday, September 28, 2016

Large-Scale Strategic Games and Adversarial Machine Learning

Estimating the impact of random projections / Random Features on adversarial Machine Learning, I love it.

From the paper:

Specifically, nonzero-sum large-scale strategic games with high-dimensional continuous decision spaces and random projection methods are investigated as a starting point. Our investigation centers around the reduction of large-scale strategic games using transformations such as random projections and their effect on Nash Equilibrium solutions. Analytically tractable results are presented for quadratic games and in an adversarial machine learning setting.

Large-Scale Strategic Games and Adversarial Machine Learning by Tansu Alpcan, Benjamin I. P. Rubinstein, Christopher Leckie

Decision making in modern large-scale and complex systems such as communication networks, smart electricity grids, and cyber-physical systems motivate novel game-theoretic approaches. This paper investigates big strategic (non-cooperative) games where a finite number of individual players each have a large number of continuous decision variables and input data points. Such high-dimensional decision spaces and big data sets lead to computational challenges, relating to efforts in non-linear optimization scaling up to large systems of variables. In addition to these computational challenges, real-world players often have limited information about their preference parameters due to the prohibitive cost of identifying them or due to operating in dynamic online settings. The challenge of limited information is exacerbated in high dimensions and big data sets. Motivated by both computational and information limitations that constrain the direct solution of big strategic games, our investigation centers around reductions using linear transformations such as random projection methods and their effect on Nash equilibrium solutions. Specific analytical results are presented for quadratic games and approximations. In addition, an adversarial learning game is presented where random projection and sampling schemes are investigated.

Tuesday, September 27, 2016

LightOn, Forward We Go.

I officially mentioned LightOn back in June (LightOn. And so it begins.). Since then, we've been busy: talking to potential investors, getting some hardware up and running and then some. Here are some of the elements we put out on our press section:

LightOn is listed on the European Machine Intelligence Landscape by @ProjectJunoAI
LightOn is among the top finalists of the Hello Tomorrow Challenge @hellotmrc Top 6 in Data & AI track, September 20, 2016.
LightOn was mentioned in Jack Clark's Import.AI newsletter. September 12, 2016
LightOn has been selected as one of #Top500 deeptech startups of the Hello Tomorrow Challenge. September 6th, 2016
LightOn is featured in "Move over silicon: Machine learning boom means we need new chips" in New Scientist. The article appears in print under the headline “Making light work of AI”. August 23rd, 2016
One of our co-founders, Sylvain Gigan, received the 2016 Fabry - de Gramont prize from the Société Française d'Optique thanks in part to his applied work with LightOn. July 4, 2016

We are also on Twitter: @LightOnIO and LinkedIn.

Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network

In earlier times, people would talk about upsampling or super-resolution (see this eight-year-old blog entry entitled CS: Very High Speed Incoherent Projections for Superresolution, and a conference on how it is done for HDTV). In these times of the great convergence, CNNs come to the rescue (the second preprint explains one of the steps of the first paper, which was presented at CVPR):

Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network by Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P. Aitken, Rob Bishop, Daniel Rueckert, Zehan Wang

Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low resolution (LR) input image is upscaled to the high resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolution (SR) operation is performed in HR space. We demonstrate that this is sub-optimal and adds computational complexity. In this paper, we present the first convolutional neural network (CNN) capable of real-time SR of 1080p videos on a single K2 GPU. To achieve this, we propose a novel CNN architecture where the feature maps are extracted in the LR space. In addition, we introduce an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output. By doing so, we effectively replace the handcrafted bicubic filter in the SR pipeline with more complex upscaling filters specifically trained for each feature map, whilst also reducing the computational complexity of the overall SR operation. We evaluate the proposed approach using images and videos from publicly available datasets and show that it performs significantly better (+0.15dB on Images and +0.39dB on Videos) and is an order of magnitude faster than previous CNN-based methods.
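
The rearrangement at the heart of the sub-pixel convolution layer can be sketched in a few lines. The code below only shows the final step, where r * r low-resolution feature maps are interleaved into one image that is r times larger in each direction; the learned convolutions that produce those feature maps are omitted. The channel ordering used here is an assumption (it follows a common convention and may differ from the ordering in the paper), and the function name is hypothetical.

```python
import numpy as np


def pixel_shuffle(features, r):
    """Rearrange (C * r * r, H, W) low-resolution feature maps into (C, H * r, W * r)."""
    channels, height, width = features.shape
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c_out = channels // (r * r)
    # Split the channel axis into (C, r, r), then interleave the two r-axes
    # with the spatial axes so each group of r * r channels fills an r x r block.
    blocks = features.reshape(c_out, r, r, height, width)
    return blocks.transpose(0, 3, 1, 4, 2).reshape(c_out, height * r, width * r)


# Four constant 2 x 2 feature maps (values 0, 1, 2, 3) become one 4 x 4 image.
lr_features = np.arange(4, dtype=float).reshape(4, 1, 1) * np.ones((4, 2, 2))
hr_image = pixel_shuffle(lr_features, r=2)
```

Because the shuffle is a pure re-indexing, all the arithmetic happens on the small low-resolution maps, which is the source of the speed-up claimed in the abstract.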

Is the deconvolution layer the same as a convolutional layer? by Wenzhe Shi, Jose Caballero, Lucas Theis, Ferenc Huszar, Andrew Aitken, Christian Ledig, Zehan Wang

In this note, we want to focus on aspects related to two questions most people asked us at CVPR about the network we presented. Firstly, What is the relationship between our proposed layer and the deconvolution layer? And secondly, why are convolutions in low-resolution (LR) space a better choice? These are key questions we tried to answer in the paper, but we were not able to go into as much depth and clarity as we would have liked in the space allowance. To better answer these questions in this note, we first discuss the relationships between the deconvolution layer in the forms of the transposed convolution layer, the sub-pixel convolutional layer and our efficient sub-pixel convolutional layer. We will refer to our efficient sub-pixel convolutional layer as a convolutional layer in LR space to distinguish it from the common sub-pixel convolutional layer. We will then show that for a fixed computational budget and complexity, a network with convolutions exclusively in LR space has more representation power at the same speed than a network that first upsamples the input in high resolution space.

Monday, September 26, 2016

Book: Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto is here.

The 455-page draft of the second edition of Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto is here. Here is the table of contents:

Preface to the First Edition
Preface to the Second Edition
Summary of Notation
1 The Reinforcement Learning Problem
1.1 Reinforcement Learning
1.2 Examples
1.3 Elements of Reinforcement Learning
1.4 Limitations and Scope
1.5 An Extended Example: Tic-Tac-Toe
1.6 Summary
1.7 History of Reinforcement Learning
1.8 Bibliographical Remarks
I Tabular Solution Methods
2 Multi-arm Bandits
2.1 A k-Armed Bandit Problem
2.2 Action-Value Methods
2.3 Incremental Implementation
2.4 Tracking a Nonstationary Problem
2.5 Optimistic Initial Values
2.6 Upper-Confidence-Bound Action Selection
2.7 Gradient Bandit Algorithms
2.8 Associative Search (Contextual Bandits)
2.9 Summary
3 Finite Markov Decision Processes
3.1 The Agent-Environment Interface
3.2 Goals and Rewards
3.3 Returns
3.4 Unified Notation for Episodic and Continuing Tasks
3.5 The Markov Property
3.6 Markov Decision Processes
3.7 Value Functions
3.8 Optimal Value Functions
3.9 Optimality and Approximation
3.10 Summary
4 Dynamic Programming
4.1 Policy Evaluation
4.2 Policy Improvement
4.3 Policy Iteration
4.4 Value Iteration
4.5 Asynchronous Dynamic Programming
4.6 Generalized Policy Iteration
4.7 Efficiency of Dynamic Programming
4.8 Summary
5 Monte Carlo Methods
5.1 Monte Carlo Prediction
5.2 Monte Carlo Estimation of Action Values
5.3 Monte Carlo Control
5.4 Monte Carlo Control without Exploring Starts
5.5 Off-policy Prediction via Importance Sampling
5.6 Incremental Implementation
5.7 Off-Policy Monte Carlo Control
5.8 Return-Specific Importance Sampling
5.9 Summary
6 Temporal-Difference Learning
6.1 TD Prediction
6.2 Advantages of TD Prediction Methods
6.3 Optimality of TD(0)
6.4 Sarsa: On-Policy TD Control
6.5 Q-learning: Off-Policy TD Control
6.6 Expected Sarsa
6.7 Maximization Bias and Double Learning
6.8 Games, Afterstates, and Other Special Cases
6.9 Summary
7 Multi-step Bootstrapping
7.1 n-step TD Prediction
7.2 n-step Sarsa
7.3 n-step Off-policy Learning by Importance Sampling
7.4 Off-policy Learning Without Importance Sampling: The n-step Tree Backup Algorithm
7.5 A Unifying Algorithm: n-step Q(σ)
7.6 Summary
8 Planning and Learning with Tabular Methods
8.1 Models and Planning
8.2 Dyna: Integrating Planning, Acting, and Learning
8.3 When the Model Is Wrong
8.4 Prioritized Sweeping
8.5 Planning as Part of Action Selection
8.6 Heuristic Search
8.7 Monte Carlo Tree Search
8.8 Summary
II Approximate Solution Methods
9 On-policy Prediction with Approximation
9.1 Value-function Approximation
9.2 The Prediction Objective (MSVE)
9.3 Stochastic-gradient and Semi-gradient Methods
9.4 Linear Methods
9.5 Feature Construction for Linear Methods
9.5.1 Polynomials
9.5.2 Fourier Basis
9.5.3 Coarse Coding
9.5.4 Tile Coding
9.5.5 Radial Basis Functions
9.6 Nonlinear Function Approximation: Artificial Neural Networks
9.7 Least-Squares TD
9.8 Summary
10 On-policy Control with Approximation
10.1 Episodic Semi-gradient Control
10.2 n-step Semi-gradient Sarsa
10.3 Average Reward: A New Problem Setting for Continuing Tasks
10.4 Deprecating the Discounted Setting
10.5 n-step Differential Semi-gradient Sarsa
10.6 Summary
11 Off-policy Methods with Approximation
11.1 Semi-gradient Methods
11.2 Baird's Counterexample
11.3 The Deadly Triad
12 Eligibility Traces
12.1 The λ-return
12.2 TD(λ)
12.3 An On-line Forward View
12.4 True Online TD(λ)
12.5 Dutch Traces in Monte Carlo Learning
13 Policy Gradient Methods
13.1 Policy Approximation and its Advantages
13.2 The Policy Gradient Theorem
13.3 REINFORCE: Monte Carlo Policy Gradient
13.4 REINFORCE with Baseline
13.5 Actor-Critic Methods
13.6 Policy Gradient for Continuing Problems (Average Reward Rate)
13.7 Policy Parameterization for Continuous Actions
III Looking Deeper
14 Psychology
14.1 Terminology
14.2 Prediction and Control
14.3 Classical Conditioning
14.3.1 The Rescorla-Wagner Model
14.3.2 The TD Model
14.3.3 TD Model Simulations
14.4 Instrumental Conditioning
14.5 Delayed Reinforcement
14.6 Cognitive Maps
14.7 Habitual and Goal-Directed Behavior
14.8 Summary
14.9 Conclusion
14.10 Bibliographical and Historical Remarks
15 Neuroscience
15.1 Neuroscience Basics
15.2 Reward Signals, Reinforcement Signals, Values, and Prediction Errors
15.3 The Reward Prediction Error Hypothesis
15.4 Dopamine
15.5 Experimental Support for the Reward Prediction Error Hypothesis
15.6 TD Error/Dopamine Correspondence
15.7 Neural Actor-Critic
15.8 Actor and Critic Learning Rules
15.9 Hedonistic Neurons
15.10 Collective Reinforcement Learning
15.11 Model-Based Methods in the Brain
15.12 Addiction
15.13 Summary
15.14 Conclusion
15.15 Bibliographical and Historical Remarks
16 Applications and Case Studies
16.1 TD-Gammon
16.2 Samuel's Checkers Player
16.3 The Acrobot
16.4 Watson's Daily-Double Wagering
16.5 Optimizing Memory Control
16.6 Human-Level Video Game Play
16.7 Mastering the Game of Go
16.8 Personalized Web Services
16.9 Thermal Soaring
17 Frontiers
17.1 The Unified View
References

Sunday, September 25, 2016

Sunday Morning Video: Bay Area Deep Learning School Day 2 Live streaming

The Bay Area Deep Learning School Day 2 starts streaming in 30 minutes at 9:00AM PST / 12PM EST / 5:00PM London time / 6:00PM Paris time and it's all here. The whole schedule is here. Yesterday's video of day 1 has already garnered 16,000 views.

Sunday Morning Videos: HORSE2016, On “Horses” and “Potemkin Villages” in Applied Machine Learning

While waiting for day 2 of the Bay Area Deep Learning School in three hours, here are the ten videos of presentations made at the HORSE2016 workshop (On “Horses” and “Potemkin Villages” in Applied Machine Learning) organized by Bob Sturm. Bob came to present something around that theme at the 9th meetup of Season 3 of the Paris Machine Learning meetup. With this workshop, the field and attendant issues are becoming more visible. This is outstanding, as it has a bearing on algorithm bias and explainability. To make the video-watching more targeted, Bob even included commentaries with embedded videos in his blog post. Here is the beginning of the blog entry; you should go there for the rest:

September 19, 2016 saw the successful premier edition of HORSE2016: On “Horses” and “Potemkin Villages” in Applied Machine Learning. I have now uploaded videos to the HORSE2016 YouTube channel, and posted slides to the HORSE2016 webpage. I embed the videos below with some commentary.
HORSE2016 had 10 speakers expound on a variety of interesting topics, and about 60 people in the audience. I am extremely pleased that the audience included several people from outside academia, including industry, government employees and artists. This shows how many have recognised the extent to which machine learning and artificial intelligence are impacting our daily lives. The issues explored at HORSE2016 are essential to ensuring this impact remains beneficial and not detrimental.
Here is my introductory presentation, “On Horse Taxonomy and Taxidermy”. This talk is all about “horses” in applied machine learning: what are they? Why is this important and relevant today? Why the metaphor, and why is it appropriate? I present an example “horse,” uncovered using an intervention experiment and a generation experiment. Finally, I discuss what a researcher should do if someone demonstrates their system is a “horse”.

Saturday, September 24, 2016

Saturday Morning Video: Bay Area Deep Learning School Day 1 Live streaming

So the Bay Area Deep Learning School has already started streaming and it's all here. The whole schedule is here.

Saturday

9:00-10:00

Introduction to Feedforward Neural Networks

Hugo Larochelle

10:15-11:45

Deep Learning for Computer Vision

I will cover the design of convolutional neural network (ConvNet) architectures for image understanding, the history of state of the art models on the ImageNet Large Scale Visual Recognition Challenge, and some of the most recent patterns of developments in this area. I will also talk about ConvNet architectures in the context of related visual recognition tasks such as object detection, segmentation, and video processing.

Andrej Karpathy

12:45-2:15

Deep Learning for NLP

I will describe the foundations of deep learning for natural language processing: word vectors, recurrent neural networks, tasks and models influenced by linguistics. I will end with some recent models that put together all these basic lego blocks into a very powerful deep architecture called dynamic memory network.

Richard Socher

2:45-3:45

Tensorflow Tutorial

Sherry Moore

4:00-5:30

Foundations of Deep Unsupervised Learning

Building intelligent systems that are capable of extracting meaningful representations from high-dimensional data lies at the core of solving many Artificial Intelligence tasks, including visual object recognition, information retrieval, speech perception, and language understanding. In this tutorial I will discuss mathematical basics of many popular unsupervised models, including Sparse Coding, Autoencoders, Restricted Boltzmann Machines (RBMs), Deep Boltzmann Machines (DBMs), and Variational Autoencoders (VAE). I will further demonstrate that these models are capable of extracting useful hierarchical representations from high dimensional data with applications in visual object recognition, information retrieval, and natural language processing. Finally, time permitting, I will briefly discuss models that can generate natural language descriptions (captions) of images, as well as generate images from captions using attention mechanism.

Ruslan Salakhutdinov

6:00-7:00

Nuts and bolts of applying deep learning

Andrew Ng

Sunday

9:00-10:30

Policy Gradients and Q-Learning: Rise to Power, Rivalry, and Reunification

I'll start by providing an overview of the state of the art in deep reinforcement learning, including recent applications to video games (e.g., Atari), board games (AlphaGo) and simulated robotics. Then I'll give a tutorial introduction to the two methods that lie at the core of these results: policy gradients and Q-learning. Finally, I'll present a new analysis that shows the close similarity between these two methods. A theme of the talk will be to not only ask "what works?", but also "when does it work?" and "why does it work?"; and to find the kind of answers that are actionable for tuning one's implementation and designing better algorithms.

John Schulman

10:45-11:45

Theano Tutorial

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently, on CPU or GPU. Since its introduction, Theano has been one of the most popular frameworks in the machine learning community, and multiple frameworks for deep learning have been built on top of it (Lasagne, Keras, Blocks, ...). This tutorial will focus first on the concepts behind Theano and how to build and evaluate simple expressions, and then we will see how more complex models can be defined and trained.

Pascal Lamblin

12:45-2:15

Deep Learning for Speech

Traditional speech recognition systems are built from numerous modules, each requiring its own challenging engineering. With deep learning it is now possible to create neural networks that perform most of the tasks of a traditional engine "end to end", dramatically simplifying the development of new speech systems and opening a path to human-level performance. In this tutorial, we will walk through the steps for constructing one type of end-to-end system similar to Baidu's "Deep Speech" model. We will put all of the pieces together to form a "scale model" of a state of the art speech system; small-scale versions of the neural networks now powering production speech engines.

Adam Coates

2:45-3:45

Torch Tutorial

Torch is an open platform for scientific computing in the Lua language, with a focus on machine learning, in particular deep learning. Torch is distinguished from other array libraries by having first-class support for GPU computation, and a clear, interactive and imperative style. Further, through the "NN" library, Torch has broad support for building and training neural networks by composing primitive blocks or layers together in compute graphs. Torch, although benefitting from extensive industry support, is a community owned and community developed ecosystem. All neural net libraries, including Torch NN, TensorFlow and Theano, rely on automatic differentiation (AD) to manage the computation of gradients of complex compositions of functions. I will present some general background on automatic differentiation (AD), which is the fundamental abstraction of gradient based optimization, and demonstrate Twitter's flexible implementation of AD in the library torch-autograd.

Alex Wiltschko

4:00-5:30

Sequence to Sequence Learning for NLP and Speech

I will first present the foundations of sequence to sequence (seq2seq) learning and attention models, and their applications in machine translation and speech recognition. Then I will discuss attention with pointers and functions. Finally I will describe how reinforcement learning can play a role in seq2seq and attention models.

Quoc Le

6:00-7:00

Foundations and Challenges of Deep Learning

Why is deep learning working as well as it does? What are some big challenges that remain ahead? This talk will first survey some key factors in the success of deep learning. First, from the context of the no-free-lunch theorem, we will discuss the expressive power of deep networks to capture abstract distributed representations. Second, we will discuss our surprising ability to actually optimize the parameters of neural networks in spite of their non-convexity. We will then consider a few challenges ahead, including the core representation question of disentangling the underlying explanatory factors of variation, especially with unsupervised learning, why this is important for bringing reinforcement learning to the next level, and optimization questions that remain challenging, such as learning of long-term dependencies, understanding the optimization landscape of deep networks, and how learning in brains remains a mystery worth attacking from the deep learning perspective.

Friday, September 23, 2016

It's Friday, it's Hamming's time: Call for deep learning research problems from your "problem surplus" stack

François Chollet, the creator of Keras, tweeted the following interesting proposition:

We're still collecting interesting research problems for our open science initiative: https://t.co/yIqIhTAbSS -don't hesitate to contribute!
-- François Chollet (@fchollet), September 22, 2016

Here it is:

Call for deep learning research problems

Are you a researcher? Then you probably have a "problem surplus": a list of interesting and important research problems that you don't have time to work on yourself. What if you could outsource some of these problems to distributed teams of motivated students and independent researchers looking to build experience in deep learning? You just have to submit the description of your problem, some pointers as to how to get started, and provide lightweight supervision along the way (occasionally answer questions, provide feedback, suggest experiments to try...).
What you get out of this:

Innovative solutions to research problems that matter to you.

Full credits for the value you provide along the research process.

New contacts among bright people outside of your usual circles.

A fun experience.

We are looking for both deep learning research problems, and problems from other fields that could be solved using deep learning.
Note that the information you submit here may be made public (except for your contact information). We will create a website listing the problems submitted, where people will be able to self-organize into teams dedicated to specific problems. You will be in contact with the people working on your problem via a mailing list. The research process will take place in the open, with communications being publicly available and code being released on GitHub.

Here are some problems:

Enhanced NAVCAM image of Comet 67P/C-G taken on 18 September 2016, 12.1 km from the nucleus centre. The scale is 1.0 m/pixel and the image measures about 1.1 km across. Credits: ESA/Rosetta/NAVCAM – CC BY-SA IGO 3.0

CSjobs: Lecturers (Assistant Professors) in: Machine Learning & Computer Vision; Robot Vision & Autonomous Systems

Mark just sent me the following:

Dear Igor, I thought that the Lecturer (Assistant Professor) jobs below may be of interest to Nuit Blanche readers. Best wishes, Mark

Here is the announcement:
----

Lecturers (Assistant Professors) in: Machine Learning & Computer Vision; Robot Vision & Autonomous Systems
Centre for Vision, Speech and Signal Processing (CVSSP) University of Surrey, UK
Salary: GBP 39,324 to 46,924 per annum
Closing Date: Monday 31 October 2016
https://jobs.surrey.ac.uk/070216
The University offers a unique opportunity for two individuals with outstanding research and leadership to join the Centre for Vision, Speech and Signal Processing (CVSSP).
The successful candidate is expected to build a research project portfolio to complement existing CVSSP strengths. The centre seeks to appoint two individuals with an excellent research track-record and international profile to lead future growth of research activities in one or more of the following areas:
* Machine Learning & Pattern Recognition
* Computer Vision
* Robot Vision & Autonomous Systems
* Intelligent Sensing and Sensor Networks
* Audio-Visual Signal and Media Processing
* Big Visual Data Understanding
* Machine Intelligence
We now seek individuals with strong research track-records and leadership potential who can develop the existing activities of CVSSP and exploit the synergetic possibilities that exist within the centre, across the University and regionally with UK industry. You will possess proven management and leadership qualities, demonstrating achievements in scholarship and research at a national and international level, and will have experience of teaching within HE.
CVSSP is one of the primary centres for computer vision & audio-visual signal processing in Europe with over 120 researchers, a grant portfolio of £18M and a track-record of pioneering research leading to technology transfer in collaboration with UK industry. CVSSP forms part of the Department of Electronic Engineering, recognised as a top department for both Teaching and Research: surrey.ac.uk/ee.
For an informal discussion, please contact Professor Adrian Hilton, Director of CVSSP (a.hilton@surrey.ac.uk). Further details of CVSSP: www.surrey.ac.uk/cvssp
Interviews are expected to take place in the week commencing 21st November 2016.
Further details: https://jobs.surrey.ac.uk/070216
We can offer a generous remuneration package, which includes relocation assistance where appropriate, an attractive research environment, the latest teaching facilities, and access to a variety of staff development opportunities.
We acknowledge, understand and embrace diversity.
-- Prof Mark D Plumbley
Professor of Signal Processing
Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey
Guildford, Surrey, GU2 7XH, UK
Email: m.plumbley@surrey.ac.uk
