Monday, May 28, 2018

Adversarial Noise Layer: Regularize Neural Network By Adding Noise / Training robust models using Random Projection (implementation)

Using random projection to train models is a thing:




In this paper, we introduce a novel regularization method called Adversarial Noise Layer (ANL), which significantly improves the CNN's generalization ability by adding adversarial noise in the hidden layers. ANL is easy to implement and can be integrated with most CNN-based models. We compare the impact of different types of noise and visually demonstrate that adversarial noise guides CNNs to learn to extract cleaner feature maps, further reducing the risk of over-fitting. We also conclude that models trained with ANL are more robust to FGSM and IFGSM attacks. Code is available at: this https URL
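For readers who want to see the mechanics, here is a minimal sketch of a noise-injection layer on hidden activations (my own illustration, not the authors' code; their ANL computes the noise adversarially, from gradients, rather than sampling it at random as below):

```python
# Minimal sketch of a noise-injection layer (illustration only; the paper's ANL derives
# the noise adversarially from gradients instead of sampling it at random as done here).
# Assumes PyTorch; channel counts and the noise standard deviation are arbitrary.
import torch
import torch.nn as nn

class NoiseLayer(nn.Module):
    def __init__(self, std=0.1):
        super().__init__()
        self.std = std

    def forward(self, x):
        # Perturb hidden feature maps during training only; identity at test time.
        if self.training:
            return x + self.std * torch.randn_like(x)
        return x

# Drop the layer between the convolutional blocks of an ordinary CNN.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), NoiseLayer(0.1),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), NoiseLayer(0.1),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)
print(model(torch.randn(4, 3, 32, 32)).shape)  # torch.Size([4, 10])
```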


Regularization plays an important role in machine learning systems. We propose a novel methodology for model regularization using random projection. We demonstrate the technique on neural networks, since such models usually comprise a very large number of parameters, calling for strong regularizers. It has been shown recently that neural networks are sensitive to two kinds of samples: (i) adversarial samples, which are generated by imperceptible perturbations of previously correctly classified samples, yet the network will misclassify them; and (ii) fooling samples, which are completely unrecognizable, yet the network will classify them with extremely high confidence. In this paper, we show how robust neural networks can be trained using random projection. We show that while random projection acts as a strong regularizer, boosting model accuracy similarly to other regularizers such as weight decay and dropout, it is far more robust to adversarial noise and fooling samples. We further show that random projection also helps to improve the robustness of traditional classifiers, such as Random Forests and Gradient Boosting Machines.
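And for the second paper, here is a minimal sketch of the general recipe, a fixed random projection used as a front-end to one of the traditional classifiers mentioned in the abstract (my own scikit-learn illustration, not the authors' implementation; the digits dataset, the projection dimension and the downstream model are arbitrary choices):

```python
# Minimal sketch: a fixed random projection as a regularizing front-end to a classifier
# (illustration only, not the authors' implementation). Assumes scikit-learn.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.random_projection import GaussianRandomProjection

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Project the 64-dimensional inputs onto a random 32-dimensional subspace.
rp = GaussianRandomProjection(n_components=32, random_state=0)
X_train_rp = rp.fit_transform(X_train)   # draws the random matrix, then projects
X_test_rp = rp.transform(X_test)         # reuses the same fixed matrix

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train_rp, y_train)
print("test accuracy with a random projection front-end:", clf.score(X_test_rp, y_test))
```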




Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there!
Liked this entry? Subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on LinkedIn.

Monday, May 07, 2018

IMPAC IMaging-PsychiAtry Challenge: Predicting Autism, a data challenge on Autism Spectrum Disorder detection


I usually don't do advertisement for challenges, but this one is worth it. Balazs just sent me this:
Dear All, 
The Paris-Saclay CDS, Institut Pasteur, and IESF are launching the Autism Spectrum Disorder (ASD) classification event on RAMP.studio. ASD is a severe psychiatric disorder that affects 1 in 166 children. There is evidence that ASD is reflected in individuals' brain networks and anatomy. Yet, it remains unclear how systematic these effects are and how large their predictive power is. The large cohort assembled here can bring some answers. Predicting autism from brain imaging will provide biomarkers and shed some light on the mechanisms of the pathology. 
The goal of the challenge is to predict ASD (binary classification) from pre-processed structural and functional MRI on more than 2000 subjects. 
The RAMP will run in competitive mode until July 1st at 20h (UTC) and in collaborative (open code) mode between July 1st and the closing ceremony on July 6-7th. The starting kit repo provides detailed instructions on how to start. You can sign up at the Autism RAMP event.
Prizes
The Paris-Saclay CDS and IESF are sponsoring the competitive phase of the event:
  • 1st prize 3000€
  • 2nd prize 2000€
  • 3rd prize 1000€
  • from 4th to 10th place 500€

Launching hackathon
For those in the Paris area, we are organizing a launching hackathon at La Paillasse on May 14. Please sign up here if you are interested.
For more information please visit the event web page and join the slack team, #autism channel.
Best regards,
Balazs

Monday, April 30, 2018

Video, Preprint and Implementation: Measuring the Intrinsic Dimension of Objective Landscapes

While waiting for the Workshop on the Future of Random Projections this coming Wednesday (you can register here), here is a video presenting a paper that will be featured at ICLR this week and that relies on random projections!





In this video from Uber AI Labs, researchers Chunyuan Li and Jason Yosinski describe their ICLR 2018 paper "Measuring the Intrinsic Dimension of Objective Landscapes". The research, performed with co-authors Heerad Farkhoor and Rosanne Liu, develops intrinsic dimension as a fundamental property of neural networks. Intrinsic dimension quantifies the complexity of a model in a manner decoupled from its raw parameter count, and the paper provides a simple way of measuring this dimension using random projections. Many problems have smaller intrinsic dimension than one might suspect. By using intrinsic dimension to compare across problem domains, one may measure, for example, that solving the inverted pendulum problem is about 100 times easier than classifying digits from MNIST, and playing Atari Pong from pixels is about as hard as classifying CIFAR-10.





Many recently trained neural networks employ large numbers of parameters to achieve good performance. One may intuitively use the number of parameters required as a rough gauge of the difficulty of a problem. But how accurate are such notions? How many parameters are really needed? In this paper we attempt to answer this question by training networks not in their native parameter space, but instead in a smaller, randomly oriented subspace. We slowly increase the dimension of this subspace, note at which dimension solutions first appear, and define this to be the intrinsic dimension of the objective landscape. The approach is simple to implement, computationally tractable, and produces several suggestive conclusions. Many problems have smaller intrinsic dimensions than one might suspect, and the intrinsic dimension for a given dataset varies little across a family of models with vastly different sizes. This latter result has the profound implication that once a parameter space is large enough to solve a problem, extra parameters serve directly to increase the dimensionality of the solution manifold. Intrinsic dimension allows some quantitative comparison of problem difficulty across supervised, reinforcement, and other types of learning where we conclude, for example, that solving the inverted pendulum problem is 100 times easier than classifying digits from MNIST, and playing Atari Pong from pixels is about as hard as classifying CIFAR-10. In addition to providing new cartography of the objective landscapes wandered by parameterized models, the method is a simple technique for constructively obtaining an upper bound on the minimum description length of a solution. A byproduct of this construction is a simple approach for compressing networks, in some cases by more than 100 times.

The implementation is here: https://github.com/uber-research/intrinsic-dimension
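For intuition, here is a toy sketch of the subspace-training construction described in the abstract: all trainable coordinates live in a random d-dimensional subspace of the native D-dimensional parameter space (my own illustration, see the repository above for the real code; the least-squares toy task and the dimensions are arbitrary choices):

```python
# Minimal sketch of training in a random subspace of parameter space, the construction
# behind the intrinsic-dimension measurement (illustration only, not the authors' code).
# Assumes PyTorch; the toy task and the dimensions D and d are arbitrary.
import torch

D, d = 1000, 20                          # native parameter count and subspace dimension
theta0 = torch.zeros(D)                  # fixed starting point in the full space
P = torch.randn(D, d) / d ** 0.5         # fixed random projection matrix
z = torch.zeros(d, requires_grad=True)   # the only trainable coordinates

# Toy task: least-squares regression whose solution lives in the full D-dim space.
X = torch.randn(512, D)
y = X @ torch.randn(D)

opt = torch.optim.Adam([z], lr=1e-1)
for step in range(3000):
    theta = theta0 + P @ z               # map the d coordinates back to the D-dim space
    loss = ((X @ theta - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print(float(loss))  # sweep d upward; the smallest d at which solutions first appear
                    # is the intrinsic dimension of this objective landscape
```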




Friday, April 27, 2018

Quantized Compressive K-Means

Laurent, a long-time reader of Nuit Blanche and one of the speakers at the workshop on the Future of Random Projections II this coming Wednesday (you can register here, whether you are in Paris or not, so as to receive the link for the streaming), has just released a preprint on arXiv in this subject area:



The recent framework of compressive statistical learning aims at designing tractable learning algorithms that use only a heavily compressed representation, or sketch, of massive datasets. Compressive K-Means (CKM) is such a method: it estimates the centroids of data clusters from pooled, non-linear, random signatures of the learning examples. While this approach significantly reduces computational time on very large datasets, its digital implementation wastes acquisition resources because the learning examples are compressed only after the sensing stage. The present work generalizes the sketching procedure initially defined in Compressive K-Means to a large class of periodic nonlinearities including hardware-friendly implementations that compressively acquire entire datasets. This idea is exemplified in a Quantized Compressive K-Means procedure, a variant of CKM that leverages 1-bit universal quantization (i.e. retaining the least significant bit of a standard uniform quantizer) as the periodic sketch nonlinearity. Trading for this resource-efficient signature (standard in most acquisition schemes) has almost no impact on the clustering performance, as illustrated by numerical experiments.
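To fix ideas, here is a minimal sketch of the dataset-sketching step, contrasting the pooled complex exponentials of CKM with a 1-bit universal quantization in the spirit of QCKM (my own illustration, not the authors' code; the two-cluster toy data, the dither and the sketch size are arbitrary choices):

```python
# Minimal sketch of the dataset-sketching step (illustration only, not the authors' code).
# CKM pools complex exponentials of random projections of the samples; QCKM replaces the
# exponential with a 1-bit universal quantization, here written as a periodic +/-1 map
# standing in for the least significant bit of a dithered uniform quantizer.
import numpy as np

rng = np.random.default_rng(0)
n, dim, m = 10_000, 2, 256                            # samples, data dimension, sketch size
X = np.concatenate([rng.normal(loc=c, scale=0.3, size=(n // 2, dim))
                    for c in (-2.0, 2.0)])            # toy data: two clusters
Omega = rng.normal(size=(dim, m))                     # random frequencies
xi = rng.uniform(0, 2, size=m)                        # random dither, one per frequency

proj = X @ Omega                                      # random projections of every sample

# CKM sketch: pooled random Fourier features.
sketch_ckm = np.exp(1j * proj).mean(axis=0)

# QCKM-style sketch: pooled 1-bit universal quantization of the dithered projections.
sketch_qckm = np.sign(np.cos(np.pi * (proj + xi))).mean(axis=0)

print(sketch_ckm.shape, sketch_qckm.shape)            # two length-m summaries of the dataset
# Centroids are then recovered by matching the sketch of k candidate centroids to these
# pooled signatures (the greedy recovery step is omitted here).
```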






Thursday, April 26, 2018

Mini-workshop: The Future of Random Projections II, 1pm-4pm, May 2nd, 2018, Paris, France




Florent Krzakala and I are organizing a second mini-workshop on the Future of Random Projections, unoriginally titled "The Future of Random Projections II".

As data gets richer, the need to make sense of it is becoming paramount in many different areas. In this context of large-scale learning, Random Projections prove useful in a variety of unsupervised and supervised learning techniques. In this workshop, we will explore the different uses of this transform from the point of view of the several research areas featured in the talks.

We will be streaming it during the event, and the video will then be on YouTube. For those of you in Paris, it will take place on May 2nd, 2018 at IPGG. You can register here, whether you are in Paris or not, so as to receive the link for the streaming. The workshop is hosted by LightOn.


Streaming video:




Here are the four main speakers we will have. The event will start at 1:00 pm Paris time and should end on or before 4:00 pm.

1:00 pm - 1:30 pm: Laurent Jacques, UCLouvain
Title: "Time for dithering! Quantized random embeddings with RIP random matrices."

Abstract: Quantized compressive sensing (QCS) deals with the problem of coding compressive measurements of low-complexity signals (e.g., sparse vectors in a given basis, low-rank matrices) with quantized, finite precision representations, i.e., a mandatory process involved in any practical sensing model. While the resolution of this quantization clearly impacts the quality of signal reconstruction, there even exist incompatible combinations of quantization functions and sensing matrices that proscribe arbitrarily low reconstruction error when the number of measurements increases.

In this introductory talk, we will see that a large class of random matrix constructions, i.e., known to respect the restricted isometry property (RIP) in the compressive sensing literature, can be made "compatible" with a simple scalar and uniform quantizer (e.g., a rescaled rounding operation). This compatibility is simply ensured by the addition of a uniform random vector, or random "dithering", to the compressive signal measurements before quantization.

In this context, we will first study how the quantized, dithered random projection of "low-complexity" signals is actually an efficient dimensionality reduction technique that preserves the distances of low-complexity signals up to some controllable additive and multiplicative distortions. Second, the compatibility of RIP sensing matrices with the dithered quantization process will be demonstrated by the existence of (at least) one signal reconstruction method, the projected back projection (PBP), which achieves low reconstruction error, decaying when the number of measurements increases. Finally, by leveraging the quasi-isometry property reached by quantized, dithered random embeddings, we will show how basic signal classification (or clustering) can be realized from their QCS observations, i.e., without a reconstruction step. Here also the complexity, or intrinsic dimension, of the observed signals drives the final classification accuracy.
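As a small numerical illustration of the first point, here is a toy check that dithered, uniformly quantized Gaussian random projections approximately preserve pairwise distances (my own example, not the speaker's code; the quantizer resolution and the dimensions are arbitrary choices):

```python
# Toy check of the quasi-isometry discussed above (illustration only): dithered, uniformly
# quantized Gaussian random projections roughly preserve pairwise Euclidean distances.
import numpy as np

rng = np.random.default_rng(1)
n, m, delta = 64, 4096, 0.5
A = rng.normal(size=(m, n))                 # Gaussian sensing matrix (RIP w.h.p.)
xi = rng.uniform(0, delta, size=m)          # uniform random dither, drawn once and reused

def embed(v):
    # Dither, then apply a scalar uniform quantizer (a rescaled flooring operation).
    return delta * np.floor((A @ v + xi) / delta)

x = rng.normal(size=n)
y = rng.normal(size=n)

# For Gaussian rows, E|<a, x - y>| = sqrt(2/pi) * ||x - y||, and the common dither keeps
# the quantized differences unbiased, so this rescaled l1 average estimates the l2 distance.
est = np.sqrt(np.pi / 2) * np.abs(embed(x) - embed(y)).mean()
print(est, np.linalg.norm(x - y))           # the two numbers should be close
```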

1:30pm - 2:00pm  Julien Mairal, Inria Grenoble
Title: Foundations of Deep Learning from a Kernel Point of View.

Abstract: In the past years, deep neural networks such as convolutional or recurrent ones have become highly popular for solving various prediction problems, notably in computer vision and natural language processing. Conceptually close to approaches that were developed several decades ago, they greatly benefit from the large amounts of labeled data that have become available recently, allowing huge numbers of model parameters to be learned without worrying too much about overfitting. Before the resurgence of neural networks, non-parametric models based on positive definite kernels were one of the most dominant topics in machine learning. These approaches are still widely used today because of several attractive features. Kernel methods are indeed versatile; as long as a positive definite kernel is specified for the type of data considered (e.g., vectors, sequences, graphs, or sets), a large class of machine learning algorithms originally defined for linear models may be used. Kernel methods also admit natural mechanisms to control the learning capacity and reduce overfitting. In this talk, we will consider both paradigms and show how they are related. We will notably show that the reproducing kernel point of view allows one to derive theoretical results for classical convolutional neural networks.

2:00 pm - 2:10 pm: Short break

2:10pm - 2:40pm: Dmitry Ulyanov, Skoltech Institute
Title: Deep Image Prior

Abstract: Deep convolutional networks have become a popular tool for image generation and restoration. Generally, their excellent performance is imputed to their ability to learn realistic image priors from a large number of example images. In this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning. In order to do so, we show that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, super-resolution, and inpainting. Furthermore, the same prior can be used to invert deep neural representations to diagnose them and to restore images based on flash-no flash input pairs. Apart from its diverse applications, our approach highlights the inductive bias captured by standard generator network architectures. It also bridges the gap between two very popular families of image restoration methods: learning-based methods using deep convolutional networks and learning-free methods based on handcrafted image priors such as self-similarity.
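Here is a minimal sketch of the idea applied to denoising, with a far smaller network than in the paper (my own illustration, not the authors' implementation; the architecture, step count and noise level are arbitrary, and early stopping is what keeps the network from fitting the noise):

```python
# Minimal sketch of the deep-image-prior idea for denoising (illustration only):
# fit a randomly initialized generator to the noisy image and stop early; the network
# architecture itself acts as the prior. Assumes PyTorch.
import torch
import torch.nn as nn

clean = torch.rand(1, 1, 64, 64)                   # stand-in for a clean image
noisy = clean + 0.1 * torch.randn_like(clean)      # observed, corrupted image

net = nn.Sequential(                               # small untrained conv generator
    nn.Conv2d(8, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
)
z = torch.randn(1, 8, 64, 64)                      # fixed random input code

opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for step in range(300):                            # stop early: run long enough and
    out = net(z)                                   # the network fits the noise too
    loss = ((out - noisy) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

denoised = net(z).detach()                         # take the output as the restoration
print(float(((denoised - clean) ** 2).mean()),     # reconstruction error vs. the
      float(((noisy - clean) ** 2).mean()))        # noise level of the observation
```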

2:40pm - 3:10pm: Kurt Cutajar, EURECOM
Title: “Random Feature Expansions for Deep Gaussian Processes”

Abstract: The widespread application of machine learning in safety-critical domains such as medical diagnosis and autonomous driving has sparked a renewed interest in probabilistic models which produce principled uncertainty estimates alongside predictions. The composition of multiple Gaussian processes as a deep Gaussian process (DGP) enables a deep probabilistic nonparametric approach to flexibly tackle complex machine learning problems with sound quantification of uncertainty. However, traditional inference approaches for DGP models have limited scalability and are notoriously cumbersome to construct. Inspired by recent advances in the field of Bayesian deep learning, in this talk I shall present an alternative formulation of DGPs based on random feature expansions. This yields a practical learning framework which significantly advances the state-of-the-art in inference for DGPs, and enables accurate quantification of uncertainty. The scalability and performance of our proposal is showcased on several datasets with up to 8 million observations, and various DGP architectures with up to 30 hidden layers.
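As a pointer to the building block behind the talk, here is a minimal sketch of a random Fourier feature expansion of the RBF kernel (my own illustration, not the speaker's framework; the lengthscale and the number of features are arbitrary choices). Stacking such expansions, with trainable weights on top of each one, is what turns the construction into a deep Gaussian process approximation:

```python
# Minimal sketch of a random Fourier feature expansion of the RBF kernel (illustration
# only). phi(x) . phi(x') approximates exp(-||x - x'||^2 / (2 * lengthscale^2)).
import numpy as np

rng = np.random.default_rng(0)
d, D, lengthscale = 5, 2000, 1.0

W = rng.normal(scale=1.0 / lengthscale, size=(d, D))   # spectral frequencies of the kernel
b = rng.uniform(0, 2 * np.pi, size=D)                  # random phases

def features(X):
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = rng.normal(size=(3, d))
K_exact = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1) / (2 * lengthscale ** 2))
K_approx = features(X) @ features(X).T
print(np.abs(K_exact - K_approx).max())   # small; the approximation improves as D grows
```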

3:10pm - 4:00pm Coffee break.


Image credit: Rich Baraniuk


Friday, April 20, 2018

Videos: Computational Theories of the Brain, Simons Institute for the Theory of Computing



Monday, April 16th, 2018
8:30 am – 8:50 am: Coffee and Check-In
8:50 am – 9:00 am: Opening Remarks
9:00 am – 9:45 am: The Prefrontal Cortex as a Meta-Reinforcement Learning System, Matthew Botvinick, DeepMind Technologies Limited, London and University College London
9:45 am – 10:30 am: Working Memory Influences Reinforcement Learning Computations in Brain and Behavior, Anne Collins, UC Berkeley
10:30 am – 11:00 am: Break
11:00 am – 11:45 am: Predictive Coding Models of Perception, David Cox, Harvard University
11:45 am – 12:30 pm: TBA, Sophie Denève, Ecole Normale Supérieure
12:30 pm – 2:30 pm: Lunch
2:30 pm – 3:15 pm: Towards Biologically Plausible Deep Learning: Early Inference in Energy-Based Models Approximates Back-Propagation, Asja Fischer, University of Bonn
3:15 pm – 4:00 pm: Neural Circuitry Underlying Working Memory in the Dorsolateral Prefrontal Cortex, Veronica Galvin, Yale University
4:00 pm – 5:00 pm: Reception

Tuesday, April 17th, 2018
8:30 am – 9:00 am: Coffee and Check-In
9:00 am – 9:45 am: TBA, Surya Ganguli, Stanford University
9:45 am – 10:30 am: Does the Neocortex Use Grid Cell-Like Mechanisms to Learn the Structure of Objects?, Jeff Hawkins, Numenta
10:30 am – 11:00 am: Break
11:00 am – 11:45 am: Dynamic Neural Network Structures Through Stochastic Rewiring, Robert Legenstein, Graz University of Technology
11:45 am – 12:30 pm: Backpropagation and Deep Learning in the Brain, Timothy Lillicrap, DeepMind Technologies Limited, London
12:30 pm – 2:30 pm: Lunch
2:30 pm – 3:15 pm: An Algorithmic Theory of Brain Networks, Nancy Lynch, Massachusetts Institute of Technology
3:15 pm – 4:00 pm: Networks of Spiking Neurons Learn to Learn and Remember, Wolfgang Maass, Graz University of Technology
4:00 pm – 4:30 pm: Break
4:30 pm – 5:30 pm: Plenary Discussion: What Is Missing in Current Theories of Brain Computation?

Wednesday, April 18th, 2018
8:30 am – 9:00 am: Coffee and Check-In
9:00 am – 9:45 am: Functional Triplet Motifs Underlie Accurate Predictions of Single-Trial Responses in Populations of Tuned and Untuned V1 Neurons, Jason MacLean, University of Chicago
9:45 am – 10:30 am: The Sparse Manifold Transform, Bruno Olshausen, UC Berkeley
10:30 am – 11:00 am: Break
11:00 am – 11:45 am: Playing Newton: Automatic Construction of Phenomenological, Data-Driven Theories and Models, Ilya Nemenman, Emory University
11:45 am – 12:30 pm: A Functional Classification of Glutamatergic Circuits in Cortex and Thalamus, S. Murray Sherman, University of Chicago
12:30 pm – 2:30 pm: Lunch
2:30 pm – 3:15 pm: On the Link Between Energy & Information for the Design of Neuromorphic Systems, Narayan Srinivasa, Eta Compute
3:15 pm – 4:00 pm: Neural Circuit Representation of Multiple Cognitive Tasks: Clustering and Compositionality, XJ Wang, New York University
4:00 pm – 4:30 pm: Break
4:30 pm – 5:30 pm: Plenary Discussion: How Can One Test/Falsify Current Theories of Brain Computation?

Thursday, April 19th, 2018
8:30 am – 9:00 am: Coffee and Check-In
9:00 am – 9:45 am: Control of Synaptic Plasticity in Deep Cortical Networks, Pieter Roelfsema, University of Amsterdam
9:45 am – 10:30 am: Computation with Assemblies, Christos Papadimitriou, Columbia University
10:30 am – 11:00 am: Break
11:00 am – 11:45 am: Capacity of Neural Networks for Lifelong Learning of Composable Tasks, Les Valiant, Harvard University
11:45 am – 12:30 pm: An Integrated Cognitive Architecture, Greg Wayne, Columbia University

Tuesday, April 17, 2018

Revisiting Skip-Gram Negative Sampling Model With Regularization



Matt just sent me the following

Hi Igor  
I would like to point you to our recent paper on the arXiv: Revisiting Skip-Gram Negative Sampling Model With Regularization (https://arxiv.org/pdf/1804.00306.pdf), which essentially deals with one specific low-rank matrix factorization model.  
The abstract is as follows:
We revisit skip-gram negative sampling (SGNS), a popular neural-network-based approach to learning distributed word representations. We first point out the ambiguity issue undermining the SGNS model, in the sense that the word vectors can be entirely distorted without changing the objective value. To resolve this issue, we rectify the SGNS model with quadratic regularization. A theoretical justification, which provides a novel insight into quadratic regularization, is presented. Preliminary experiments are also conducted on Google's analogical reasoning task to support the modified SGNS model.
Your opinion will be much appreciated! 
Thanks, Matt Mu
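For readers who want to see the modification concretely, here is a minimal sketch of the SGNS objective with a quadratic (L2) penalty on both embedding tables, in the spirit of the paper (my own illustration, not the authors' code; vocabulary size, dimension, the number of negatives and the regularization weight are arbitrary, and the unigram noise distribution is replaced by uniform sampling for brevity):

```python
# Minimal sketch of skip-gram negative sampling (SGNS) with quadratic regularization
# (illustration only, not the authors' code). Assumes PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, dim, k, lam = 5000, 100, 5, 1e-4      # vocabulary, dimension, negatives, L2 weight
word_emb = nn.Embedding(vocab, dim)          # "input" word vectors
ctx_emb = nn.Embedding(vocab, dim)           # "output" context vectors
opt = torch.optim.Adam(list(word_emb.parameters()) + list(ctx_emb.parameters()), lr=1e-3)

def sgns_loss(center, context):
    # center, context: LongTensors of shape (batch,) holding observed (word, context) pairs.
    w = word_emb(center)                                   # (batch, dim)
    c_pos = ctx_emb(context)                               # (batch, dim)
    neg = torch.randint(0, vocab, (center.size(0), k))     # uniform negatives for brevity
    c_neg = ctx_emb(neg)                                   # (batch, k, dim)

    pos = F.logsigmoid((w * c_pos).sum(-1))                # log sigma(w . c_pos)
    negs = F.logsigmoid(-(c_neg @ w.unsqueeze(-1)).squeeze(-1)).sum(-1)
    nll = -(pos + negs).mean()

    # Quadratic (L2) penalty on both embedding tables, resolving the ambiguity
    # pointed out in the abstract.
    reg = lam * (word_emb.weight.pow(2).sum() + ctx_emb.weight.pow(2).sum())
    return nll + reg

center = torch.randint(0, vocab, (32,))
context = torch.randint(0, vocab, (32,))
loss = sgns_loss(center, context)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```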

Monday, April 16, 2018

Tonight/Today: Paris Machine Learning Meetup Hors Série #4, Season 5: Canada and AI



This is an exceptional Hors Série meetup organized jointly with the Embassy of Canada in France. We will be hosted in the offices of Xebia (thanks to them and to their dataXday event!). The meetup will start at 7:00 pm, with doors opening around 6:30 pm. The streaming video is here (the presentations will be posted on this page before the meetup).





Here is the technical program:


Details of the four technical presentations:

Title: “How we solve Poker”
SPEAKER: Prof. Mike Bowling

Cepheus is our new poker-playing program capable of playing a nearly perfect game of heads-up limit Texas hold'em. It is so close to perfect that even after an entire human lifetime of playing against it, you couldn't be statistically certain it wasn't perfect. We call such a game essentially solved. This work just appeared in Science. You can read the paper. You can query Cepheus about how it plays and play against it. Or you can read the many news articles on the result. Site: http://poker.srv.ualberta.ca/

SPEAKER : Vadim Bulitko, Associate Professor at the University of Alberta, Department of Computing Science

ABSTRACT: Artificial Intelligence is rapidly entering our daily life in the form of smartphone assistants, self-driving cars, etc. While such AI assistants can make our lives easier and safer, there is a growing interest in understanding how long they will remain our intellectual servants. With the powerful applications of self-training and self-learning (e.g., the recent work by DeepMind on self-learning to play several board games at a championship level), what behaviors will such self-learning AI agents learn? Will there be genuine knowledge discoveries made by them? How much understanding of their novel behavior will we, as humans, be able to gather?
This project builds on our group's 12 years of expertise in developing AI agents that learn in a real-time setting and takes a step towards investigating the grand yet pressing questions listed above. We are developing a video-game-like testbed in which we allow our AI agents to evolve over time and learn from their life experience. The agents use genetically encoded deep neural networks to represent behaviors and pass them on to their offspring in the simulated evolution. A separate deep neural network is then trained to watch the simulation and flag the emergence of any unusual behaviours. We expect to study the emergence of novel behaviors such as the development of friend-foe identification techniques, simple forms of communication, apprenticeship learning and others.
site: http://agi-lab.net

SPEAKER: Martin Müller, Computing Science, University of Alberta

ABSTRACT: I will give a brief overview of recent work in my research group. While the applications are diverse and range from games and Monte Carlo Tree Search to SAT solving, a common goal drives much of the work: to better understand the use of exploration in very large search spaces.
Site: https://webdocs.cs.ualberta.ca/~mmueller/




