Nuit Blanche: 10/01/2017

Tuesday, October 31, 2017

Linear Dimensionality Reduction in Linear Time: Johnson-Lindenstrauss-type Guarantees for Random Subspace / Structure-aware error bounds for linear classification with the zero-one loss

Bob sent me the following:

Hi Igor,

I have a couple of recent reports posted to the arxiv, which may be of interest to you? One is a data-dependent Johnson-Lindenstrauss type result for feature subsampling, the other is some theory for linear classifiers and randomly-projected linear classifiers.
Links follow:
JLL: https://arxiv.org/abs/1705.06408
Learning theory: https://arxiv.org/abs/1709.09782

Hope you like them.

ATB Bob

Thanks Bob !

Linear Dimensionality Reduction in Linear Time: Johnson-Lindenstrauss-type Guarantees for Random Subspace by Nick Lim, Robert J. Durrant

We consider the problem of efficient randomized dimensionality reduction with norm-preservation guarantees. Specifically we prove data-dependent Johnson-Lindenstrauss-type geometry preservation guarantees for Ho's random subspace method: When data satisfy a mild regularity condition -- the extent of which can be estimated by sampling from the data -- then random subspace approximately preserves the Euclidean geometry of the data with high probability. Our guarantees are of the same order as those for random projection, namely the required dimension for projection is logarithmic in the number of data points, but have a larger constant term in the bound which depends upon this regularity. A challenging situation is when the original data have a sparse representation, since this implies a very large projection dimension is required: We show how this situation can be improved for sparse binary data by applying an efficient `densifying' preprocessing, which neither changes the Euclidean geometry of the data nor requires an explicit matrix-matrix multiplication. We corroborate our theoretical findings with experiments on both dense and sparse high-dimensional datasets from several application domains.
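To make the idea concrete, here is a minimal sketch of random subspace (keeping a random subset of the original coordinates), assuming a sqrt(d/k) rescaling so that squared norms are right in expectation. The function names, the synthetic data and the sizes are illustrative assumptions, not taken from the paper, and the 'densifying' preprocessing mentioned in the abstract is not reproduced here.

```python
import math
import random


def random_subspace(points, k, rng):
    """Keep the same k randomly chosen coordinates of every point.

    The sqrt(d / k) rescaling makes squared norms correct in expectation.
    """
    d = len(points[0])
    kept = rng.sample(range(d), k)  # feature indices, drawn without replacement
    scale = math.sqrt(d / k)
    return [[scale * p[j] for j in kept] for p in points]


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_ratios(original, reduced):
    """Ratio of reduced to original distance for every pair of points."""
    n = len(original)
    return [
        distance(reduced[i], reduced[j]) / distance(original[i], original[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


rng = random.Random(0)
d, k, n = 2000, 400, 20

# Dense synthetic data: every coordinate carries a similar share of the norm.
dense = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
ratios = distance_ratios(dense, random_subspace(dense, k, rng))
print(f"dense data: ratios in [{min(ratios):.3f}, {max(ratios):.3f}]")

# Sparse data: a one-hot point keeps its norm only if its coordinate is sampled.
one_hot = [[1.0 if j == 7 else 0.0 for j in range(d)]]
sparse_norm = distance(random_subspace(one_hot, k, rng)[0], [0.0] * k)
print("one-hot norm after subsampling:", sparse_norm)
```

On the dense data the pairwise distance ratios stay close to 1, while the one-hot point ends up with norm either 0 or sqrt(d/k), which is the kind of failure on sparse data that the abstract describes as the challenging case.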

Structure-aware error bounds for linear classification with the zero-one loss by Ata Kaban, Robert J. Durrant

We prove risk bounds for binary classification in high-dimensional settings when the sample size is allowed to be smaller than the dimensionality of the training set observations. In particular, we prove upper bounds for both 'compressive learning' by empirical risk minimization (ERM) (that is when the ERM classifier is learned from data that have been projected from high-dimensions onto a randomly selected low-dimensional subspace) as well as uniform upper bounds in the full high-dimensional space. A novel tool we employ in both settings is the 'flipping probability' of Durrant and Kaban (ICML 2013) which we use to capture benign geometric structures that make a classification problem 'easy' in the sense of demanding a relatively low sample size for guarantees of good generalization. Furthermore our bounds also enable us to explain or draw connections between several existing successful classification algorithms. Finally we show empirically that our bounds are informative enough in practice to serve as the objective function for learning a classifier (by using them to do so).
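As a rough illustration of the 'flipping probability', the sketch below estimates by Monte Carlo how often a Gaussian random projection R changes the sign of the dot product between a classifier normal h and a point x. The Gaussian projection, the helper names and the chosen angles are assumptions made for this example; the paper's exact definitions and bounds are not reproduced.

```python
import math
import random


def flip_probability(h, x, k, trials, rng):
    """Monte Carlo estimate of P[sign((Rh).(Rx)) != sign(h.x)], R a k x d Gaussian matrix."""
    original = sum(a * b for a, b in zip(h, x))
    flips = 0
    for _ in range(trials):
        projected = 0.0
        for _ in range(k):  # draw R one row at a time
            row = [rng.gauss(0.0, 1.0) for _ in h]
            projected += sum(r * a for r, a in zip(row, h)) * sum(r * b for r, b in zip(row, x))
        if projected * original < 0:
            flips += 1
    return flips / trials


def unit_vector_at(angle_degrees, d):
    """Unit vector making the given angle with the first coordinate axis."""
    theta = math.radians(angle_degrees)
    return [math.cos(theta), math.sin(theta)] + [0.0] * (d - 2)


rng = random.Random(0)
d = 10
h = unit_vector_at(0, d)  # classifier normal
for angle in (20, 80):
    for k in (1, 5):
        p = flip_probability(h, unit_vector_at(angle, d), k, trials=2000, rng=rng)
        print(f"angle={angle:2d} deg, k={k}: estimated flip probability {p:.3f}")
```

Points that make a small angle with the classifier normal (a large margin) should flip rarely, which is one way to read the "benign geometric structure" mentioned in the abstract.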

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Tuesday, October 24, 2017

Thesis: Sketching for Large-Scale Learning of Mixture Models by Nicolas Keriven

Congratulations Dr. Keriven ! We've featured some of his work before...

but now we have the whole thesis and Nicolas tells me there is more good stuff in it. Woohoo !

Sketching for Large-Scale Learning of Mixture Models by Nicolas Keriven

Automatic learning processes are becoming ubiquitous in many domains of science. However, nowadays databases commonly comprise millions or billions of elements, which challenge traditional learning methods. Furthermore, modern database architectures involve new difficulties: data may be seen once then discarded (a situation usually referred to as data stream), often databases are not stored in one single location but distributed across several storage places and it is undesirable to gather the whole database in one place for the sake of privacy and robustness to malicious attacks. It has thus become necessary to derive learning procedures that are amenable to very large databases, and to distributed and streaming computing. A popular idea is to define an intermediary compressed representation of a database, which is fast to compute, adapted to streaming and distributed computing through update and merge mechanisms, preserves data privacy, and such that the desired learning task can be performed using only this compressed representation, with a computational complexity that is greatly reduced compared to using the full database. A popular class of such representations is called linear sketches: the whole database is compressed into a single fixed-size vector called sketch, such that the sketch of the union of two databases is the sum of their sketches. Because of this property it is obvious that linear sketches are particularly convenient for streaming, distributed and parallel computing. In [BGP13; BGP15], Bourrier et al. introduced a learning method based on a linear sketch formed by a random sampling of the empirical characteristic function of a collection of multidimensional vectors. They showed empirically that it was possible to fit a Gaussian Mixture Model (GMM) with fixed identity covariance on the original data, using only its sketch.
However, the method was restricted to GMMs with identity covariance, and theoretical justifications were still an open question. Extending this method to other models and providing a theoretical analysis of the approach is the main purpose of this thesis work. To do so, we develop an original framework based on several different sets of mathematical tools. The expression of the sketching operator is formalized by combining kernel mean embedding, which allows one to define tunable Hilbertian metrics on the set of probability distributions, with Random Feature expansions, that approximate the infinite-dimensional mapping associated with a kernel function by a finite-dimensional mapping designed randomly. Using this mathematical framework, we analyze the sketching method under the lens of Compressive Sensing, which states that any signal that is in some sense less complex than the ambient dimension can be successfully compressed and estimated. We adapt classic proofs for finite-dimensional settings to our generalized infinite-dimensional framework. We provide guarantees for many problems, including for that of estimating mixtures of multivariate elliptic α-stable distributions from a sketch, for which no estimator was known. We particularly extend the framework and relate it to more traditional learning in two cases: first when recovering centroids from a sketch for the k-means or k-medians problem, and for GMM estimation with known covariance. We introduce a flexible heuristic greedy algorithm coined Compressive Learning - Orthogonal Matching Pursuit with Replacement (CL-OMPR) that can estimate any parametric mixture model from any sketch in a very wide variety of situations. Experiments are performed on real and synthetic data for three models.
First, mixtures of Diracs, for which our approach is shown to be more efficient and more stable than k-means on large databases; second, GMMs with unknown diagonal covariances, where the proposed approach is seen to be faster and lighter than classic Expectation Maximization (EM). And, finally, mixtures of multivariate elliptic α-stable distributions, where our approach is the first viable algorithm of which we are aware that can perform this task.

Summary: Modern databases are very large, sometimes split and distributed across several storage locations, or take the form of data streams: this raises major new challenges for statistical learning methods. One of the recent methods able to adapt to these situations consists of first compressing the data into a structure called a linear sketch, then performing the learning task using only this sketch, which is extremely fast if the sketch is small. In this thesis, we define such a method to estimate a mixture model of probability distributions from the data, using only a sketch of them. This sketch is defined by drawing on several notions from the field of kernel methods: kernel mean embedding and random kernel approximations. Defined in this way, the sketch corresponds to linear measurements of the probability distribution underlying the data. We therefore analyze the problem using tools from the field of compressed sensing, in which a signal is measured randomly without loss of information, under certain conditions. We extend some compressed sensing results to the infinite-dimensional setting, give generic conditions guaranteeing the success of our mixture model estimation method, and apply them to several problems, notably that of estimating mixtures of multivariate stable distributions, for which no estimator existed to date. Our analysis is based on the construction of randomly designed sketching operators that satisfy, with high probability, a Restricted Isometry Property in the Banach space of finite signed measures. In a second part, we introduce a greedy algorithm capable of heuristically estimating a mixture model from a linear sketch.
This algorithm is applied, on simulated and real data, to three problems: estimating meaningful centers in the data, for which the sketching method is found to be significantly faster than a classic k-means algorithm; estimating Gaussian mixtures, for which it is faster than an Expectation-Maximization algorithm; and finally estimating mixtures of multivariate stable distributions, for which, to our knowledge, no algorithm capable of performing such a task existed to date.
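The merge property of linear sketches described in the abstract can be illustrated with a few lines of code. This is a minimal sketch, assuming an unnormalized sketch made of sums of complex exponentials at random frequencies (samples of the empirical characteristic function once divided by the number of points). The Gaussian frequency distribution, the sizes and all function names are assumptions for illustration; the thesis' actual frequency design and the CL-OMPR decoder are not reproduced.

```python
import cmath
import random


def draw_frequencies(m, dim, rng):
    """m random frequency vectors; the Gaussian law used here is an illustrative choice."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]


def sketch(points, frequencies):
    """Unnormalized sketch: for each frequency w, the sum over points x of exp(i <w, x>)."""
    return [
        sum(cmath.exp(1j * sum(wj * xj for wj, xj in zip(w, x))) for x in points)
        for w in frequencies
    ]


def merge(sketch_a, sketch_b):
    """Sketch of the union of two databases: the entrywise sum of their sketches."""
    return [a + b for a, b in zip(sketch_a, sketch_b)]


rng = random.Random(0)
frequencies = draw_frequencies(m=8, dim=3, rng=rng)
database_a = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(100)]
database_b = [[rng.gauss(2.0, 1.0) for _ in range(3)] for _ in range(50)]

merged = merge(sketch(database_a, frequencies), sketch(database_b, frequencies))
direct = sketch(database_a + database_b, frequencies)
gap = max(abs(u - v) for u, v in zip(merged, direct))
print("merged vs. direct sketch, largest gap:", gap)

# Dividing by the number of points gives samples of the empirical characteristic function.
n_total = len(database_a) + len(database_b)
empirical_cf = [value / n_total for value in merged]
```

Because each database only contributes a fixed-size vector, two machines (or two chunks of a stream) can sketch their own data and exchange the sketches rather than the data.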


Wednesday, October 18, 2017

Tonight: Paris Machine Learning Meetup #2 Season 5: Reinforcement Learning for Malware detection, Fraud detection, video scene detection and Climate Change

Tonight, we will have the second Paris Machine Learning meetup of Season 5. We'll talk about Reinforcement Learning for Malware detection, Fraud detection, video scene detection and Climate Change. The video streaming is here:

Thanks to Artefact for hosting and sponsoring us. They will provide the room, food and beverages. The room seats about 90 people. As usual, admission is first come, first served, then the doors close.

Schedule :

6:45PM doors open / 7-9:00PM talks / 9-10:00PM drinks/food / 10:00PM end

Presentations (the blog post should have the slides before the meetup happens)

Philippe Rolet, (Artefact) , Short presentation of Artefact

Hyrum Anderson, (Endgame) -Remote-, Reinforcement Learning for Evading Machine Learning Malware Detection

Machine learning is a popular approach to signatureless malware detection because it can generalize to never-before-seen malware families and polymorphic strains. This has resulted in its practical use for either primary detection engines or for supplementary heuristic detection by anti-malware vendors. Recent work in adversarial machine learning has shown that deep learning models are susceptible to gradient-based attacks, whereas non-differentiable models that report a score can be attacked by genetic algorithms that aim to systematically reduce the score.
We propose a more general framework based on reinforcement learning (RL) for attacking static portable executable (PE) anti-malware engines. The general framework does not require a differentiable model nor does it require the engine to produce a score. Instead, an RL agent is equipped with a set of functionality-preserving operations that it may perform on the PE file. This enables completely black-box attacks against static PE anti-malware, and produces functional (with practical caveats) evasive malware samples as a direct result.

Olivier Risser-Maroix, Detection of Opening Scenes in Series on video frames

The purpose of this project is to find a method to automatically detect opening sequences in a set of episodes of a TV show. In fact, when users are watching comics series, skipping the opening credits can save up to 10% of viewing time. Furthermore, in very short episodes watched one after another, the credits quickly become irritating. This presentation shows how the basics of anomaly detection and perceptual hashing of video sequences can lead to a fast, effective system able to detect a part common to several episodes.
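For readers unfamiliar with perceptual hashing, here is a minimal sketch of one simple variant (an "average hash") used to find frames shared by two episodes. The synthetic 8x8 grayscale frames, the noise model, the Hamming threshold and the function names are all assumptions for illustration; this is not the speaker's actual pipeline.

```python
import random


def average_hash(frame):
    """One bit per pixel, set when the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(p > mean for p in pixels)


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def shared_frames(episode_a, episode_b, max_distance=10):
    """Indices of frames in episode_a whose hash is close to some frame of episode_b."""
    hashes_b = [average_hash(f) for f in episode_b]
    shared = []
    for i, frame in enumerate(episode_a):
        h = average_hash(frame)
        if any(hamming(h, other) <= max_distance for other in hashes_b):
            shared.append(i)
    return shared


def random_frame(rng, size=8):
    return [[rng.randrange(256) for _ in range(size)] for _ in range(size)]


def with_noise(frame, rng, amplitude=2):
    """Simulate re-encoding by perturbing every pixel slightly."""
    return [[min(255, max(0, p + rng.randint(-amplitude, amplitude))) for p in row] for row in frame]


rng = random.Random(0)
opening = [random_frame(rng) for _ in range(4)]

# Synthetic 8x8 grayscale "episodes": unrelated content around a shared opening.
episode_a = ([random_frame(rng) for _ in range(5)]
             + [with_noise(f, rng) for f in opening]
             + [random_frame(rng) for _ in range(5)])
episode_b = ([random_frame(rng) for _ in range(2)]
             + [with_noise(f, rng) for f in opening]
             + [random_frame(rng) for _ in range(8)])

print(shared_frames(episode_a, episode_b))
```

Real frames would first be converted to grayscale and downscaled to a small grid; a run of consecutive matching frames is then a candidate opening sequence.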

Olivier Corradi, Building a sustainable Tomorrow with data and AI (Tomorrow)

Climate Change is the biggest challenge of our time, as the overwhelming majority of our daily activities are still based on processes that release greenhouse gases, thus causing climate change.

Olivier will present the mission of Tomorrow, which is to quantify, and make widely accessible, the climate impact of the daily choices we make.

He will present their first initiative: electricitymap.org

Robin Lespes, Gill Morisse, Romain De San Nicolas, Selection Bias in Fraud detection (Quantmetry)

Companies seek new techniques to face innovative fraud attempts. Artificial intelligence provides solutions that detect weak signals and exploit complex correlations between large numbers of features. But the resulting decisions modify fraud behaviours and the nature of the data generated. This implies a selection bias, which degrades the algorithms' training data. We faced this new challenge and will focus on it by presenting the so-called "inference of the refused" (reject inference) techniques.


Tuesday, October 17, 2017

Compressed Sensing MRI Reconstruction with Cyclic Loss in Generative Adversarial Networks

Reconstruction solvers requiring too much time to reconstruct images or videos is nothing new, as pointed out four years ago (Sunday Morning Insight: Faster Than a Blink of an Eye). Last week, we saw another example of that in DiffuserCam: Lensless Single-exposure 3D Imaging where one could read:

A typical reconstruction requires at least 200 iterations. Solving for 2048 × 2048 × 128 = 537 million voxels takes 26 minutes (8 seconds per iteration) on a 144-core workstation and requires 85 Gigabytes of RAM. A smaller reconstruction (512 × 512 × 128 = 33.5 million voxels) takes 3 minutes (1 second per iteration) on a 4-core laptop with 16 Gigabytes of RAM.
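The figures in that quote are easy to check with a few lines of arithmetic. This is only a sanity check, assuming exactly 200 iterations (the stated minimum); the helper names are mine.

```python
def voxel_count(nx, ny, nz):
    return nx * ny * nz


def reconstruction_minutes(iterations, seconds_per_iteration):
    return iterations * seconds_per_iteration / 60


# Figures quoted above; 200 is the stated minimum number of iterations.
for shape, seconds in (((2048, 2048, 128), 8), ((512, 512, 128), 1)):
    voxels = voxel_count(*shape)
    minutes = reconstruction_minutes(200, seconds)
    print(f"{shape}: {voxels / 1e6:.1f} million voxels, {minutes:.1f} minutes for 200 iterations")
```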

Hence, as a few folks have noted, it is becoming interesting to see whether there is a way to learn reconstruction solvers (see the recent Learning to Invert: Signal Recovery via Deep Convolutional Networks) so that reconstruction can take seconds or less rather than hours. Here is a new and interesting example of that using GANs.

Compressed Sensing MRI Reconstruction with Cyclic Loss in Generative Adversarial Networks by Tran Minh Quan, Thanh Nguyen-Duc, Won-Ki Jeong

Compressed Sensing MRI (CS-MRI) has provided theoretical foundations upon which the time-consuming MRI acquisition process can be accelerated. However, it primarily relies on iterative numerical solvers which still hinders their adaptation in time-critical applications. In addition, recent advances in deep neural networks have shown their potential in computer vision and image processing, but their adaptation to MRI reconstruction is still in an early stage. In this paper, we propose a novel deep learning-based generative adversarial model, RefineGAN, for fast and accurate CS-MRI reconstruction. The proposed model is a variant of fully-residual convolutional autoencoder and generative adversarial networks (GANs), specifically designed for CS-MRI formulation; it employs deeper generator and discriminator networks with cyclic data consistency loss for faithful interpolation in the given under-sampled k-space data. In addition, our solution leverages a chained network to further enhance the reconstruction quality. RefineGAN is fast and accurate - the reconstruction process is extremely rapid, as low as tens of milliseconds for reconstruction of a 256 x 256 image, because it is one-way deployment on a feedforward network, and the image quality is superior even for extremely low sampling rate (as low as 10%) due to the data-driven nature of the method. We demonstrate that RefineGAN outperforms the state-of-the-art CS-MRI methods by a large margin in terms of both running time and image quality via evaluation using several open-source MRI databases.
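To give a feel for what "data consistency" with under-sampled k-space means, here is a minimal 1-D sketch of a generic hard data-consistency step: the frequencies that were actually measured are written back into an estimate. This is not RefineGAN's cyclic loss or architecture; the toy signal, the sampling mask, the naive DFT and the function names are assumptions for illustration only.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for a toy signal)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = [
        sum(values[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out


def enforce_data_consistency(estimate, measured_kspace, mask):
    """Overwrite the sampled frequencies of `estimate` with the measured values."""
    kspace = dft(estimate)
    merged = [m if sampled else k for k, m, sampled in zip(kspace, measured_kspace, mask)]
    return dft(merged, inverse=True)


truth = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0, 0.5, 0.0]
mask = [True, True, False, False, False, False, False, True]  # sampled k-space entries
measured = [v if sampled else 0.0 for v, sampled in zip(dft(truth), mask)]

rough_estimate = [0.0, 1.2, 2.5, 2.2, 0.1, -0.8, 0.3, 0.0]  # stand-in for a network output
corrected = enforce_data_consistency(rough_estimate, measured, mask)

# After the correction, the sampled frequencies agree with the measurements.
residual = max(abs(c - m) for c, m, s in zip(dft(corrected), measured, mask) if s)
print(f"largest mismatch on sampled frequencies: {residual:.2e}")
```

A learned reconstruction only has to fill in the unsampled frequencies; the measured ones can always be restored exactly, which is what keeps the output faithful to the acquisition.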


Thursday, October 12, 2017

PhD Thesis: Efficient Methods and Hardware for Deep Learning by Song Han

Congratulations Dr. Han !

The thesis is listed below but many of its themes are part of lecture 15 of Stanford CS 231n's Convolutional Neural Networks for Visual Recognition (Spring 2017). Enjoy !

Efficient Methods and Hardware for Deep Learning by Song Han

The future will be populated with intelligent devices that require inexpensive, low-power hardware platforms. Deep neural networks have evolved to be the state-of-the-art technique for machine learning tasks. However, these algorithms are computationally intensive, which makes it difficult to deploy on embedded devices with limited hardware resources and a tight power budget. Since Moore's law and technology scaling are slowing down, technology alone will not address this issue. To solve this problem, we focus on efficient algorithms and domain-specific architectures specially designed for the algorithm. By performing optimizations across the full stack from application through hardware, we improved the efficiency of deep learning through smaller model size, higher prediction accuracy, faster prediction speed, and lower power consumption. Our approach starts by changing the algorithm, using "Deep Compression" that significantly reduces the number of parameters and computation requirements of deep learning models by pruning, trained quantization, and variable length coding. "Deep Compression" can reduce the model size by 18x to 49x without hurting the prediction accuracy. We also discovered that pruning and the sparsity constraint not only applies to model compression but also applies to regularization, and we proposed dense-sparse-dense training (DSD), which can improve the prediction accuracy for a wide range of deep learning models. To efficiently implement "Deep Compression" in hardware, we developed EIE, the "Efficient Inference Engine", a domain-specific hardware accelerator that performs inference directly on the compressed model which significantly saves memory bandwidth. Taking advantage of the compressed model, and being able to deal with the irregular computation pattern efficiently, EIE improves the speed by 13x and energy efficiency by 3,400x over GPU.
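Two of the ingredients named in the abstract, pruning and weight sharing through quantization, can be sketched in a few lines. This is a toy illustration on a flat list of random weights, assuming magnitude-based pruning and a plain 1-D k-means codebook; the retraining steps, the fine-tuning of the codebook ("trained" quantization) and the variable-length coding stage are left out, and all names and parameters are hypothetical.

```python
import random


def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def kmeans_1d(values, n_clusters, iterations=20):
    """Plain 1-D k-means with linearly spaced initial centroids (needs n_clusters >= 2)."""
    low, high = min(values), max(values)
    centroids = [low + (high - low) * c / (n_clusters - 1) for c in range(n_clusters)]
    for _ in range(iterations):
        buckets = [[] for _ in centroids]
        for v in values:
            nearest = min(range(n_clusters), key=lambda c: abs(v - centroids[c]))
            buckets[nearest].append(v)
        centroids = [sum(b) / len(b) if b else c for b, c in zip(buckets, centroids)]
    return centroids


def share_weights(weights, centroids):
    """Replace every surviving weight by its nearest centroid; pruned weights stay zero."""
    return [
        0.0 if w == 0.0 else min(centroids, key=lambda c: abs(w - c))
        for w in weights
    ]


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(200)]

pruned = prune_by_magnitude(weights, sparsity=0.9)
codebook = kmeans_1d([w for w in pruned if w != 0.0], n_clusters=4)
compressed = share_weights(pruned, codebook)

print("nonzero weights:", sum(w != 0.0 for w in compressed))
print("distinct shared values:", len({w for w in compressed if w != 0.0}))
```

After these two steps each surviving weight can be stored as a small codebook index plus its position, which is where the storage savings come from.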


Wednesday, October 11, 2017

Slides and Videos: Deep RL Bootcamp 26-27 August 2017, Berkeley CA

Thanks to Pieter's twitter stream, I just noticed the slides and videos of the Deep RL Bootcamp that took place on 26-27 August 2017 at Berkeley. Enjoy :

Lectures

Core Lecture 1 Intro to MDPs and Exact Solution Methods -- Pieter Abbeel (video | slides)
Core Lecture 2 Sample-based Approximations and Fitted Learning -- Rocky Duan (video | slides)
Core Lecture 3 DQN + Variants -- Vlad Mnih (video | slides)
Core Lecture 4a Policy Gradients and Actor Critic -- Pieter Abbeel (video | slides)
Core Lecture 4b Pong from Pixels -- Andrej Karpathy (video | slides)
Core Lecture 5 Natural Policy Gradients, TRPO, and PPO -- John Schulman (video | slides)
Core Lecture 6 Nuts and Bolts of Deep RL Experimentation -- John Schulman (video | slides)
Core Lecture 7 SVG, DDPG, and Stochastic Computation Graphs -- John Schulman (video | slides)
Core Lecture 8 Derivative-free Methods -- Peter Chen (video | slides)
Core Lecture 9 Model-based RL -- Chelsea Finn (video | slides)
Core Lecture 10a Utilities -- Pieter Abbeel (video | slides)
Core Lecture 10b Inverse RL -- Chelsea Finn (video | slides)
Frontiers Lecture I: Recent Advances, Frontiers and Future of Deep RL -- Vlad Mnih (video | slides)
Frontiers Lecture II: Recent Advances, Frontiers and Future of Deep RL -- Sergey Levine (video | slides)
TAs Research Overviews (video | slides)


Tuesday, October 10, 2017

DiffuserCam: Lensless Single-exposure 3D Imaging

One word: Awesome !

DiffuserCam: Lensless Single-exposure 3D Imaging by Nick Antipa, Grace Kuo, Reinhard Heckel, Ben Mildenhall, Emrah Bostan, Ren Ng, Laura Waller

We demonstrate a compact and easy-to-build computational camera for single-shot 3D imaging. Our lensless system consists solely of a diffuser placed in front of a standard image sensor. Every point within the volumetric field-of-view projects a unique pseudorandom pattern of caustics on the sensor. By using a physical approximation and simple calibration scheme, we solve the large-scale inverse problem in a computationally efficient way. The caustic patterns enable compressed sensing, which exploits sparsity in the sample to solve for more 3D voxels than pixels on the 2D sensor. Our 3D voxel grid is chosen to match the experimentally measured two-point optical resolution across the field-of-view, resulting in 100 million voxels being reconstructed from a single 1.3 megapixel image. However, the effective resolution varies significantly with scene content. Because this effect is common to a wide range of computational cameras, we provide new theory for analyzing resolution in such systems.


Saturday, October 07, 2017

Slides: Université Cote d'Azur/UCA Deep Learning School, May 2017

At the recent France is AI event, Stephane mentioned to me that a Deep Learning School had been organized by (another) Stéphane and Frédéric this past June in Nice. Here are the slides:

Overview of Deep Learning today: Stéphane Canu (Deep_In_France, INSA-Rouen)
Deep Learning basics (Perceptron, MLP, backpropagation, etc): Frédéric Precioso (Deep_In_France, UCA)
Introduction to Convolutional Neural Networks (CNN): Frédéric Precioso (Deep_In_France, UCA)
Lecture on autoencoders and Restricted Boltzmann Machines: Soufiane Belharbi (Deep_In_France, INSA-Rouen)
Generative Adversarial Networks: Mélanie Ducoffe (Deep_In_France, UCA)
Transfer learning with CNN: Soufiane Belharbi (Deep_In_France, INSA-Rouen)
Lecture on Recurrent Neural Networks: Rémi Cadène (Deep_In_France, LIP6)
Visual Question Answering: Rémi Cadène (Deep_In_France, LIP6)
Attention models for image captioning: Jakob Verbeek (Deep_In_France, INRIA Rhone-Alpes)
Optimization of Deep Networks: Mélanie Ducoffe/Soufiane Belharbi/Frédéric Precioso (Deep_In_France)
Convolutional neural fabrics: Jakob Verbeek (Deep_In_France, INRIA Rhone-Alpes)
Active learning for Deep Nets: Mélanie Ducoffe (Deep_In_France, UCA)


Thursday, October 05, 2017

France is AI, Program for Day One

So today I'll be at the France is AI conference. Here is the program for the day.

THURSDAY OCTOBER 5
8:30 AM - 9:00 AM | Breakfast
9:00 AM - 9:05 AM | Welcome + Introduction France is AI | Keynote | Paul Strachman
9:05 AM - 9:10 AM | Scikit-Learn: 2017 sprint + Future Roadmap | Keynote | Alexandre Gramfort (INRIA / Scikit-Learn)
9:10 AM - 9:15 AM | Deep in France - initiative | Keynote | Stéphane Canu (INSA Rouen / Deep in France)
9:15 AM - 9:35 AM | How Microsoft is thinking about AI? Random walk along the Microsoft AI Research Agenda | Keynote | Bernard Ourghanlian (Microsoft)
9:35 AM - 9:55 AM | Evolution of Computational Advertising | Keynote | Suju Rajan (Criteo Labs)
9:55 AM - 10:15 AM | On recent advances in machine listening and ambient sound detection | Keynote | Alain Rakotomamonjy (University Rouen)
10:15 AM - 10:35 AM | Mathematical Mysteries of Deep Neural Networks | Keynote | Stéphane Mallat (ENS)
10:35 AM - 11:00 AM | Networking and Coffee - Morning
11:00 AM - 11:20 AM | Deep learning for sequence-to-sequence transformations | Keynote | Eric Gaussier (LIG Grenoble)
11:20 AM - 11:30 AM | Attributes Extraction in e-Commerce | Keynote | Simona Maggio (Rakuten / PriceMinister)
11:30 AM - 11:50 AM | Deep Reinforcement Learning for Prediction under Budgeted Constraints | Keynote | Ludovic Denoyer (LIP6/UPMC)
11:50 AM - 12:10 PM | Visual Question Answering: a new Vision and Language task | Keynote | Matthieu Cord (LIP6)
12:10 PM - 12:15 PM | Blockchain orchestrated Machine Learning for sensitive data | Firepitch | Mathieu Galtier (Rythm)
12:15 PM - 12:20 PM | Building an AI first product | Firepitch | David Bessis (Tinyclues)
12:20 PM - 12:25 PM | A neural coreference system for conversational agents | Firepitch | Thomas Wolf (Hugging Face)
12:25 PM - 12:30 PM | Detection of phishing webpages with machine learning | Firepitch | Paolo Pinto (Vade Secure)
12:30 PM - 2:00 PM | Lunch break
2:00 PM - 2:20 PM | AI Hardware: the next big conquest? | Panel - Julie Josse | Julien Demouth (Nvidia), Ludovic Larzul (Mipsology), Igor Carron (LightOn), Damien Querlioz (Universite Paris Sud)
2:20 PM - 2:40 PM | AI at Scale @Twitter: ranking the timelines | Keynote | Nicolas Koumchatzky (Twitter)
2:40 PM - 3:00 PM | Embedded AI: Bringing AI on the devices? | Panel - Igor Carron | Merouane Debbah (Huawei), Thibault de La Fresnaye (Orange), Joseph Dureau (Snips.ai)
3:00 PM - 3:05 PM | Predictive AI: The Future of Human to Machine Interactions | Firepitch | Grégory Grefenstette (Biggerpan)
3:05 PM - 3:10 PM | Generating customer responses w/ GANs | Firepitch | Grégory Renard (xBrain)
3:10 PM - 3:15 PM | Scaling up to the full customer lifecycle in financial services | Firepitch | Nicolas Meric (Dreamquark)
3:15 PM - 3:35 PM | Generative Modeling with Optimal Transport | Keynote | Marco Cuturi (ENSAE)
3:35 PM - 3:55 PM | Data challenges with modularization and code submission | Keynote | Balazs Kegl (CNRS / Paris-Saclay Center for Data Science)
3:55 PM - 4:15 PM | Teaching Machines to Learn Autonomously | Keynote | Emmanuel Dupoux (ENS / EHESS)
4:15 PM - 4:40 PM | Networking and Coffee - Afternoon
4:40 PM - 5:00 PM | Optimization for machine learning: from single to multiple machines | Keynote | Francis Bach (INRIA / ENS)
5:00 PM - 5:25 PM | Deep Learning: current limits and future perspectives | Keynote | François Chollet (Google / Keras)
5:25 PM - 5:45 PM | The future of AI: discussion on promising approaches | Panel - Igor Carron | Francis Bach (INRIA / ENS), Olivier Grisel (INRIA / Scikit-Learn), Bernard Ourghanlian (Microsoft)
5:45 PM - 6:05 PM | AI: France roadmap (OPECST) | Fireside chat - Olivier Ezratty - Igor Carron | Cédric Villani (Assemblée Nationale)
6:05 PM - 6:10 PM | France is AI Closing Day 1 | Keynote | Paul Strachman, Igor Carron, Julie Josse
6:10 PM - 6:25 PM | Closing | Keynote | Mounir Mahjoubi (Secrétariat au Numérique)


Wednesday, October 04, 2017

Nuit Blanche in Review (September 2017)

Much has happened since the last Nuit Blanche in Review (August 2017). This past month, we saw The Demise of Cassini and the Rise of Artificial Intelligence but also an implementation, a hardware implementation, several in-depth preprints, an overview, several videos and a job announcement. Enjoy !

Implementations

Stabilizing GAN Training with Multiple Random Projections - implementation -

Hardware:

CSHardware: InView Multi-Pix Camera Demonstrates 1FPS SWIR Imaging

In-depth:

Overviews

Overviews: Deep Learning on Reinforcement Learning, Music Generation and Recommender Systems

Meetings/Meetups/CfP

Videos:

Jobs:

CSjob: Assistant Professor, High Dimensional Data Analysis, University of Colorado at Boulder


Sunday, October 01, 2017

Sunday Morning Insight: The Demise of Cassini and the Rise of Artificial Intelligence

In the past few weeks, two events connected to the whole Artificial Intelligence narrative occurred: Cassini plunged into Saturn while NIPS conference registrations closed in an unheard-of, record amount of time.

Pretty often, Artificial Intelligence narratives revolve around one factor and then explain away why the field cannot go on because that factor is not new, not good anymore, not whatever... That sort of narrative was pushed by Tech Review when it mentioned that AI may be plateauing because "Neural Networks" are thirty or more years old. Yes, neural networks have existed for a long time, and no, AI is not going to plateau, because it actually hinges on several factors, not one.

This is the story of one of these factors.

It started thanks in large part to Space exploration, and no, we are not talking about the awesome Deep Space 1 spacecraft [1] even though much like that spacecraft, that story also started at JPL.

When Dan Goldin became NASA administrator, he pushed a series of constraints on new space missions that had the whole NASA organisation integrate newer, better technologies faster in the design of less expensive space missions [2]. In fact, Cassini was seen as the mission to avoid in the future. In the story told on the JPL website, under the "Faster, Better, Cheaper" mantra, one can read:

Without finding ways to cut costs substantially, JPL faced extinction. The NASA budget would not support enough Cassini-scale missions to keep the lab operating.

The vast majority of cameras in space missions had, until then, used CCD devices. While the technology provided high quality images, it was brittle. For one, it required cooling to get a good signal-to-noise ratio. That cooling in turn meant that the imagers required more power to operate and could fail more systematically during launch phases. It was also a line-based design, meaning that you could lose an entire line of pixels at once. In short, it was fragile and, more importantly, the technology made the sensor heavier, a cardinal sin in Space Exploration.

Then came Eric Fossum. This is what you can read on his Wikipedia entry:

....One of the instrument goals was to miniaturize charge-coupled device (CCD) camera systems onboard interplanetary spacecraft. In response, Fossum invented a new CMOS active pixel sensor (APS) with intra-pixel charge transfer camera-on-a-chip technology, now just called the CMOS Image Sensor or CIS (active pixel sensors without intra-pixel charge transfer were described much earlier, by Noble in 1968). As part of Goldin's directive to transfer space technology to the public sector whenever possible, Fossum led the CMOS APS development and subsequent transfer of the technology to US industry, including Eastman Kodak, AT&T Bell Labs, National Semiconductor and others. Despite initial skepticism by entrenched CCD manufacturers, the CMOS image sensor technology is now used in almost all cell-phone cameras, many medical applications such as capsule endoscopy and dental x-ray systems, scientific imaging, automotive safety systems, DSLR digital cameras and many other applications.

Since CMOS sensors rely on the same fabrication process as computing chips, the technology scaled quickly and became very cheap. In fact, the creation of the massive image and video datasets hosted by the likes of YouTube then Google, Flickr then Yahoo!, Instagram then Facebook, and most other internet companies was uniquely enabled by the arrival of CMOS in consumer imaging, first in cameras and then in smartphones.

The size of these datasets made it possible to train very large neural networks, well beyond toy models. New algorithm developments on top of neural networks and large datasets brought error rates down to the point where large internet companies could soon begin to use these techniques on the data that had been collecting on their servers since the early 2000s.

On September 14th 2017, Cassini was downloading its last CCD-based images, and registration for NIPS, one of the best-known ML/DL/AI conferences, had sold out three months ahead of the meeting: a feat unheard of for a specialist conference. The conference will be held in Long Beach, not far from JPL where, somehow, the sensor that started it all was born.


One more thing: Eric Fossum is building the QIS, the next-generation imaging sensor [3] that will produce more pixels...

Notes.

[1] John Garret story of LISP at JPL

[2] The TRL scale that everyone uses these days (and that was translated into French for the first time here on Nuit Blanche) was born around that time so that NASA could evaluate which technologies could be integrated faster into space missions.

[3] Check our discussion on QIS and compressive sensing.


Saturday Morning Videos: Cognitive Computational Neuroscience (CCN) 2017

Thanks to Dustin and Sri for mentioning these videos existed !

Videos from Cognitive Computational Neuroscience (CCN) 2017

Wednesday, September 6, 2017

Welcome

Welcome: Why are we meeting?
Thomas Naselaris, Medical University of South Carolina (MUSC)

Kavli Opening Keynote Session: Explaining Cognition, Brain Computation, and Intelligent Behavior

Cognitive Science:

The cognitive science perspective: Reverse-engineering the mind
Josh Tenenbaum, Massachusetts Institute of Technology

Computational Neuroscience:
Nicole Rust, University of Pennsylvania

Artificial Intelligence:

How does the brain learn so much so quickly?
Yann LeCun, Facebook, New York University

Panel Discussion:

Nancy Kanwisher, Massachusetts Institute of Technology
Josh Tenenbaum, Massachusetts Institute of Technology
Nicole Rust, University of Pennsylvania
Yann LeCun, Facebook, New York University
Bruno Olshausen, UC-Berkeley
Jackie Gottlieb, Columbia University
Moderator: Jim DiCarlo, Massachusetts Institute of Technology

Contributed Talks: Sensation & Perception

Neural networks for efficient Bayesian decoding of natural images from retinal neurons
Eleanor Batty, Columbia University

A reverse correlation test of computational models of lightness perception
Richard Murray, York University

Keynote 1

I have not decided yet
Michael N. Shadlen, Columbia University

Contributed Talks: Attention & Judgement

Understanding biological visual attention using convolutional neural networks
Grace Lindsay, Columbia University

A dichotomy of visual relations or the limits of convolutional neural networks
Matthew Ricci & Junkyung Kim, Brown University

Computation of human confidence reports in decision-making with multiple alternatives
Hsin-Hung Li, New York University

Thursday, September 7, 2017

Tutorial talks

Cognitive Science:

Modeling of behavior
Wei Ji Ma, New York University

Computational Neuroscience:

Tutorial on computational neuroscience
Ruben Coen-Cagli, Albert Einstein College of Medicine

Artificial Intelligence:

Artificial Intelligence with connections to neuroimaging
Alona Fyshe, University of Victoria

Contributed Talks: Memorability & predictive coding

Unconscious perception of scenes reveals a perceptual neural signature of memorability
Yalda Mohsenzadeh, Massachusetts Institute of Technology

Predictive coding & neural communication delays produce alpha-band oscillatory impulse response functions
Rufin VanRullen, Université Toulouse

Keynote 2

How we understand others’ emotions
Rebecca Saxe, Massachusetts Institute of Technology

Contributed Talks: Localization & task learning

Emergence of grid-like representations by training recurrent neural networks to perform spatial localization
Chris Cueva, Columbia University

Modular task learning in an embodied two-dimensional visual environment
Kevin T. Feigelis, Stanford University

Distributed mechanisms supporting information search and value-based choice in prefrontal cortex
Laurence Hunt, University of Oxford

Friday, September 8, 2017

Keynote 3

Probabilistic models of sensorimotor control
Daniel Wolpert, University of Cambridge

Contributed Talks: Reinforcement learning & control

Surprise, surprise: Cholinergic and dopaminergic neurons encode complementary forms of reward prediction errors
Fitz Sturgill, Cold Spring Harbor Laboratory

Hippocampal pattern separation supports reinforcement learning
Ian Ballard, Stanford University

Offline replay supports planning: fMRI evidence from reward revaluation
Ida Momennejad, Princeton University

Keynote 4

How we learn and represent the structure of tasks
Yael Niv, Princeton University

Contributed Talks: Exploration & exploitation

Amygdala drives value and exploration signals in striatum and orbitofrontal cortex
Vincent Costa, National Institute of Mental Health

History effect in a minimalistic explore-exploit task
Mingyu Song, Princeton University

Keynote 5

Strategic decision-making in the human subcortex measured with UHF-MRI
Birte Forstmann, University of Amsterdam

Contributed Talks: Learning in deep neural networks

Deep learning with segregated dendrites
Blake Aaron Richards, University of Toronto Scarborough

When do neural networks learn sequential solution in short-term memory tasks?
Emin Orhan, New York University

Closing Keynote Debate: What is the best level to model the mind-brain?

Perspective 1: Cognitive Models:

Bridging the computational and algorithmic levels
Tom Griffiths, UC-Berkeley

Perspective 2: Deep Learning and the Cognitive Brain:

Deep learning and backprop in the brain
Yoshua Bengio, Université de Montréal

Panel Discussion:

Tom Griffiths, UC-Berkeley
Yoshua Bengio, Université de Montréal
Anne Churchland, Cold Spring Harbor Laboratory
Aude Oliva, Massachusetts Institute of Technology
Tony Movshon, New York University
Moderator: Jonathan Cohen, Princeton University

Closing

Closing Remarks
Nikolaus Kriegeskorte, Columbia University