
## Saturday, March 31, 2018

### Saturday Morning Video: On Characterizing the Capacity of Neural Networks using Algebraic Topology by William Guss

Here is a video from Microsoft Research by William Guss. I love it because it seems to connect to some of the earlier work we have seen in compressive sensing and related fields (recently here, or earlier here or here).

Much like one of the commenters on YouTube, I would have preferred fewer questions during the presentation, but it is a fascinating subject. From Guss's website:

The learnability of different neural architectures can be characterized directly by computable measures of data complexity. In this paper, we reframe the problem of architecture selection as understanding how data determines the most expressive and generalizable architectures suited to that data, beyond inductive bias. After suggesting algebraic topology as a measure for data complexity, we show that the power of a network to express the topological complexity of a dataset in its decision region is a strictly limiting factor in its ability to generalize. We then provide the first empirical characterization of the topological capacity of neural networks. Our empirical analysis shows that at every level of dataset complexity, neural networks exhibit topological phase transitions. This observation allowed us to connect existing theory to empirically driven conjectures on the choice of architectures for fully-connected neural networks.
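The abstract above measures data complexity through homology. Just to make the idea concrete, here is a toy sketch (mine, not the paper's machinery) of the simplest such invariant, Betti-0: the number of connected components of a point cloud at a given scale, computed with union-find over an epsilon-neighborhood graph:

```python
import math
import random

def betti0(points, eps):
    # Number of connected components of the graph linking points closer
    # than eps: a crude stand-in for the Betti-0 number at scale eps.
    n = len(points)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

random.seed(0)
# Two well-separated Gaussian blobs: two components at a moderate scale.
cloud = [(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(50)] \
      + [(random.gauss(5, 0.1), random.gauss(0, 0.1)) for _ in range(50)]
print(betti0(cloud, 1.0))   # two blobs -> 2
print(betti0(cloud, 10.0))  # everything merges -> 1
```

At the right scale the two blobs register as two components; crank epsilon up and everything merges into one. Persistent homology tracks exactly how such counts change across scales, which is the kind of data complexity measure the talk is about.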

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

## Friday, March 30, 2018

### CfP: OIP'21: Optics for information processing in the 21st century, 23-26 May 2018, Florence (Italy)

Sylvain let me know of the CfP for this meeting, "OIP'21: Optics for information processing in the 21st century", which will take place in Florence (Italy) on 23-26 May 2018.

Scope of the conference
This very focused workshop aims at bringing together key players in the field to discuss recent advances in:
• optical computing
• computational imaging
• optical information processing
• future challenges and opportunities.
It aims at connecting the communities of optics, physics, mathematics and computer science in an open atmosphere, with ample time for lectures and extended discussions, favoring the exchange of ideas and the emergence of possible collaborations.
Confirmed Invited Speakers
• Alexander Gaeta (Columbia)
• Alexander Jesacher (Medical U. Innsbruck)
• Anne Sentenac (CNRS – Institut Fresnel)
• Christophe Moser (EPFL)
• David Miller (Stanford)
• Demetri Psaltis (EPFL)
• Florent Krzakala  (ENS Paris)
• Georges Barbastathis (MIT)
• Jason Fleischer (Princeton)
• Michal Lipson (Columbia)
• Monika Ritsch-Marte (Medical U. Innsbruck)
• Ori Katz (Hebrew University)
• Riccardo Sapienza (Imperial College)
• Shaya Fainman (U.C. San Diego)
Registration
Beyond the invited talks, there is a limited number of slots for contributed posters, preferentially for PhD students and postdoctoral researchers. Accommodation in shared rooms will be provided at the Villa for students and postdocs (on a first-come, first-served basis). Participation in the conference with lodging on your own is of course also possible.
Registration costs:
• full registration single room *invited only* (conference fees, lodging 3 nights in single room,  meals) : 600 euros
• full registration shared room (conference fees, lodging 3 nights in double or triple room,  meals) : 500 euros
• registration without lodging (conference fees, meals) : 300 euros
• accompanying persons (meals) : 125 euros
Registration is open (payment will be available shortly).
The Venue
The Villa Finaly is situated on the heights above Florence, in beautiful Tuscany, Italy. The villa belongs to the "chancellerie des universités de Paris", which uses it to host scientific conferences and events. The conference sessions, as well as most meals, will take place at the villa, which also has several rooms to host most of the conference participants. Additional participants can easily find accommodation in the numerous hotels of downtown Florence and easily reach the villa by bus.
see the official website of the Villa Finaly
Organizers
Sylvain Gigan - Professor Sorbonne Université Paris
email: sylvain.gigan@lkb.ens.fr
Rafael Piestun - Professor, University of Colorado, Boulder, USA

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Thursday, March 29, 2018

### L'intelligence Artificielle au service de l'humain / AI for Humanity

English summary: Cédric Villani, a Fields Medalist and now a member of the French parliament, is issuing his commission's report on AI today. The AI for Humanity event around it is taking place at the Collège de France. The President of France will also present the French strategy on AI this afternoon. The Villani commission report is here.

Several organizers of meetups around AI, including those of the Paris Machine Learning meetup,  were invited to contribute to this discussion.

Cédric Villani, médaillé Fields et parlementaire, rend public aujourd'hui le rapport de la commission sur l'IA. L'événement "L'intelligence Artificielle au service de l'humain" se passe au Collège de France. Le président Macron présentera la stratégie française sur l'intelligence artificielle cet après-midi.

Le streaming est ici en Français

The streaming is here in English:

Le rapport Villani est ici.

## Thursday, March 22, 2018

### Around the blogs in 78 hours (and 7 million page views on Nuit Blanche)

I haven't gone around reading blogs on the interweb in a while; here is the good stuff I have been missing:

Ben
1. Make It Happen. Reinforcement Learning as prescriptive analytics.
2. Total Control. Reinforcement Learning as Optimal Control.
3. The Linearization Principle. If a machine learning algorithm does crazy things when restricted to linear models, it’s going to do crazy things on complex nonlinear models too.
4. The Linear Quadratic Regulator. A quick intro to LQR and why it is a great baseline for benchmarking Reinforcement Learning.
5. A Game of Chance to You to Him Is One of Real Skill. Laying out the rules of the RL Game and comparing to Iterative Learning Control.
6. The Policy of Truth. Policy Gradient is a Gradient Free Optimization Method.
7. A Model, You Know What I Mean? Nominal control and the power of models.
9. Clues for Which I Search and Choose. Simple methods solve apparently complex RL benchmarks.
Dirk
Sanjeev et al
Djalil
Sebastien
Otoro

Fabian
On Nuit Blanche, we passed the 7 million page views "milestone", woohoo !

## Wednesday, March 21, 2018

### Stabilizing Embedology: Geometry-Preserving Delay-Coordinate Maps

Chris just sent me the following:

Hi Igor-
I hope you are well. I wanted to alert you that our paper on delay-coordinate maps and Takens' embeddings has finally appeared.
Eftekhari, Armin, Han Lun Yap, Michael B. Wakin, and Christopher J. Rozell. "Stabilizing embedology: Geometry-preserving delay-coordinate maps." Physical Review E 97, no. 2 (2018): 022222.
http://dx.doi.org/10.1103/PhysRevE.97.022222
preprint:
http://arxiv.org/pdf/1609.06347
You had mentioned a much earlier preliminary result on your blog but this is the full and final result. It uses the tools familiar to this community (random measurements, stable embeddings) to address a fundamental observability result about nonlinear (perhaps even chaotic) dynamical systems from the physics community. The key question is "how much information is there in a time series measurement about the dynamical system that created it?". I think this result is a unique convergence of different fields, and our previous results analyzing recurrent neural networks were a distinct outgrowth of working on this problem.
regards,
chris
Thanks Chris for the update !

Delay-coordinate mapping is an effective and widely used technique for reconstructing and analyzing the dynamics of a nonlinear system based on time-series outputs. The efficacy of delay-coordinate mapping has long been supported by Takens' embedding theorem, which guarantees that delay-coordinate maps use the time-series output to provide a reconstruction of the hidden state space that is a one-to-one embedding of the system's attractor. While this topological guarantee ensures that distinct points in the reconstruction correspond to distinct points in the original state space, it does not characterize the quality of this embedding or illuminate how the specific parameters affect the reconstruction. In this paper, we extend Takens' result by establishing conditions under which delay-coordinate mapping is guaranteed to provide a stable embedding of a system's attractor. Beyond only preserving the attractor topology, a stable embedding preserves the attractor geometry by ensuring that distances between points in the state space are approximately preserved. In particular, we find that delay-coordinate mapping stably embeds an attractor of a dynamical system if the stable rank of the system is large enough to be proportional to the dimension of the attractor. The stable rank reflects the relation between the sampling interval and the number of delays in delay-coordinate mapping. Our theoretical findings give guidance to choosing system parameters, echoing the trade-off between irrelevancy and redundancy that has been heuristically investigated in the literature. Our initial result is stated for attractors that are smooth submanifolds of Euclidean space, with extensions provided for the case of strange attractors.
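To see what a delay-coordinate map looks like in practice, here is a toy sketch (mine, not from the paper) using the Hénon map, where we observe only the scalar x coordinate and rebuild a two-delay vector. For this particular system the delay vector happens to be an exact invertible linear image of the hidden state, so distances are preserved up to that map's conditioning, which is the flavor of a stable embedding:

```python
import math

# Henon map: x' = 1 - a*x**2 + y, y' = b*x (chaotic at a=1.4, b=0.3).
def henon_series(n, a=1.4, b=0.3):
    x, y = 0.1, 0.1
    states, obs = [], []
    for _ in range(n):
        x, y = 1 - a * x * x + y, b * x
        states.append((x, y))
        obs.append(x)            # we only record the scalar measurement x
    return states, obs

states, obs = henon_series(2000)

# Delay-coordinate map with two delays: v_t = (x_t, x_{t-1}).
delays = [(obs[t], obs[t - 1]) for t in range(1, len(obs))]

# Here y_t = b * x_{t-1}, so v_t is the hidden state (x_t, y_t) under the
# invertible linear map (x, y) -> (x, y / b): geometry is preserved up to
# that map's conditioning.
for t, s in [(10, 11), (100, 900), (500, 1500)]:
    print(f"state dist {math.dist(states[t], states[s]):.3f}  "
          f"delay dist {math.dist(delays[t - 1], delays[s - 1]):.3f}")
```

For generic systems and delay counts no such closed-form correspondence exists, and quantifying when distances are still approximately preserved is exactly what the stable-embedding result above addresses.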


## Tuesday, March 20, 2018

### Sparse Representations and Compressed Sensing Workshop, March 23rd 2018, Inria Paris

Mark just sent me the following the other day:

Dear Igor,
We thought that readers of Nuit Blanche may be interested in this free one-day workshop in Sparse Representations and Compressed Sensing, being held in Paris next week. There is also an opportunity for PhD students and Early Career Researchers to bring a poster (more below). Best wishes,
Mark
Sure Mark. Here is the announcement:

================================================================

Sparse Representations and Compressed Sensing Workshop

Inria Paris
2 Rue Simone IFF, 75012 Paris, France
================================================================

This one-day workshop will bring together researchers working in the area of sparse representations and compressed sensing to find out about the latest developments in theory and applications of these approaches, and to explore directions for future research.

The concept of sparse representations deals with systems of linear equations where only a small number of the coefficients are non-zero. The technique of compressed sensing aims to efficiently sense and reconstruct a signal from few measurements, typically by exploiting the sparse structure of the underlying representation. These techniques have proved very popular over the last decade or so, with new theoretical developments, and successful applications in areas such as hyperspectral imaging, brain imaging, audio signal processing and graph signal processing.
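As a toy illustration of the reconstruction problem described above (my sketch, not material from the workshop), here is orthogonal matching pursuit recovering a 4-sparse signal from 40 random measurements in dimension 128:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 128, 40, 4            # ambient dimension, measurements, sparsity

x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)   # random sensing matrix
y = A @ x                                      # m << n linear measurements

# Orthogonal Matching Pursuit: greedily add the column most correlated
# with the residual, then refit by least squares on the current support.
support, r = [], y.copy()
for _ in range(k):
    support.append(int(np.argmax(np.abs(A.T @ r))))
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    r = y - A[:, support] @ coef

x_hat = np.zeros(n)
x_hat[support] = coef
print("relative error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```

With Gaussian measurements and this much oversampling, recovery is essentially exact; real applications trade this simplicity for structured measurement operators, noise robustness, and far larger scales.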

This one-day workshop, organized by the SpaRTaN and MacSeNet Initial/Innovative Training Networks*, will include invited keynote talks by Karin Schnass (Universität Innsbruck, Austria) and Jean-Luc Starck (CEA-Saclay, France), oral presentations and posters. The talks and posters will include theoretical advances in sparse representations, dictionary learning and compressed sensing, as well as advances in areas such as brain imaging and MRI, hyperspectral imaging, audio and visual signal processing, inverse imaging problems, and graph-structured signals.

PhD students and Early Career Researchers wishing to bring along a poster of their work for the poster session are encouraged to contact macsenet@surrey.ac.uk with a brief abstract of their work. Posters do not need to be novel: this is an opportunity to showcase work and discuss it with others in the field. There will be an opportunity for discussions to continue after the end of the formal workshop.

* European Union's Seventh Framework Programme (FP7-PEOPLE-2013-ITN) under grant agreement n° 607290 SpaRTaN and H2020 Framework Programme (H2020-MSCA-ITN-2014) under grant agreement n° 642685 MacSeNet

--
Prof Mark D Plumbley
Professor of Signal Processing
Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK
Email: m.plumbley@surrey.ac.uk


## Monday, March 19, 2018

### Institute for Advanced Study - Princeton University Joint Symposium on "The Mathematical Theory of Deep Neural Networks", Tuesday March 20th

Adam just sent me the following:
Hi Igor,

I'm a long-time reader of your blog from back in the day when compressed sensing was still up-and-coming. I wanted to bring to your attention a workshop a few of my fellow post-docs at Princeton and I are hosting this Tuesday at the Princeton Neuroscience Institute: The "Institute for Advanced Study - Princeton University Joint Symposium on 'The Mathematical Theory of Deep Neural Networks'". I thought that this symposium would be of interest to both yourself and your readers. Since space is limited, we are going to be live-streaming the talks online (and will post videos once the dust settles). The link to the live-stream is available on the symposium website:

Cheers!

----------------------------
Post-doctoral associate
Princeton Neuroscience Institute
Princeton, NJ, 08550
Awesome, Adam ! I love the streaming bit. Here is the announcement and the program

Institute for Advanced Study - Princeton University Joint Symposium on "The Mathematical Theory of Deep Neural Networks"
Tuesday March 20th
PNI  Lecture Hall A32
This event will be live-streamed at: https://mediacentrallive.princeton.edu. Additionally, video recordings of the talks will be posted after the event.
Registration is now open: register here.
Recent advances in deep networks, combined with open, easily-accessible implementations, have moved the field's empirical results far faster than formal understanding. The lack of rigorous analysis for these techniques limits their use in addressing scientific questions in the physical and biological sciences, and prevents systematic design of the next generation of networks. Recently, long-past-due theoretical results have begun to emerge. These results, and those that will follow in their wake, will begin to shed light on the properties of large, adaptive, distributed learning architectures, and stand to revolutionize how computer science and neuroscience understand these systems.
This intensive one-day technical workshop will focus on state-of-the-art theoretical understanding of deep learning. We aim to bring together researchers from the Princeton Neuroscience Institute (PNI) and Center for Statistics and Machine Learning (CSML) at Princeton University and the theoretical machine learning group at the Institute for Advanced Study (IAS) interested in more rigorously understanding deep networks, to foster increased discussion and collaboration across these intrinsically related groups.

 10:00-10:15: Adam Charles (PNI) "Introductory remarks"

10:15-11:15: Sanjeev Arora (IAS) "Why do deep nets generalize, that is, predict well on unseen data?"
11:15-12:15: Sebastian Musslick (PNI) "Multitasking Capability Versus Learning Efficiency in Neural Network Architectures"
12:15-01:30: Lunch
01:30-02:30: Joan Bruna (NYU) "On the Optimization Landscape of Neural Networks"
02:30-03:30: Andrew Saxe (Harvard) "A theory of deep learning dynamics: Insights from the linear case"
03:30-04:00: Break
04:00-05:00: Anna Gilbert (U Mich) "Towards Understanding the Invertibility of Convolutional Neural Networks"
05:00-06:00: (IAS) "Expressiveness of Convolutional Networks via Hierarchical Tensor Decompositions"
06:00-06:15: Michael Shvartsman and Ahmed El Hady (PNI) "Outgoing remarks"
06:15-08:00: Reception


## Friday, March 16, 2018

### Gradients explode - Deep Networks are shallow - ResNet explained

So last night at the Paris Machine Learning meetup, we had the good folks from Snips announcing the release/open sourcing of their Natural Language Understanding code. Joseph also mentioned that after searching over many architectures, a simple single-layer CRF model did as well as other commercial models. It's NLP, so the representability issue has already been parsed. In a different corner of the galaxy, the following paper suggests that the exploding gradient problem has not actually been solved, and that ResNets circumvent it in part by rendering these deep networks effectively shallower.

Abstract: Whereas it is believed that techniques such as Adam, batch normalization and, more recently, SeLU nonlinearities "solve" the exploding gradient problem, we show that this is not the case and that in a range of popular MLP architectures, exploding gradients exist and that they limit the depth to which networks can be effectively trained, both in theory and in practice. We explain why exploding gradients occur and highlight the *collapsing domain problem*, which can arise in architectures that avoid exploding gradients. ResNets have significantly lower gradients and thus can circumvent the exploding gradient problem, enabling the effective training of much deeper networks, which we show is a consequence of a surprising mathematical property. By noticing that *any neural network is a residual network*, we devise the *residual trick*, which reveals that introducing skip connections simplifies the network mathematically, and that this simplicity may be the major cause for their success.
TL;DR: We show that in contrast to popular wisdom, the exploding gradient problem has not been solved and that it limits the depth to which MLPs can be effectively trained. We show why gradients explode and how ResNet handles them.
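The mechanics behind the abstract can be caricatured with scalars: in a depth-L chain the input gradient is a product of per-layer derivatives, so any per-layer factor away from 1 compounds exponentially, while a skip connection keeps each factor near 1. A minimal illustration (mine, with made-up numbers, not the paper's analysis):

```python
# Toy scalar picture: the gradient through a depth-L chain is a product of
# per-layer derivatives, so a per-layer factor away from 1 compounds
# exponentially. A skip connection makes each factor 1 + (small), taming it.
L, w, scale = 50, 1.1, 0.01

grad_plain = w ** L                    # plain chain h -> w*h (ReLU active region)
grad_residual = (1 + scale * w) ** L   # residual chain h -> h + scale*(w*h)

print(f"plain chain, depth {L}: gradient factor {grad_plain:.1f}")        # ~117.4
print(f"residual chain, depth {L}: gradient factor {grad_residual:.2f}")  # ~1.73
```

The same multiplicative structure with a factor below 1 gives vanishing gradients instead, which is why "significantly lower gradients" in ResNets is the interesting claim rather than an obvious one.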

In this work we propose a novel interpretation of residual networks showing that they can be seen as a collection of many paths of differing length. Moreover, residual networks seem to enable very deep networks by leveraging only the short paths during training. To support this observation, we rewrite residual networks as an explicit collection of paths. Unlike traditional models, paths through residual networks vary in length. Further, a lesion study reveals that these paths show ensemble-like behavior in the sense that they do not strongly depend on each other. Finally, and most surprising, most paths are shorter than one might expect, and only the short paths are needed during training, as longer paths do not contribute any gradient. For example, most of the gradient in a residual network with 110 layers comes from paths that are only 10-34 layers deep. Our results reveal one of the key characteristics that seem to enable the training of very deep networks: Residual networks avoid the vanishing gradient problem by introducing short paths which can carry gradient throughout the extent of very deep networks.
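The path-counting argument above is easy to reproduce numerically. A small sketch (mine; the per-module gradient factor is an illustrative assumption, not a measured value): unrolling n residual blocks yields 2^n paths, C(n, k) of which traverse exactly k modules, and if each traversed module scales the gradient by m < 1, the gradient mass carried by length-k paths is proportional to C(n, k) * m^k:

```python
from math import comb

# n residual blocks unroll into 2**n paths; C(n, k) of them pass through
# exactly k residual modules. If each traversed module scales the gradient
# by a factor m < 1, length-k paths carry mass proportional to C(n, k)*m**k.
n, m = 54, 0.3          # ~ResNet-110 block count; m is an illustrative guess
mass = [comb(n, k) * m ** k for k in range(n + 1)]
total = sum(mass)
peak = max(range(n + 1), key=lambda k: mass[k])
frac = sum(mass[: peak + 11]) / total   # mass from lengths 0 .. peak+10
print(f"gradient mass peaks at path length {peak}")
print(f"fraction carried by paths of length <= {peak + 10}: {frac:.3f}")
```

With these made-up numbers the mass concentrates on paths roughly a dozen modules deep out of 54, qualitatively matching the paper's observation that most of the gradient in a 110-layer residual network comes from paths only 10-34 layers deep.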

Deep Residual Learning for Image Recognition by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers.
The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.


## Wednesday, March 14, 2018

### Paris Machine Learning Meetup #7 Season 5, Natural Language Understanding (NLU), AI for HR, decentralized AI

Tonight we will be hosted by Urban Linker! The video of the stream is here, and the presentation slides will be available here as well before the meetup. Stay tuned.

Program:

Joseph Dureau, Snips NLU (http://snips.ai), an Open Source, Private by Design alternative to cloud-based solutions

As part of its mission to expand the use of privacy-preserving AI solutions, the Snips team has decided to fully open source its solution for Natural Language Understanding. Snips NLU is an alternative to all cloud-based NLU solutions powering chatbots or voice assistants: Dialogflow, Luis.ai, Recast, Amazon Lex, Wit.ai, Watson, etc. You can run it on the edge or on premises, thus avoiding giving away your user data to a third party service.

Erik Mathiesen, Octavia.ai (https://octavia.ai), An AI Careers Advisor: Using Machine Learning to Predict Your Career Path

Octavia.ai specializes in smart solutions for recruitment. In this talk, I will describe how we use AI, and in particular Neural Networks and Deep Learning, to analyse and predict people's career paths. Having analysed millions of CVs, our system can predict from a person's CV what jobs are most likely to be next in the career path of that individual, as well as when the next job move is most likely to happen. By doing this, we enable companies to predict and find better candidates as well as forecast future hiring needs within an organisation. I will outline the technologies and techniques used in this application and give a few illustrative examples of its usage.

An open-source community focused on building technology to facilitate the decentralized ownership of data and intelligence.


## Monday, March 12, 2018

### Random projections in gravitational wave searches of compact binaries

Randomized matrix factorization and gravitational waves, this is cool!

Random projection (RP) is a powerful dimension reduction technique widely used in analysis of high dimensional data. We demonstrate how this technique can be used to improve the computational efficiency of gravitational wave searches from compact binaries of neutron stars or black holes. Improvements in low-frequency response and bandwidth due to detector hardware upgrades pose a data analysis challenge in the advanced LIGO era as they result in increased redundancy in template databases and longer templates due to higher number of signal cycles in band. The RP-based methods presented here address both these issues within the same broad framework. We first use RP for an efficient, singular value decomposition inspired template matrix factorization and develop a geometric intuition for why this approach works. We then use RP to calculate approximate time-domain correlations in a lower dimensional vector space. For searches over parameters corresponding to non-spinning binaries with a neutron star and a black hole, a combination of the two methods can reduce the total on-line computational cost by an order of magnitude over a nominal baseline. This can, in turn, help free-up computational resources needed to go beyond current spin-aligned searches to more complex ones involving generically spinning waveforms.
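The key fact the paper leans on is that a Gaussian random projection approximately preserves inner products, and hence matched-filter correlations. A minimal sketch (mine, with toy dimensions rather than actual waveform templates):

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 4096, 256              # full template length, sketch dimension

# Gaussian random projection: inner products, and hence matched-filter
# correlations, are preserved up to O(1/sqrt(d)) relative distortion.
R = rng.standard_normal((d, N)) / np.sqrt(d)

template = rng.standard_normal(N)                       # stand-in for a waveform
signal = 0.8 * template + 0.3 * rng.standard_normal(N)  # noisy observation

full = signal @ template              # correlation in the full space
proj = (R @ signal) @ (R @ template)  # correlation computed in the sketch
print(f"full-dim correlation: {full:.0f}, sketched: {proj:.0f}")
```

Projecting both the data and the template bank once, then correlating in the low-dimensional sketch, is where the computational savings come from; the paper combines this with an SVD-inspired factorization of the template matrix.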


## Saturday, March 10, 2018

### Saturday Morning Videos: NIPS2017 Meta Learning Symposium videos

Pieter Abbeel mentioned that the #nips2017 Meta Learning Symposium videos are now available here.

Thanks to Risto Miikkulainen, Quoc Le, Kenneth Stanley, and Chrisantha Fernando for organizing and getting the videos online !

Opening remarks, Quoc Le (slides, video)

Topic I: Evolutionary Optimization

• Evolving Multitask Neural Network Structure, Risto Miikkulainen (slides, video)
• Evolving to Learn through Synaptic Plasticity, Ken Stanley (slides, video)
• PathNet and Beyond, Chrisantha Fernando (slides, video)
Topic II: Bayesian Optimization

• Bayesian Optimization for Automated Model Selection, Roman Garnett (slides, video)
• Automatic Machine Learning (AutoML) and How To Speed It Up, Frank Hutter (slides, video)

• Contrasting Model- and Optimization-based Metalearning, Oriol Vinyals (slides, video)
• Population-based Training for Neural Network Meta-Optimization, Max Jaderberg (slides, video)
• Learning to Learn for Robotic Control, Pieter Abbeel (slides, video)
• On Learning How to Learn Learning Strategies, Juergen Schmidhuber (slides, video)

Topic IV: Reinforcement Learning

• Intrinsically Motivated Reinforcement Learning, Satinder Singh (video)
• Self-Play, Ilya Sutskever (slides, video)
• Neural Architecture Search, Quoc Le (slides, video)
• Multiple scales of reward and task learning, Jane Wang (slides, video)

Panel discussion, Moderator: Risto Miikkulainen, Panelists: Frank Hutter, Juergen Schmidhuber, Ken Stanley, Ilya Sutskever (video)

Photo credit: NASA, Starshine 2; more on project Starshine.


## Tuesday, March 06, 2018

### Randomness in Deconvolutional Networks for Visual Representation

So random weight networks seem to have better generalization properties, uh.

Toward a deeper understanding on the inner work of deep neural networks, we investigate CNN (convolutional neural network) using DCN (deconvolutional network) and randomization technique, and gain new insights for the intrinsic property of this network architecture. For the random representations of an untrained CNN, we train the corresponding DCN to reconstruct the input images. Compared with the image inversion on pre-trained CNN, our training converges faster and the yielding network exhibits higher quality for image reconstruction. It indicates there is rich information encoded in the random features; the pre-trained CNN may discard information irrelevant for classification and encode relevant features in a way favorable for classification but harder for reconstruction. We further explore the property of the overall random CNN-DCN architecture. Surprisingly, images can be inverted with satisfactory quality. Extensive empirical evidence as well as theoretical analysis are provided.
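The claim that random features are information-rich is easy to check in a linear-algebra caricature (mine, much simpler than the paper's CNN-DCN setup): with enough random ReLU units, the input can be recovered exactly from the features, since on the active units the feature map is just a tall random linear system:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 64, 256                       # input dimension, number of random units

W = rng.standard_normal((m, n))      # fixed random ("untrained") weights
x = rng.standard_normal(n)           # toy input
f = np.maximum(W @ x, 0.0)           # random ReLU features of x

# On the active units, f = W x exactly, so least squares on those rows
# recovers x: the random features retain full information about the input.
active = f > 0
x_hat, *_ = np.linalg.lstsq(W[active], f[active], rcond=None)
print("reconstruction error:", float(np.linalg.norm(x_hat - x)))
```

Roughly half the units are active, which still leaves many more equations than unknowns; in the paper's setting the "inverse" is a trained deconvolutional network rather than least squares, but the information-preservation intuition is the same.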
