Tuesday, April 16, 2019

Videos: Foundations of Data Science Boot Camp workshop, Aug. 27 – Aug. 31, 2018 (Simons Institute and Kavli Foundation)




Here are the videos and some slides. 

These two lectures recap some basics of Data Science. Topics include: high-dimensional geometry, concentration inequalities, Gaussian densities and mixtures, the Singular Value Decomposition (SVD) and its applications, Markov chains, rapid mixing, streaming, randomized algorithms for matrices, etc. For those familiar with these topics, proofs of basic theorems that differ from the usual ones may be of interest.

In this tutorial, I'll give an overview of near-optimal algorithms for regression, low-rank approximation, and a variety of other problems. The results are based on the sketch-and-solve paradigm, a tool for quickly compressing a problem to a smaller version of itself, on which one can then run a slower algorithm. This leads to the fastest known algorithms for fundamental machine learning and numerical linear algebra problems, which run in time proportional to the number of non-zero entries of the input.
Following on earlier lectures, I will discuss two additional ways to effectively sketch matrices for a variety of applications: sampling based on leverage scores, and the Subsampled Randomized Hadamard Transform.
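To make the sketch-and-solve idea concrete, here is a minimal least-squares example using a CountSketch-style sparse embedding. This is my own illustrative sketch (sizes, seeds and names are arbitrary choices, not code from the tutorial):

```python
import numpy as np

def countsketch(A, b, m, rng):
    """Apply a CountSketch-style sparse embedding with m rows to [A | b].

    Each row of A is hashed to one of m buckets with a random sign,
    so computing S @ A costs O(nnz(A)) time.
    """
    n = A.shape[0]
    buckets = rng.integers(0, m, size=n)      # hash h(i)
    signs = rng.choice([-1.0, 1.0], size=n)   # random sign s(i)
    SA = np.zeros((m, A.shape[1]))
    Sb = np.zeros(m)
    np.add.at(SA, buckets, signs[:, None] * A)
    np.add.at(Sb, buckets, signs * b)
    return SA, Sb

rng = np.random.default_rng(0)
n, d = 100_000, 20
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

# Sketch down to m << n rows, then solve the small problem exactly.
SA, Sb = countsketch(A, b, m=2_000, rng=rng)
x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x_sketch - x_exact))   # small: the sketch preserves the solution
```

The same solve-on-the-sketch pattern applies with the leverage-score and Subsampled Randomized Hadamard Transform sketches mentioned above; only the embedding changes.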

Stochastic Gradient Descent (SGD) is the basic first-order stochastic optimization algorithm behind powerful deep learning architectures that are becoming increasingly omnipresent in society. In this lecture, we motivate the use of stochastic first-order methods and recall some convergence results for SGD. We then discuss the notion of importance sampling for SGD and how it can improve the convergence rate. Finally, we discuss methods for making SGD more "robust" to hyper-parameters of the algorithm, such as the step size, using "on the fly" adaptive step size methods such as AdaGrad, and present some theoretical results.
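As a quick illustration of the adaptive step sizes mentioned above, here is a toy SGD loop with AdaGrad-style per-coordinate steps on least squares (all constants are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 5_000, 10
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
y = A @ x_star + 0.1 * rng.standard_normal(n)

x = np.zeros(d)
G = np.zeros(d)        # AdaGrad: running sum of squared gradients, per coordinate
eta, eps = 0.5, 1e-8

for t in range(20_000):
    i = rng.integers(n)                   # sample one data point uniformly
    g = (A[i] @ x - y[i]) * A[i]          # stochastic gradient of 0.5 * (a_i @ x - y_i)**2
    G += g * g
    x -= eta * g / (np.sqrt(G) + eps)     # per-coordinate adaptive step size

print(np.linalg.norm(x - x_star))         # close to the noise floor
```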

Sampling methods have long been ubiquitous in data science and machine learning. Recently, due to their complementary algorithmic and statistical properties, sampling and related sketching methods are central to randomized linear algebra and stochastic optimization. We'll provide an overview of structural properties central to key results in randomized linear algebra, highlighting how sampling and sketching methods can lead to improved results. This is typically achieved in quite different ways, depending on whether one is interested in worst-case linear algebra theory bounds, or whether one is using linear algebra for numerical implementations, statistical bounds, machine learning applications, uses in iterative stochastic optimization algorithms, etc. We'll provide an overview of how sampling and sketching methods are used in randomized linear algebra in each of these different cases, highlighting similarities and differences.
Sampling for Linear Algebra, Statistics, and Optimization I
Sampling for Linear Algebra, Statistics, and Optimization II
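To make the leverage-score sampling discussed above concrete, here is a small illustrative sketch using exact scores computed from a QR decomposition (in practice one approximates them much faster); sizes and names are my own choices, not code from the lectures:

```python
import numpy as np

def leverage_scores(A):
    """Exact leverage scores: squared row norms of an orthonormal basis of range(A)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q * Q, axis=1)

rng = np.random.default_rng(2)
n, d, m = 20_000, 10, 500
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.05 * rng.standard_normal(n)

p = leverage_scores(A)
p /= p.sum()                        # sample rows proportionally to leverage
idx = rng.choice(n, size=m, p=p)
w = 1.0 / np.sqrt(m * p[idx])       # rescaling keeps the sketched problem unbiased
x_samp, *_ = np.linalg.lstsq(w[:, None] * A[idx], w * b[idx], rcond=None)
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x_samp - x_full))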


Contrary to the scientific computing community, which has wholeheartedly embraced second-order optimization algorithms, the machine learning (ML) community has long nurtured a distaste for such methods, in favour of first-order alternatives. When implemented naively, however, second-order methods are clearly not computationally competitive. This, in turn, has unfortunately led to the conventional wisdom that these methods are not appropriate for large-scale ML applications. In this series of talks, we will provide an overview of various second-order optimization methods and their stochastic variants. We will demonstrate the theoretical properties as well as the empirical performance of a variety of efficient Newton-type algorithms for both convex and non-convex problems. In the process, we will highlight the disadvantages of first-order methods and, in their light, showcase the practical advantages offered by appropriate application of second-order information.
Stochastic Second Order Optimization Methods I
Stochastic Second Order Optimization Methods II
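For a flavor of the Newton-type ideas above, here is a minimal subsampled-Newton step for logistic regression, where the Hessian is built on a random minibatch. This is my own simplified sketch under toy assumptions, not one of the algorithms from the talks:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 10_000, 5
A = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = (A @ w_true + 0.5 * rng.standard_normal(n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(d)
lam = 1e-3                                  # small ridge term keeps the Hessian invertible
for it in range(20):
    mu = sigmoid(A @ w)
    grad = A.T @ (mu - y) / n + lam * w     # full gradient of the regularized log-loss
    S = rng.choice(n, size=500, replace=False)   # Hessian built on a subsample only
    D = mu[S] * (1.0 - mu[S])
    H = (A[S].T * D) @ A[S] / len(S) + lam * np.eye(d)
    w -= np.linalg.solve(H, grad)           # Newton step with the subsampled Hessian

cos = w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true))
print(cos)   # close to 1: the direction of w_true is recovered
```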

I will cover estimation, hypothesis testing, and confidence intervals from a frequentist perspective, as well as Bayesian statistical inference. Topics in classical asymptotics include consistency, maximum likelihood estimation, and asymptotic tests and confidence intervals. No statistics background is assumed.

In this tutorial, we'll discuss some phenomena in high dimension, including the distribution of mass (Brunn-Minkowski, Grünbaum), log-concavity (Prékopa-Leindler), extremal properties (Dvoretzky), concentration (Lévy, CLT) and isoperimetry (Poincaré, KLS).
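A quick numerical illustration of the concentration phenomenon mentioned above: the norm of a standard Gaussian vector in R^d concentrates around sqrt(d), with the relative spread shrinking like 1/sqrt(d) (snippet and sample sizes are my own, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(4)
for d in (10, 100, 10_000):
    X = rng.standard_normal((1_000, d))
    norms = np.linalg.norm(X, axis=1)
    # the mean stays near sqrt(d) while the relative spread shrinks like 1/sqrt(d)
    print(d, norms.mean() / np.sqrt(d), norms.std() / np.sqrt(d))
```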

Fitting a model to a collection of observations is one of the quintessential problems in statistics. The typical assumption is that the data was generated by a model of a given type (e.g., a mixture model). This is a simplifying assumption that is only approximately valid, as real datasets are typically exposed to some source of contamination. Hence, any estimator designed for a particular model must also be robust in the presence of corrupted data. Until recently, even for the most basic problem of robustly estimating the mean of a high-dimensional dataset, all known robust estimators were hard to compute. A recent line of work in theoretical computer science obtained the first computationally efficient robust estimators for a range of high-dimensional estimation tasks. In this tutorial talk, we will survey the algorithmic techniques underlying these estimators and the connections between them. We will illustrate these techniques for the problems of robust mean and covariance estimation. Finally, we will discuss new directions and opportunities for future work.
Algorithmic High Dimensional Robust Statistics I
Algorithmic High Dimensional Robust Statistics II
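To give a flavor of the spectral techniques behind these estimators, here is a deliberately crude filtering sketch for robust mean estimation. This is my simplification, not the tutorial's exact algorithm; thresholds and sizes are illustrative:

```python
import numpy as np

def filtered_mean(X, eps, tol=1.5):
    """Crude spectral filter: while the top eigenvalue of the empirical
    covariance is suspiciously large, drop the points most extreme along
    the corresponding eigenvector (where the corruption must hide)."""
    X = X.copy()
    while True:
        mu = X.mean(axis=0)
        vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
        if vals[-1] < tol or len(X) < 10:   # covariance looks near-identity: done
            return mu
        proj = np.abs((X - mu) @ vecs[:, -1])
        X = X[proj < np.quantile(proj, 1 - eps / 2)]   # trim the extreme tail

rng = np.random.default_rng(5)
d, n, eps = 20, 5_000, 0.1
inliers = rng.standard_normal((int(n * (1 - eps)), d))
outliers = rng.standard_normal((int(n * eps), d)) + 5.0   # adversarially shifted points
X = np.vstack([inliers, outliers])
print(np.linalg.norm(X.mean(axis=0)))         # the naive mean is pulled far from 0
print(np.linalg.norm(filtered_mean(X, eps)))  # the filtered mean stays near 0
```

The actual algorithms use more careful score functions and come with guarantees, but the spectral signature of corruption (an abnormally large top eigenvalue) is the same.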


The nearest neighbor search (NNS) problem is defined as follows: Given a set P of n points in some metric space (X, D), build a data structure that, given any point q, returns a point in P that is (approximately) closest to q. In this tutorial, I will survey classic and more recent NNS data structures that are designed for the "high-dimensional" regime. In the first half, I will talk about the current state of affairs for NNS over the l_1 and l_2 distances (in particular, Locality-Sensitive Hashing (LSH) and its data-dependent counterpart), while in the second half, I will focus on NNS for non-Euclidean geometries (including some of the very recent developments, such as spectral partitioning for metric spaces).
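As a concrete (and very simplified) example of LSH, here is a random-hyperplane hash for cosine similarity; sizes and names are illustrative, and real high-dimensional NNS uses many tables and the more refined, data-dependent constructions surveyed in the talk:

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(6)
d, n, k = 50, 10_000, 16             # k hyperplanes -> k-bit hash keys
P = rng.standard_normal((n, d))      # the point set
H = rng.standard_normal((k, d))      # random hyperplane normals

def lsh_key(x):
    """k-bit signature: which side of each random hyperplane x falls on."""
    return tuple((H @ x > 0).astype(int))

table = defaultdict(list)
for i, p in enumerate(P):
    table[lsh_key(p)].append(i)

q = P[123] + 0.05 * rng.standard_normal(d)   # a query near a known point
candidates = table[lsh_key(q)]               # scan only the colliding bucket
best = min(candidates, key=lambda i: np.linalg.norm(P[i] - q), default=None)
print(best, len(candidates))   # typically finds 123 after checking very few points
```

In practice one builds several independent tables and unions the candidate buckets, trading memory for recall.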

As the sizes of modern datasets grow, many classical algorithms become prohibitively expensive, as the input is often too large to be stored in the memory of a single compute node. Streaming algorithms are designed to handle massive inputs using limited space: they process a stream of data items 'on the fly' while maintaining a small memory footprint at all times. In this talk I will discuss classical techniques for solving basic statistical estimation problems (e.g., moment estimation, distinct elements) on data streams, as well as recent results on graph analysis in the streaming model.
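As an example of the small-memory-footprint idea, here is a k-minimum-values sketch for estimating the number of distinct elements in a stream (an illustrative toy, with Python's built-in hash standing in for a proper hash family):

```python
import heapq, random

def kmv_distinct(stream, k=256, seed=0):
    """Keep the k smallest hash values seen; if the k-th smallest is v,
    estimate the number of distinct items as (k - 1) / v."""
    salt = random.Random(seed).getrandbits(64)
    heap, kept = [], set()     # max-heap (negated) of the k smallest hashes
    for item in stream:
        h = (hash((salt, item)) % (2**61)) / 2**61   # pseudo-uniform in [0, 1)
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h); kept.add(h)
        elif h < -heap[0]:
            kept.discard(-heapq.heappushpop(heap, -h)); kept.add(h)
    if len(heap) < k:
        return len(heap)       # fewer than k distinct items: the count is exact
    return (k - 1) / -heap[0]

stream = [random.randrange(50_000) for _ in range(1_000_000)]
print(kmv_distinct(stream), len(set(stream)))   # estimate vs. truth
```

Whatever the stream length, the sketch stores only k values, which is the whole point of the model.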



Monday, April 15, 2019

CfP: 2019 Conference on Mathematical Theory of Deep Neural Networks (DeepMath 2019)


Like last year, Adam sent me the following a few days ago:
Dear Igor,

I hope all is well with you. I was just contacting you to let you know about the announcement for the 2019 Conference on Mathematical Theory of Deep Neural Networks (DeepMath 2019). We're very excited for this event and I think it would be of interest to yourself and to many of your readers/followers. It would be great if you could propagate the announcement below so that we can really bring together researchers working on this important topic!

Warmest regards,

-Adam

-----------------------
Adam S. Charles
Post-doctoral Associate
Princeton Neuroscience Institute
Princeton, NJ, 08544
The videos of the previous installment were featured here. Here is the announcement for this year's:

-------------------------------------
ANNOUNCEMENT AND CALL FOR CONTRIBUTIONS

The 2019 Conference on Mathematical Theory of Deep Neural Networks (DeepMath 2019), Princeton Club, New York City, Oct 31-Nov 1, 2019. Web: https://www.deepmath-conference.com/

======= Important Dates =======
Submission deadline for 1-page abstracts: June 28, 2019
Notification: TBA.
Conference: Oct 31-Nov 1 2019.

======= Call for abstracts =======
In addition to these high-profile invited speakers, we invite 1-page non-archival abstract submissions. Abstracts will be reviewed double-blind and presented as posters.

To complement the wealth of conferences focused on applications, all submissions for DeepMath 2019 must target theoretical and mechanistic understanding of the underlying properties of neural networks.
Insights may come from any discipline and we encourage submissions from researchers working in computer science, engineering, mathematics, neuroscience, physics, psychology, statistics, or related fields.

Topics may address any area of deep learning theory, including architectures, computation, expressivity, generalization, optimization, and representations, and may apply to any or all network types, including fully connected, recurrent, convolutional, randomly connected, or other network topologies.

======= Confirmed speakers =======
Anima Anandkumar (CalTech), Yasaman Bahri (Google), Minmin Chen (Google), Michael Elad (Technion), Surya Ganguli (Stanford), Tomaso Poggio (MIT), David Schwab (CUNY), Shai Shalev-Shwartz (Hebrew University), Haim Sompolinsky (Hebrew University and Harvard), and Naftali Tishby (Hebrew University).

======= Workshop topic =======
Recent advances in deep neural networks (DNNs), combined with open, easily-accessible implementations, have made DNNs a powerful, versatile method used widely in both machine learning and neuroscience. These advances in practical results, however, have far outpaced a formal understanding of these networks and their training. Recently, long-past-due theoretical results have begun to emerge, shedding light on the properties of large, adaptive, distributed learning architectures.

Following the success of the 2018 IAS-Princeton joint symposium on the same topic (https://sites.google.com/site/princetondeepmath/home), the 2019 meeting is more centrally located and broader in scope, but remains focused on rigorous theoretical understanding of deep neural networks. 




Saturday, April 13, 2019

Videos: Variational Methods and Optimization in Imaging, at IHP Paris, February 4th – 8th, 2019



Within the semester on «The Mathematics of Imaging» at IHP (from January 7th till April 5th, 2019), the first set of talks took place at the «Variational Methods and Optimization in Imaging» conference, from February 4th till 8th, 2019 in Paris. Here are the videos of these talks, enjoy!
1. Stable Models and Algorithms for Backward Diffusion Evolutions - Joachim Weickert - Workshop 1 - CEB T1 2019
2. Coupling variational method with CNN for image colorization - Fabien Pierre - Workshop 1 - CEB T1 2019
3. Finite element discretizations of the total variation - Antonin Chambolle - Workshop 1 - CEB T1 2019
4. A geometric integration approach to non-smooth and non-convex optimization - Erlend Schoenlieb Riis - Workshop 1 - CEB T1 2019
5. Characterizing functionals and flows by nonlinear eigenvalue analysis - Guy Gilboa - Workshop 1 - CEB T1 2019
6. Deep Unfolding of a Proximal Interior Point Method for Image Restoration - Emilie Chouzenoux - Workshop 1 - CEB T1 2019
7. Covariant LEAst-square Re-fitting for Image Restoration - Nicolas Papadakis - Workshop 1 - CEB T1 2019
8. Addressing Generative Adversarial Network (GAN) limitations: Resolution, lack of Novelty and control on generations - Camille Couprie - Workshop 1 - CEB T1 2019
9. Learning to Solve Inverse Problems in Imaging - Rebecca Willett - Workshop 1 - CEB T1 2019
10. An atomic norm perspective on total variation regularization in Image Processing - Vincent Duval - Workshop 1 - CEB T1 2019
11. On support localisation, the Fisher metric and optimal sampling in off-the-grid sparse regularization - Clarice Poon - Workshop 1 - CEB T1 2019
12. Exact rate of Nesterov Scheme - Charles Dossal - Workshop 1 - CEB T1 2019
13. A varifold approach to surface approximation and curvature estimation on point clouds - Blanche Buet - Workshop 1 - CEB T1 2019
14. Infimal-convolution-type regularization for inverse problems .. - Kristian Bredies - Workshop 1 - CEB T1 2019
15. From the modelization of direct problems in image processing to the resolution of inverse problems - Caroline Chaux - Workshop 1 - CEB T1 2019
16. Sliced Partial Optimal Transport - Nicolas Bonneel - Workshop 1 - CEB T1 2019
17. Metamorphosis on generalized image manifolds - Martin Rumpf - Workshop 1 - CEB T1 2019
18. Quadratically regularized optimal transport - Dirk Lorenz - Workshop 1 - CEB T1 2019
19. Blind Ptychography: Theory and Algorithm - Albert Fannjiang - Workshop 1 - CEB T1 2019
20. Spectral properties of steplength selections in gradient methods from unconstrained to constrained optimization - Luca Zanni - Workshop 1 - CEB T1 2019
21. Discrete multigrid convergent estimators of curvature - Hugues Talbot - Workshop 1 - CEB T1 2019
22. The Shannon Total Variation - Rémy Abergel - Workshop 1 - CEB T1 2019
23. The evolution of mathematical thinking about digital images and its practical, visible implications in our photographs - Jean Michel Morel - Workshop 1 - CEB T1 2019
24. On computational barriers in mathematics of information and instabilities in deep learning for inverse problems - Anders Hansen - Workshop 1 - CEB T1 2019
25. Presentation of the trimester "The mathematics of imaging" at the IHP
26. Photography Made Easy - Sylvain Paris - Workshop 2 - CEB T1 2019











Friday, April 12, 2019

Data-Driven Design for Fourier Ptychographic Microscopy

One of the things that has changed in the past two years is the community's interest in building reconstruction solvers using Deep Neural Networks, in what we used to call here Data Driven Sensor Design. Here is a new example below. Fascinatingly enough, the demand for theoretical understanding that was once voiced around ptychography seems to have vanished :-)





Fourier Ptychographic Microscopy (FPM) is a computational imaging method that is able to super-resolve features beyond the diffraction-limit set by the objective lens of a traditional microscope. This is accomplished by using synthetic aperture and phase retrieval algorithms to combine many measurements captured by an LED array microscope with programmable source patterns. FPM provides simultaneous large field-of-view and high resolution imaging, but at the cost of reduced temporal resolution, thereby limiting live cell applications. In this work, we learn LED source pattern designs that compress the many required measurements into only a few, with negligible loss in reconstruction quality or resolution. This is accomplished by recasting the super-resolution reconstruction as a Physics-based Neural Network and learning the experimental design to optimize the network's overall performance. Specifically, we learn LED patterns for different applications (e.g. amplitude contrast and quantitative phase imaging) and show that the designs we learn through simulation generalize well in the experimental setting. Further, we discuss a context-specific loss function, practical memory limitations, and interpretability of our learned designs.

h/t Michael's tweet.






Thursday, April 11, 2019

Nuit Blanche is back !

It's been a while since I have written much on Nuit Blanche. This post is to let y'all know that I should be more active in the coming weeks and months. We're back!







Wednesday, April 10, 2019

Tonight, Paris Machine Learning #6 Season 6: Adversarial Attacks, ML in Production, XAI, Photo Editing

The meetup is tonight. Thanks to Meero for hosting us and also for providing catering.


==========================================
IMPORTANT: you'll be asked to show an ID at the entrance
==========================================

Schedule :
6:45 Doors open
7PM - 9PM Talks
9PM - 10PM Cocktail - Networking


As usual, there is NO waiting list or reserved seat
First come first served
(the room has 100 seats)


This meetup will be streamed (details later)




Jean-François Goudou, Meero, AI in photo editing: what is reality?
AI is booming in photo editing.
Editing may seem like a distortion of the reality of the light captured by the camera, and the work of editing like a kind of cheating.
However, cameras capture a reality quite different from the one the human eye catches, and editing is a way to recreate what the eye of the photographer saw.


Battista Biggio, Machine Learning Security
Data-driven AI and machine-learning technologies have become pervasive, and even able to outperform humans on specific tasks. However, it has been shown that they suffer from hallucinations known as adversarial examples, i.e., imperceptible, adversarial perturbations to images, text and audio that fool these systems into perceiving things that are not there.
In this talk, I will quickly describe threats against machine learning, and identify possible countermeasures.
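For readers new to the topic, here is a minimal fast-gradient-sign (FGSM) style attack on a linear classifier, just to make "imperceptible perturbation" concrete. A toy sketch of my own, not material from the talk:

```python
import numpy as np

rng = np.random.default_rng(7)
d = 784                                   # e.g. a flattened 28x28 image
w, b = rng.standard_normal(d), 0.0        # stand-in for a trained linear classifier
x = rng.uniform(0, 1, d)                  # an input, classified with label y
y = 1.0 if w @ x + b > 0 else -1.0

# FGSM: nudge every pixel by eps in the direction that increases the loss;
# for a linear score y * (w @ x + b), that direction is -y * sign(w).
eps = 0.02
x_adv = np.clip(x - eps * y * np.sign(w), 0, 1)

print("clean margin:      ", y * (w @ x + b))
print("adversarial margin:", y * (w @ x_adv + b))   # often flips sign despite |dx| <= 0.02
```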


Christophe Denis, Explainable and convivial AI tools for healthcare
The lack of explainability of machine learning techniques poses operational, legal and ethical problems, in particular for healthcare applications.
Our research project, presented in this talk, consists of providing and evaluating explanations of machine learning methods considered as black boxes.
The application should not become a radical monopoly restricting the users' choice and freedom, in particular to manage individually some therapeutic dilemmas.
Francis Bacon and Ivan Illich are back !


José Sanchez, Axionable, Machine Learning in production, the challenges to create value
We observe that many enterprises don't get past the POC stage of their Machine Learning projects, leading to frustration for both technical and business teams.
In this talk we will briefly present the hidden part of the iceberg of Machine Learning projects (organisation, process and technology), and then show an example of a CI/CD deployment of an ML project in the cloud.




Wednesday, March 13, 2019

Tonight, Paris Machine Learning #5 Season 6: Explainable AI, Unity Challenge, Ethical AI



Tonight, we will be hosted and sponsored by CFM Capital. Thank you to them.

The schedule is as follows:
6:45 Doors open
7PM - 9PM Talks
9PM - 10PM Cocktail - Networking

As usual, there is NO waiting list or reserved seat First come first served (the room has 110 seats)

This meetup will be streamed see below:



The presentations:

Introduction to CFM Capital, Eric Lebigot

Vincent-Pierre Berges, The Obstacle Tower A Generalization Challenge in Vision, Control, and Planning, https://unity3d.com

The rapid pace of research in Deep Reinforcement Learning has been driven by the presence of fast and challenging simulation environments. These environments often take the form of games, with tasks ranging from simple board games, to classic home console games, to modern strategy games. We propose a new benchmark called Obstacle Tower: a high visual fidelity, 3D, 3rd person, procedurally generated game environment. An agent in the Obstacle Tower must learn to solve both low-level control and high-level planning problems in tandem while learning from pixels and a sparse reward signal. Unlike other similar benchmarks such as the ALE, evaluation of agent performance in Obstacle Tower is based on an agent's ability to perform well on unseen instances of the environment.

$100K AI Contest
Obstacle Tower Challenge

=


Machine learning interpretability is becoming an integral part of the data scientist's workflow and can no longer be an afterthought. This talk will explore the vibrant area of machine learning interpretability and explain how to understand black-box models, thanks to an interpretability technique based on coalitional game theory: SHAP.
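For those who want to try it, here is a minimal example with the shap library on a tree ensemble (assuming the shap and scikit-learn packages are installed; the data and model below are purely illustrative):

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(8)
X = rng.standard_normal((500, 4))
y = X[:, 0] + 2.0 * X[:, 1] + 0.1 * rng.standard_normal(500)   # feature 1 matters most

model = RandomForestRegressor(n_estimators=100).fit(X, y)

explainer = shap.TreeExplainer(model)      # exact Shapley values for tree ensembles
shap_values = explainer.shap_values(X)     # one attribution per feature, per sample
print(np.abs(shap_values).mean(axis=0))    # global importance: feature 1 dominates
```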

====


When it comes to actually leveraging AI in production, especially in an environment where it interacts with humans, auditability and trust are not optional. That's why Explainable AI is becoming a new R&D space. This talk will show why and where explainability in AI is needed, what it actually means, and compare some of the techniques that fall into this category.

Impact AI is a think-and-do tank that aims to deal with the ethical and societal challenges of AI. We develop an ethical framework for the responsible use of Artificial Intelligence, respecting principles that are easy to understand and apply at a large scale. This talk is about the Governance part of this toolbox.


Wednesday, February 13, 2019

The 100th Paris Machine Learning meetup tonight: La revanche des neurones, DL on Health Records, Search-Oriented Conversational AI, Nanotechnology and electricity consumption





In order to come to the meetup, you MUST register on Eventbrite to be able to enter, and show an identity card.
Link: eventbrite ticket (https://bit.ly/2I1XXL4)

Thanks to Scaleway for hosting us and providing catering!
The video streaming is below. 




As usual, there is no waiting list or reserved seat: first come first served (the room has a 120-seat capacity)

Schedule :
6:45 PM : door opening
7:00 PM : intro + speakers
9:00 PM : networking - cocktail
10:30 PM : door closing

Speakers


The controversy between symbolic and connectionist AI (La controverse entre IA symbolique et connexioniste)

Recent advances in artificial intelligence come with a high energy cost, which poses a problem both for the environment and for their integration into connected objects.
This energy cost is due to the fact that AI algorithms are implemented on conventional computers, which are poorly adapted to them.
A promising approach is to use the brain as a model for developing new types of computers that use less energy. In this presentation I explore some key ideas, including the proximity of computation and memory as well as the management of errors and randomness.
Fajwel Fogel (Sancare), Deep Learning on health records, www.sancare.fr
Data scientists from Sancare will provide an overview of some of the challenges faced when training deep learning models on electronic health records (EHR), such as robustness and the ability to provide explanations.
Training deep learning models on electronic health records (EHR) can be prohibitively expensive in terms of computational cost.
Datasets typically include millions of records, each containing several thousands of words.
Moreover, due to the sensitive nature of EHR, all computations must be performed on-premise, i.e., on the campus of the hospital, where GPU resources are usually rare or non-existent.

Chatbots and intelligent personal assistants (such as Siri, Cortana, the Google Assistant, and Amazon Alexa) are increasingly being used for different purposes, including information access and retrieval.
These dialog systems enable naturalistic human-like interactions where the information needs are expressed in natural language.
Unlike in traditional search engines, where a user-issued query is answered with a search result page, conversational agents can respond in a variety of ways, for example, asking questions back to the user for clarification.
In this talk, I will present our paper "A Reinforcement Learning-driven Translation Model for Search-Oriented Conversational Systems" where we focus on the understanding of natural language expressions for building keyword-based queries. We proposed a reinforcement learning-driven translation model framework able to
1) learn the translation from NL expressions to queries in a supervised way, and
2) overcome the lack of large-scale datasets by framing the translation model as a word-selection approach and injecting relevance feedback into the learning process. Experiments are carried out on two TREC datasets and outline the effectiveness of our approach.

We use open data and machine learning to compute and forecast where and how the electricity you consume has been produced.




Thursday, December 20, 2018

LightOn: Forward We Go !


It's been a while, as I have been a bit busy. I will be back to a more regular schedule, but in the meantime, we just raised funds with Quantonation and Anorak for LightOn, which should allow us to go forward in building an optical technology for Machine Learning. Here are the announcements:

We just had some coverage on VentureBeat and you can follow our announcement and progress directly on Twitter and LinkedIn.
