March 7th, 2017,
Francis Bach (INRIA)
[video]
Title: Beyond stochastic gradient descent for large-scale machine learning
Abstract: Many machine learning and signal processing problems
are traditionally cast as convex optimization problems. A common
difficulty in solving these problems is the size of the data, where
there are many observations ('large n') and each of these is large
('large p'). In this setting, online algorithms such as stochastic
gradient descent, which pass over the data only once, are usually
preferred over batch algorithms, which require multiple passes over the
data. In this talk, I will show how the smoothness of loss functions may
be used to design novel algorithms with improved behavior, both in
theory and practice: in the ideal infinite-data setting, an efficient
novel Newton-based stochastic approximation algorithm leads to a
convergence rate of O(1/n) without strong convexity assumptions, while
in the practical finite-data setting, an appropriate combination of
batch and online algorithms leads to unexpected behaviors, such as a
linear convergence rate for strongly convex problems, with an iteration
cost similar to stochastic gradient descent. (joint work with Nicolas Le
Roux, Eric Moulines and Mark Schmidt).
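To make the last idea concrete, here is a minimal, illustrative Python sketch of a stochastic average gradient (SAG) style update on a synthetic least-squares problem. It is not the speaker's exact algorithm; the data, step size, and iteration count are assumptions chosen only for illustration. The point is that keeping a per-sample gradient memory yields a batch-like (linear) convergence rate at an iteration cost comparable to plain SGD.

```python
# Illustrative SAG-style sketch (not the exact method from the talk):
# least-squares regression with a per-sample gradient memory, giving
# linear convergence at an SGD-like per-iteration cost.
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 20
X = rng.standard_normal((n, p))
w_true = rng.standard_normal(p)
y = X @ w_true + 0.1 * rng.standard_normal(n)

w = np.zeros(p)
grad_memory = np.zeros((n, p))        # last gradient computed for each sample
grad_sum = np.zeros(p)                # running sum of the stored gradients
L = np.max(np.sum(X ** 2, axis=1))    # bound on the per-sample Lipschitz constant
step = 1.0 / (16 * L)                 # step from the SAG analysis; larger often works

for _ in range(30 * n):
    i = rng.integers(n)
    g_new = (X[i] @ w - y[i]) * X[i]    # gradient of 0.5 * (x_i^T w - y_i)^2
    grad_sum += g_new - grad_memory[i]  # O(p) update of the running sum
    grad_memory[i] = g_new
    w -= step * grad_sum / n            # move along the averaged gradient

print("distance to the generating weights:", np.linalg.norm(w - w_true))
```

The gradient memory of size n by p is the price paid for the faster rate in the finite-data setting; in the infinite-data setting mentioned first, stochastic approximation (e.g. the Newton-based scheme of the abstract) is used instead.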
Jan. 10th, 2017,
Bertrand Thirion (INRIA and Neurospin)
[video]
Title: A big data approach towards functional brain mapping
Abstract: Functional neuroimaging offers a unique view on brain
functional organization, which is broadly characterized by two
features: the segregation of brain territories into functionally
specialized regions, and the integration of these regions into networks
of coherent activity. Functional Magnetic Resonance Imaging yields a
spatially resolved, yet noisy view of this organization. It also yields
useful measurements of brain integrity to compare populations and
characterize brain diseases. To extract information from these data, a
popular strategy is to rely on supervised classification settings,
where signal patterns are used to predict the experimental task
performed by the subject during a given experiment, which is a proxy
for the cognitive or mental state of this subject. In this talk we
will describe how the reliance on large data corpora changes the picture:
it boosts the generalizability of the results and provides meaningful
priors to analyze novel datasets. We will discuss the challenges posed
by these analytic approaches, with an emphasis on computational aspects,
and how non-labelled data can further be used to improve the model
learned from brain activity data.
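As a rough illustration of the supervised-classification ("decoding") setting described above, the sketch below trains a linear classifier to predict the experimental condition from activity patterns. The synthetic data, scikit-learn estimators, and parameters are stand-ins chosen for the example, not the pipeline used in the talk.

```python
# Minimal decoding sketch: predict the experimental task from signal patterns.
# Synthetic data stands in for fMRI activity; only a few "voxels" are informative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 500
task = rng.integers(0, 2, size=n_trials)            # two experimental conditions
signal = rng.standard_normal((n_trials, n_voxels))  # background noise
signal[:, :20] += 0.5 * task[:, None]               # a few informative "voxels"

decoder = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(decoder, signal, task, cv=5)
print("cross-validated decoding accuracy:", scores.mean())
```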
Nov. 8th, 2016,
Cristopher Moore (Santa Fe Institute)
[video]
Title: What physics can tell us about inference?
Abstract: There is a deep analogy between statistical inference
and statistical physics; I will give a friendly introduction to both of
these fields. I will then discuss phase transitions in two problems of
interest to a broad range of data sciences: community detection in
social and biological networks, and clustering of sparse
high-dimensional data. In both cases, if our data becomes too sparse or
too noisy, it suddenly becomes impossible to find the underlying
pattern, or even tell if there is one. Physics helps us both locate
these phase transitions and design optimal algorithms that succeed all
the way up to this point. Along the way, I will visit ideas from
computational complexity, random graphs, random matrices, and spin glass
theory.
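As a concrete instance of such a phase transition (context added here from the stochastic block model literature, not a formula quoted from the abstract): in the sparse symmetric block model with q equal-sized communities, within- and between-group connection parameters c_in and c_out, and average degree c = (c_in + (q-1) c_out)/q, the detectability (Kesten-Stigum) threshold is

\[ \lvert c_{\mathrm{in}} - c_{\mathrm{out}} \rvert > q \sqrt{c}, \]

above which efficient algorithms such as belief propagation find the communities better than chance, while for q = 2 detection below this point is provably impossible for any algorithm.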
Oct. 11th, 2016,
Jean-Philippe Vert (Mines ParisTech, Institut Curie and ENS)
[video]
Title: Can Big Data cure Cancer?
Abstract: As the cost and throughput of genomic technologies
reach a point where DNA sequencing is close to becoming a routine exam
in the clinic, there is a lot of hope that treatments of diseases like
cancer can be dramatically improved by a digital revolution in medicine,
where smart algorithms analyze "big medical data" to help doctors
make the best decisions for each patient or to suggest new directions
for drug development. While artificial intelligence and machine
learning-based algorithms have indeed had a great impact on many
data-rich fields, their application to genomic data raises numerous
computational and mathematical challenges, which I will illustrate with a
few examples of patient stratification or drug response prediction from
genomic data.