Here are the abstracts and slides of the tutorials at ICML 2015.
Download slides part 1 – part 2.
Advances in Structured Prediction
Hal Daumé III (University of Maryland) and John Langford (Microsoft Research).
Structured prediction is the problem of making a joint set of decisions to optimize a joint loss. There are two families of algorithms for such problems: graphical model approaches and learning-to-search approaches. Graphical models include Conditional Random Fields and Structured SVMs and are effective when writing down a graphical model and solving it is easy. Learning-to-search approaches explicitly predict the joint set of decisions incrementally, conditioning on past and future decisions. Such models may be particularly useful when the dependencies between the predictions are complex, the loss is complex, or the construction of an explicit graphical model is impossible.
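To make the incremental view concrete, here is a minimal sketch of the prediction side only: decisions are made one at a time, each conditioned on the decision before it, and the result is scored with a joint loss. The tag set, the hand-set `policy_scores` function and the example sentence are all hypothetical illustrations, not material from the tutorial; in an actual learning-to-search system the policy would be a trained classifier.

```python
def policy_scores(word, prev_tag):
    """Hypothetical hand-set policy: score each tag given the word and the previous decision."""
    scores = {"DET": 0.0, "NOUN": 0.0, "VERB": 0.0}
    if word in ("the", "a"):
        scores["DET"] += 2.0
    if word.endswith("s"):
        # Ambiguous on its own: could be a plural noun or a verb.
        scores["VERB"] += 1.0
        scores["NOUN"] += 0.5
    # Conditioning on the past decision resolves the ambiguity.
    if prev_tag == "DET":
        scores["NOUN"] += 2.0
    if prev_tag == "NOUN":
        scores["VERB"] += 1.0
    return scores


def greedy_decode(words):
    """Predict the joint output incrementally, left to right."""
    tags = []
    for word in words:
        prev_tag = tags[-1] if tags else None
        scores = policy_scores(word, prev_tag)
        tags.append(max(scores, key=scores.get))
    return tags


def hamming_loss(predicted, gold):
    """A simple joint loss: number of positions where the prediction is wrong."""
    return sum(p != g for p, g in zip(predicted, gold))


predicted = greedy_decode(["the", "dogs", "bark"])
print(predicted, hamming_loss(predicted, ["DET", "NOUN", "VERB"]))
```

Note that "dogs" alone scores higher as a verb under this toy policy; it is tagged as a noun only because the previous decision was a determiner.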
We will describe both approaches, with a deeper focus on the latter learning-to-search paradigm, which has less tutorial support. This paradigm has been gaining increasing traction over the past five years, making advances in natural language processing (dependency parsing, semantic parsing), robotics (grasping and path planning), social network analysis and computer vision (object segmentation).
Bayesian Time Series Modeling: Structured Representations for Scalability
Emily Fox (University of Washington).
Time series of increasing complexity are being collected in a variety of fields ranging from neuroscience, genomics, and environmental monitoring to e-commerce based on technologies and infrastructures previously unavailable. These datasets can be viewed either as providing a single, high-dimensional time series or as a massive collection of time series with intricate and possibly evolving relationships between them. For scalability, it is crucial to discover and exploit sparse dependencies between the data streams or dimensions. Such representational structures for independent data sources have been extensively explored in the machine learning community. However, in the conversation on big data, despite the importance and prevalence of time series, the question of how to analyze such data at scale has received limited attention and represents an area of research opportunities.
For these time series of interest, there are two key modeling components: the dynamic and relational models, and their interplay. In this tutorial, we will review some foundational time series models, including the hidden Markov model (HMM) and vector autoregressive (VAR) process. Such dynamical models and their extensions have proven useful in capturing complex dynamics of individual data streams such as human motion, speech, EEG recordings, and genome sequences. However, a focus of this tutorial will be on how to deploy scalable representational structures for capturing sparse dependencies between data streams. In particular, we consider clustering, directed and undirected graphical models, and low-dimensional embeddings in the context of time series. An emphasis is on learning such structure from the data. We will also provide some insights into new computational methods for performing efficient inference in large-scale time series.
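As a small illustration of how a dynamical model and a relational structure interact, the sketch below simulates a first-order VAR process over three streams and reads the directed dependency graph off the nonzero pattern of its coefficient matrix. The matrix `A`, the noise level and the function names are assumptions chosen for illustration; the tutorial is about learning such structure from data, which this sketch does not attempt.

```python
import random

# Hypothetical sparse VAR(1) coefficients: x_t = A x_{t-1} + noise.
# Row i lists how each stream at time t-1 feeds stream i at time t.
A = [
    [0.8, 0.0, 0.0],
    [0.3, 0.5, 0.0],
    [0.0, 0.0, 0.9],
]


def var_step(A, x_prev, noise_std, rng):
    """Advance every stream by one time step."""
    return [
        sum(a * xj for a, xj in zip(row, x_prev)) + rng.gauss(0.0, noise_std)
        for row in A
    ]


def simulate(A, x0, steps, noise_std=0.1, seed=0):
    """Return the list of states x_0, ..., x_steps."""
    rng = random.Random(seed)
    xs = [list(x0)]
    for _ in range(steps):
        xs.append(var_step(A, xs[-1], noise_std, rng))
    return xs


def dependency_edges(A):
    """Edges (j, i): stream j at t-1 directly influences a different stream i at t."""
    return [
        (j, i)
        for i, row in enumerate(A)
        for j, a in enumerate(row)
        if i != j and a != 0.0
    ]


print(dependency_edges(A))
print(simulate(A, [1.0, 0.0, 0.0], steps=3)[-1])
```

With a sparse `A`, each stream depends on only a few others, so the number of parameters grows with the number of edges rather than with the square of the number of streams.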
Throughout the tutorial we will highlight Bayesian and Bayesian nonparametric approaches for learning and inference. Bayesian methods provide an attractive framework for examining complex data streams by naturally incorporating and propagating notions of uncertainty and enabling integration of heterogeneous data sources; the Bayesian nonparametric aspect allows the complexity of the dynamics and relational structure to adapt to the observed data.
Natural Language Understanding: Foundations and State-of-the-Art
Percy Liang (Stanford University).
Building systems that can understand human language (answering questions, following instructions, carrying on dialogues) has been a long-standing challenge since the early days of AI. Due to recent advances in machine learning, there is renewed interest in taking on this formidable task. A major question is how one represents and learns the semantics (meaning) of natural language, to which there are only partial answers. The goal of this tutorial is (i) to describe the linguistic and statistical challenges that any system must address; and (ii) to describe the types of cutting-edge approaches and the remaining open problems. Topics include distributional semantics (e.g., word vectors), frame semantics (e.g., semantic role labeling), model-theoretic semantics (e.g., semantic parsing), the role of context, grounding, neural networks, latent variables, and inference. The hope is that this unified presentation will clarify the landscape, and show that this is an exciting time for the machine learning community to engage in the problems in natural language understanding.
Policy Search: Methods and Applications
Gerhard Neumann (Technische Universität Darmstadt) and Jan Peters (Technische Universität Darmstadt & Max Planck Institute for Intelligent Systems, Tübingen).
Policy search is a subfield in reinforcement learning which focuses on finding good parameters for a given policy parametrization. It is well suited for robotics as it can cope with high-dimensional state and action spaces, one of the main challenges in robot learning. We review recent successes of both model-free and model-based policy search in robot learning.
Model-free policy search is a general approach to learn policies based on sampled trajectories. We classify model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy and present a unified view on existing algorithms. Learning a policy is often easier than learning an accurate forward model, and, hence, model-free methods are more frequently used in practice. However, for each sampled trajectory, it is necessary to interact with the robot, which can be time consuming and challenging in practice. Model-based policy search addresses this problem by first learning a simulator of the robot’s dynamics from data. Subsequently, the simulator generates trajectories that are used for policy learning. For both model-free and model-based policy search methods, we review their respective properties and their applicability to robotic systems.
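The three ingredients named above (exploration, policy evaluation, policy update) can be sketched on a toy problem. The example below is an assumption-laden illustration, not an algorithm taken from the tutorial: it uses a simple cross-entropy-style search over the two gains of a linear controller for a simulated 1-D point mass. The dynamics, the reward and all function names are hypothetical, and on a real robot each call to `rollout_return` would be a physical trial.

```python
import random


def rollout_return(theta, steps=50, dt=0.1):
    """Run one episode on a toy 1-D point mass and return the total reward."""
    k_pos, k_vel = theta
    x, v = 1.0, 0.0  # start away from the target at x = 0
    total = 0.0
    for _ in range(steps):
        u = -k_pos * x - k_vel * v  # linear feedback policy
        v += dt * u
        x += dt * v
        total -= x * x + 0.01 * u * u  # penalize distance and control effort
    return total


def policy_search(iterations=15, samples=20, elites=5, seed=0):
    """Cross-entropy-style episodic search over the policy parameters."""
    rng = random.Random(seed)
    mean, std = [0.0, 0.0], [1.0, 1.0]
    for _ in range(iterations):
        # Exploration: perturb the parameters in parameter space.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(samples)
        ]
        # Policy evaluation: one sampled trajectory per candidate.
        ranked = sorted(candidates, key=rollout_return, reverse=True)
        best = ranked[:elites]
        # Policy update: refit the search distribution to the best candidates.
        for j in range(len(mean)):
            column = [c[j] for c in best]
            mean[j] = sum(column) / elites
            var = sum((c - mean[j]) ** 2 for c in column) / elites
            std[j] = max(var ** 0.5, 0.05)  # floor keeps some exploration
    return mean


theta = policy_search()
print(theta, rollout_return(theta), rollout_return([0.0, 0.0]))
```

With the defaults, this run uses 15 × 20 = 300 rollouts, which is the cost the abstract points to: every one of them would require interaction with the robot.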
Modern Convex Optimization Methods for Large-scale Empirical Risk Minimization
Peter Richtárik (University of Edinburgh) and Mark Schmidt (University of British Columbia).
Download slides part 1 – part 2.
This tutorial reviews recent advances in convex optimization for training (linear) predictors via (regularized) empirical risk minimization. We exclusively focus on practically efficient methods which are also equipped with complexity bounds confirming the suitability of the algorithms for solving huge-dimensional problems (a very large number of examples or a very large number of features).
The first part of the tutorial is dedicated to modern primal methods (belonging to the stochastic gradient descent variety), while the second part focuses on modern dual methods (belonging to the randomized coordinate ascent variety). While we make this distinction, there are very close links between the primal and dual methods, some of which will be highlighted. We shall also comment on mini-batch, parallel and distributed variants of the methods as this is an important consideration for applications involving big data.
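To illustrate the primal/dual distinction, the sketch below solves the same L2-regularized least-squares problem twice: once with plain stochastic gradient descent on the primal weights, and once with stochastic dual coordinate ascent (SDCA), which updates one dual variable per example. These are textbook representatives of the two families, chosen here as an assumption; the tutorial covers more refined variants. The synthetic data, step-size schedule and function names are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def objective(w, X, y, lam):
    """(1/n) sum_i 0.5 (x_i.w - y_i)^2 + (lam/2) ||w||^2."""
    n = len(X)
    loss = sum(0.5 * (dot(x, w) - t) ** 2 for x, t in zip(X, y)) / n
    return loss + 0.5 * lam * dot(w, w)


def gradient(w, X, y, lam):
    n = len(X)
    g = [lam * wj for wj in w]
    for x, t in zip(X, y):
        r = dot(x, w) - t
        for j, xj in enumerate(x):
            g[j] += r * xj / n
    return g


def sgd(X, y, lam, epochs, rng, t0=100):
    """Primal method: step along the gradient of one sampled example."""
    n, w, t = len(X), [0.0] * len(X[0]), 0
    for _ in range(epochs):
        for i in rng.sample(range(n), n):
            step = 1.0 / (lam * (t + t0))  # decaying step size
            r = dot(X[i], w) - y[i]
            w = [wj - step * (r * xj + lam * wj) for wj, xj in zip(w, X[i])]
            t += 1
    return w


def sdca(X, y, lam, epochs, rng):
    """Dual method: exactly maximize the dual over one coordinate alpha_i."""
    n = len(X)
    alpha = [0.0] * n
    w = [0.0] * len(X[0])  # invariant: w = sum_i alpha_i x_i / (lam * n)
    for _ in range(epochs):
        for i in rng.sample(range(n), n):
            xi = X[i]
            delta = (y[i] - dot(xi, w) - alpha[i]) / (1.0 + dot(xi, xi) / (lam * n))
            alpha[i] += delta
            w = [wj + delta * xj / (lam * n) for wj, xj in zip(w, xi)]
    return w


rng = random.Random(0)
w_true = [1.0, -2.0, 0.5]  # hypothetical ground truth for synthetic data
X = [[rng.uniform(-1, 1) for _ in w_true] for _ in range(50)]
y = [dot(x, w_true) + rng.gauss(0, 0.1) for x in X]
lam = 0.1

w_sgd = sgd(X, y, lam, 50, rng)
w_sdca = sdca(X, y, lam, 50, rng)
print(objective(w_sgd, X, y, lam), objective(w_sdca, X, y, lam))
```

The link between the two views is the invariant in `sdca`: the primal weights are a scaled sum of the examples weighted by the dual variables, so every dual coordinate step is also a primal step along one example. Unlike SGD, the closed-form SDCA update for this loss needs no step-size tuning.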