Nuit Blanche: #NIPS2016: Interpretable Machine Learning for Complex Systems Workshop

Friday, December 30, 2016

#NIPS2016: Interpretable Machine Learning for Complex Systems Workshop

By becoming so central in our daily lives, Machine Learning is facing a new kind of problem that previous areas of engineering used to leave to few specialists, namely is there a way to intepret models used in Machine Learning ? Can we understand the biases of these models ? Can we understand when they fail ? Are the rules it implements in line with the way we think these models should behave ? The Interpretable Machine Learning for Complex Systems workshop at NIPS2016 try to give a glimpse of our collective attempts at answering those very pressing issues. Very few engineering professions have had to face this societal issue directly. Thanks to the organizers: Andrew Gordon Wilson , Been Kim, William Herlands for organizing this workshop and making all the papers available. Here is a list of the papers presented mostly as posters (I have covered two before, here is one). I also note the lack of authors from French entities.

arXiv:1611.07478 [pdf, other]: Title: An unexpected unity among methods for interpreting model predictions

Authors: Scott Lundberg, Su-In Lee
Understanding why a model made a certain prediction is crucial in many data science fields. Interpretable predictions engender appropriate trust and provide insight into how the model may be improved. However, with large modern datasets the best accuracy is often achieved by complex models even experts struggle to interpret, which creates a tension between accuracy and interpretability. Recently, several methods have been proposed for interpreting predictions from complex models by estimating the importance of input features. Here, we present how a model-agnostic additive representation of the importance of input features unifies current methods. This representation is optimal, in the sense that it is the only set of additive values that satisfies important properties. We show how we can leverage these properties to create novel visual explanations of model predictions. The thread of unity that this representation weaves through the literature indicates that there are common principles to be learned about the interpretation of model predictions that apply in many scenarios.

arXiv:1611.05469 [pdf, other]: Title: Embedding Projector: Interactive Visualization and Interpretation of Embeddings

Authors: Daniel Smilkov, Nikhil Thorat, Charles Nicholson, Emily Reif, Fernanda B. Viégas, Martin Wattenberg
Embeddings are ubiquitous in machine learning, appearing in recommender systems, NLP, and many other applications. Researchers and developers often need to explore the properties of a specific embedding, and one way to analyze embeddings is to visualize them. We present the Embedding Projector, a tool for interactive visualization and interpretation of embeddings.

arXiv:1611.07567 [pdf, other]: Title: Feature Importance Measure for Non-linear Learning Algorithms

Authors: Marina M.-C. Vidovic, Nico Görnitz, Klaus-Robert Müller, Marius Kloft
Complex problems may require sophisticated, non-linear learning methods such as kernel machines or deep neural networks to achieve state of the art prediction accuracies. However, high prediction accuracies are not the only objective to consider when solving problems using machine learning. Instead, particular scientific applications require some explanation of the learned prediction function. Unfortunately, most methods do not come with out of the box straight forward interpretation. Even linear prediction functions are not straight forward to explain if features exhibit complex correlation structure.
In this paper, we propose the Measure of Feature Importance (MFI). MFI is general and can be applied to any arbitrary learning machine (including kernel machines and deep learning). MFI is intrinsically non-linear and can detect features that by itself are inconspicuous and only impact the prediction function through their interaction with other features. Lastly, MFI can be used for both --- model-based feature importance and instance-based feature importance (i.e, measuring the importance of a feature for a particular data point).

arXiv:1611.05940 [pdf, other]: Title: Finding Alternate Features in Lasso

Authors: Satoshi Hara, Takanori Maehara
We propose a method for finding alternate features missing in the Lasso optimal solution. In ordinary Lasso problem, one global optimum is obtained and the resulting features are interpreted as task-relevant features. However, this can overlook possibly relevant features not selected by the Lasso. With the proposed method, we can provide not only the Lasso optimal solution but also possible alternate features to the Lasso solution. We show that such alternate features can be computed efficiently by avoiding redundant computations. We also demonstrate how the proposed method works in the 20 newsgroup data, which shows that reasonable features are found as alternate features.

arXiv:1611.07051 [pdf, other]: Title: Gaussian Process Structure Learning via Probabilistic Inverse Compilation

Authors: Ulrich Schaechtle, Vikash K. Mansinghka
There is a widespread need for techniques that can learn interpretable models from data. Recent work by Duvenaud et al. (2013) and Lloyd et al. (2014) showed that it is possible to use Gaussian Processes (GPs) to discover symbolic structure in univariate time series. This abstract shows how to reimplement the approach from Duvenaud et al. (2013) using under 100 lines of probabilistic code in Venture (Mansinghka et al., 2014; Lu, 2016), improving on a previous implementation from Schaechtle et al. (2015). The key idea is to formulate structure learning as a kind of probabilistic inverse compilation, where the kernel structure is represented as source code and the resulting GP model is represented as an executable probabilistic program produced by compiling that source code. Figures 1 and 2 give an overview of the inverse compilation framework. Figure 3 shows example kernel structures, including program source, an English summary, and typical data corresponding to the given structure. Figure 4 shows the complete Venture source code for reimplementing the approach from Duvenaud et al. (2013), and Figure 5 shows an application to real-world time series data describing air travel volume.

arXiv:1611.05722 [pdf, other]: Title: GENESIM: genetic extraction of a single, interpretable model

Authors: Gilles Vandewiele, Olivier Janssens, Femke Ongenae, Filip De Turck, Sofie Van Hoecke
Models obtained by decision tree induction techniques excel in being interpretable.However, they can be prone to overfitting, which results in a low predictive performance. Ensemble techniques are able to achieve a higher accuracy. However, this comes at a cost of losing interpretability of the resulting model. This makes ensemble techniques impractical in applications where decision support, instead of decision making, is crucial.
To bridge this gap, we present the GENESIM algorithm that transforms an ensemble of decision trees to a single decision tree with an enhanced predictive performance by using a genetic algorithm. We compared GENESIM to prevalent decision tree induction and ensemble techniques using twelve publicly available data sets. The results show that GENESIM achieves a better predictive performance on most of these data sets than decision tree induction techniques and a predictive performance in the same order of magnitude as the ensemble techniques. Moreover, the resulting model of GENESIM has a very low complexity, making it very interpretable, in contrast to ensemble techniques.

arXiv:1611.07450 [pdf, other]: Title: Grad-CAM: Why did you say that?

Authors: Ramprasaath R Selvaraju, Abhishek Das, Ramakrishna Vedantam, Michael Cogswell, Devi Parikh, Dhruv Batra
We propose a technique for making Convolutional Neural Network (CNN)-based models more transparent by visualizing input regions that are 'important' for predictions -- or visual explanations. Our approach, called Gradient-weighted Class Activation Mapping (Grad-CAM), uses class-specific gradient information to localize important regions. These localizations are combined with existing pixel-space visualizations to create a novel high-resolution and class-discriminative visualization called Guided Grad-CAM. These methods help better understand CNN-based models, including image captioning and visual question answering (VQA) models. We evaluate our visual explanations by measuring their ability to discriminate between classes, to inspire trust in humans, and their correlation with occlusion maps. Grad-CAM provides a new way to understand CNN-based models.
We have released code, an online demo hosted on CloudCV, and a full version of this extended abstract.

arXiv:1611.08292 [pdf, other]: Title: Identifying Significant Predictive Bias in Classifiers

Authors: Zhe Zhang, Daniel B. Neill
We present a novel subset scan method to detect if a probabilistic binary classifier has statistically significant bias -- over or under predicting the risk -- for some subgroup, and identify the characteristics of this subgroup. This form of model checking and goodness-of-fit test provides a way to interpretably detect the presence of classifier bias and poor classifier fit, not just in one or two dimensions of features of a priori interest, but in the space of all possible feature subgroups. We use subset scan and parametric bootstrap methods to efficiently address the difficulty of assessing the exponentially many possible subgroups. We also suggest several useful extensions of this method for increasing interpretability of predictive models and prediction performance.

arXiv:1611.05934 [pdf, other]: Title: Increasing the Interpretability of Recurrent Neural Networks Using Hidden Markov Models

Authors: Viktoriya Krakovna, Finale Doshi-Velez
As deep neural networks continue to revolutionize various application domains, there is increasing interest in making these powerful models more understandable and interpretable, and narrowing down the causes of good and bad predictions. We focus on recurrent neural networks, state of the art models in speech recognition and translation. Our approach to increasing interpretability is by combining a long short-term memory (LSTM) model with a hidden Markov model (HMM), a simpler and more transparent model. We add the HMM state probabilities to the output layer of the LSTM, and then train the HMM and LSTM either sequentially or jointly. The LSTM can make use of the information from the HMM, and fill in the gaps when the HMM is not performing well. A small hybrid model usually performs better than a standalone LSTM of the same size, especially on smaller data sets. We test the algorithms on text data and medical time series data, and find that the LSTM and HMM learn complementary information about the features in the text.

arXiv:1611.07492 [pdf, other]: Title: Inducing Interpretable Representations with Variational Autoencoders

Authors: N. Siddharth, Brooks Paige, Alban Desmaison, Jan-Willem Van de Meent, Frank Wood, Noah D. Goodman, Pushmeet Kohli, Philip H.S. Torr
We develop a framework for incorporating structured graphical models in the \emph{encoders} of variational autoencoders (VAEs) that allows us to induce interpretable representations through approximate variational inference. This allows us to both perform reasoning (e.g. classification) under the structural constraints of a given graphical model, and use deep generative models to deal with messy, high-dimensional domains where it is often difficult to model all the variation. Learning in this framework is carried out end-to-end with a variational objective, applying to both unsupervised and semi-supervised schemes.

arXiv:1611.07252 [pdf, other]: Title: Interpretable Recurrent Neural Networks Using Sequential Sparse Recovery

Authors: Scott Wisdom, Thomas Powers, James Pitton, Les Atlas
Recurrent neural networks (RNNs) are powerful and effective for processing sequential data. However, RNNs are usually considered "black box" models whose internal structure and learned parameters are not interpretable. In this paper, we propose an interpretable RNN based on the sequential iterative soft-thresholding algorithm (SISTA) for solving the sequential sparse recovery problem, which models a sequence of correlated observations with a sequence of sparse latent vectors. The architecture of the resulting SISTA-RNN is implicitly defined by the computational structure of SISTA, which results in a novel stacked RNN architecture. Furthermore, the weights of the SISTA-RNN are perfectly interpretable as the parameters of a principled statistical model, which in this case include a sparsifying dictionary, iterative step size, and regularization parameters. In addition, on a particular sequential compressive sensing task, the SISTA-RNN trains faster and achieves better performance than conventional state-of-the-art black box RNNs, including long-short term memory (LSTM) RNNs.

arXiv:1611.07100 [pdf, other]: Title: Interpreting Finite Automata for Sequential Data

Authors: Christian Albert Hammerschmidt, Sicco Verwer, Qin Lin, Radu State
Automaton models are often seen as interpretable models. Interpretability itself is not well defined: it remains unclear what interpretability means without first explicitly specifying objectives or desired attributes. In this paper, we identify the key properties used to interpret automata and propose a modification of a state-merging approach to learn variants of finite state automata. We apply the approach to problems beyond typical grammar inference tasks. Additionally, we cover several use-cases for prediction, classification, and clustering on sequential data in both supervised and unsupervised scenarios to show how the identified key properties are applicable in a wide range of contexts.

arXiv:1611.07634 [pdf, ps, other]: Title: Interpretation of Prediction Models Using the Input Gradient

Authors: Yotam Hechtlinger
State of the art machine learning algorithms are highly optimized to provide the optimal prediction possible, naturally resulting in complex models. While these models often outperform simpler more interpretable models by order of magnitudes, in terms of understanding the way the model functions, we are often facing a "black box".
In this paper we suggest a simple method to interpret the behavior of any predictive model, both for regression and classification. Given a particular model, the information required to interpret it can be obtained by studying the partial derivatives of the model with respect to the input. We exemplify this insight by interpreting convolutional and multi-layer neural networks in the field of natural language processing.

arXiv:1611.08191 [pdf, other]: Title: Interpreting the Predictions of Complex ML Models by Layer-wise Relevance Propagation

Authors: Wojciech Samek, Grégoire Montavon, Alexander Binder, Sebastian Lapuschkin, Klaus-Robert Müller
Complex nonlinear models such as deep neural network (DNNs) have become an important tool for image classification, speech recognition, natural language processing, and many other fields of application. These models however lack transparency due to their complex nonlinear structure and to the complex data distributions to which they typically apply. As a result, it is difficult to fully characterize what makes these models reach a particular decision for a given input. This lack of transparency can be a drawback, especially in the context of sensitive applications such as medical analysis or security. In this short paper, we summarize a recent technique introduced by Bach et al. [1] that explains predictions by decomposing the classification decision of DNN models in terms of input variables.

arXiv:1611.04887 [pdf, ps, other]: Title: Interpreting the Syntactic and Social Elements of the Tweet Representations via Elementary Property Prediction Tasks

Authors: J Ganesh, Manish Gupta, Vasudeva Varma
Research in social media analysis is experiencing a recent surge with a large number of works applying representation learning models to solve high-level syntactico-semantic tasks such as sentiment analysis, semantic textual similarity computation, hashtag prediction and so on. Although the performance of the representation learning models are better than the traditional baselines for the tasks, little is known about the core properties of a tweet encoded within the representations. Understanding these core properties would empower us in making generalizable conclusions about the quality of representations. Our work presented here constitutes the first step in opening the black-box of vector embedding for social media posts, with emphasis on tweets in particular.
In order to understand the core properties encoded in a tweet representation, we evaluate the representations to estimate the extent to which it can model each of those properties such as tweet length, presence of words, hashtags, mentions, capitalization, and so on. This is done with the help of multiple classifiers which take the representation as input. Essentially, each classifier evaluates one of the syntactic or social properties which are arguably salient for a tweet. This is also the first holistic study on extensively analysing the ability to encode these properties for a wide variety of tweet representation models including the traditional unsupervised methods (BOW, LDA), unsupervised representation learning methods (Siamese CBOW, Tweet2Vec) as well as supervised methods (CNN, BLSTM).

arXiv:1611.07270 [pdf, ps, other]: Title: Investigating the influence of noise and distractors on the interpretation of neural networks

Authors: Pieter-Jan Kindermans, Kristof Schütt, Klaus-Robert Müller, Sven Dähne
Understanding neural networks is becoming increasingly important. Over the last few years different types of visualisation and explanation methods have been proposed. However, none of them explicitly considered the behaviour in the presence of noise and distracting elements. In this work, we will show how noise and distracting dimensions can influence the result of an explanation model. This gives a new theoretical insights to aid selection of the most appropriate explanation model within the deep-Taylor decomposition framework.

arXiv:1611.06843 [pdf, other]: Title: Joint Hacking and Latent Hazard Rate Estimation

Authors: Ziqi Liu, Alexander J. Smola, Kyle Soska, Yu-Xiang Wang, Qinghua Zheng
In this paper we describe an algorithm for predicting the websites at risk in a long range hacking activity, while jointly inferring the provenance and evolution of vulnerabilities on websites over continuous time. Specifically, we use hazard regression with a time-varying additive hazard function parameterized in a generalized linear form. The activation coefficients on each feature are continuous-time functions constrained with total variation penalty inspired by hacking campaigns. We show that the optimal solution is a 0th order spline with a finite number of adaptively chosen knots, and can be solved efficiently. Experiments on real data show that our method significantly outperforms classic methods while providing meaningful interpretability.

arXiv:1611.07663 [pdf, ps, other]: Title: Learning Cost-Effective and Interpretable Regimes for Treatment Recommendation

Authors: Himabindu Lakkaraju, Cynthia Rudin
Decision makers, such as doctors and judges, make crucial decisions such as recommending treatments to patients, and granting bails to defendants on a daily basis. Such decisions typically involve weighting the potential benefits of taking an action against the costs involved. In this work, we aim to automate this task of learning {cost-effective, interpretable and actionable treatment regimes. We formulate this as a problem of learning a decision list -- a sequence of if-then-else rules -- which maps characteristics of subjects (eg., diagnostic test results of patients) to treatments. We propose a novel objective to construct a decision list which maximizes outcomes for the population, and minimizes overall costs. We model the problem of learning such a list as a Markov Decision Process (MDP) and employ a variant of the Upper Confidence Bound for Trees (UCT) strategy which leverages customized checks for pruning the search space effectively. Experimental results on real world observational data capturing treatment recommendations for asthma patients demonstrate the effectiveness of our approach.

arXiv:1611.06175 [pdf, other]: Title: Learning Interpretability for Visualizations using Adapted Cox Models through a User Experiment

Authors: Adrien Bibal, Benoit Frénay
In order to be useful, visualizations need to be interpretable. This paper uses a user-based approach to combine and assess quality measures in order to better model user preferences. Results show that cluster separability measures are outperformed by a neighborhood conservation measure, even though the former are usually considered as intuitively representative of user motives. Moreover, combining measures, as opposed to using a single measure, further improves prediction performances.

arXiv:1610.07647 [pdf, other]: Title: Learning to Reason With Adaptive Computation

Authors: Mark Neumann, Pontus Stenetorp, Sebastian Riedel
Multi-hop inference is necessary for machine learning systems to successfully solve tasks such as Recognising Textual Entailment and Machine Reading. In this work, we demonstrate the effectiveness of adaptive computation for learning the number of inference steps required for examples of different complexity and that learning the correct number of inference steps is difficult. We introduce the first model involving Adaptive Computation Time which provides a small performance benefit on top of a similar model without an adaptive component as well as enabling considerable insight into the reasoning process of the model.

arXiv:1611.07443 [pdf, other]: Title: Mapping chemical performance on molecular structures using locally interpretable explanations

Authors: Leanne S. Whitmore, Anthe George, Corey M. Hudson
In this work, we present an application of Locally Interpretable Machine-Agnostic Explanations to 2-D chemical structures. Using this framework we are able to provide a structural interpretation for an existing black-box model for classifying biologically produced fuel compounds with regard to Research Octane Number. This method of "painting" locally interpretable explanations onto 2-D chemical structures replicates the chemical intuition of synthetic chemists, allowing researchers in the field to directly accept, reject, inform and evaluate decisions underlying inscrutably complex quantitative structure-activity relationship models.

arXiv:1611.06800 [pdf, other]: Title: MDL-motivated compression of GLM ensembles increases interpretability and retains predictive power

Authors: Boris Hayete, Matthew Valko, Alex Greenfield, Raymond Yan
Over the years, ensemble methods have become a staple of machine learning. Similarly, generalized linear models (GLMs) have become very popular for a wide variety of statistical inference tasks. The former have been shown to enhance out- of-sample predictive power and the latter possess easy interpretability. Recently, ensembles of GLMs have been proposed as a possibility. On the downside, this approach loses the interpretability that GLMs possess. We show that minimum description length (MDL)-motivated compression of the inferred ensembles can be used to recover interpretability without much, if any, downside to performance and illustrate on a number of standard classification data sets.

arXiv:1611.06928 [pdf, other]: Title: Memory Lens: How Much Memory Does an Agent Use?

Authors: Christoph Dann, Katja Hofmann, Sebastian Nowozin
We propose a new method to study the internal memory used by reinforcement learning policies. We estimate the amount of relevant past information by estimating mutual information between behavior histories and the current action of an agent. We perform this estimation in the passive setting, that is, we do not intervene but merely observe the natural behavior of the agent. Moreover, we provide a theoretical justification for our approach by showing that it yields an implementation-independent lower bound on the minimal memory capacity of any agent that implement the observed policy. We demonstrate our approach by estimating the use of memory of DQN policies on concatenated Atari frames, demonstrating sharply different use of memory across 49 games. The study of memory as information that flows from the past to the current action opens avenues to understand and improve successful reinforcement learning algorithms.

arXiv:1611.08070 [pdf, other]: Title: Multiscale Inverse Reinforcement Learning using Diffusion Wavelets

Authors: Jung-Su Ha, Han-Lim Choi
This work presents a multiscale framework to solve an inverse reinforcement learning (IRL) problem for continuous-time/state stochastic systems. We take advantage of a diffusion wavelet representation of the associated Markov chain to abstract the state space. This not only allows for effectively handling the large (and geometrically complex) decision space but also provides more interpretable representations of the demonstrated state trajectories and also of the resulting policy of IRL. In the proposed framework, the problem is divided into the global and local IRL, where the global approximation of the optimal value functions are obtained using coarse features and the local details are quantified using fine local features. An illustrative numerical example on robot path control in a complex environment is presented to verify the proposed method.

arXiv:1611.05817 [pdf, other]: Title: Nothing Else Matters: Model-Agnostic Explanations By Identifying Prediction Invariance

Authors: Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

At the core of interpretable machine learning is the question of whether humans are able to make accurate predictions about a model's behavior. Assumed in this question are three properties of the interpretable output: coverage, precision, and effort. Coverage refers to how often humans think they can predict the model's behavior, precision to how accurate humans are in those predictions, and effort is either the up-front effort required in interpreting the model, or the effort required to make predictions about a model's behavior.
In this work, we propose anchor-LIME (aLIME), a model-agnostic technique that produces high-precision rule-based explanations for which the coverage boundaries are very clear. We compare aLIME to linear LIME with simulated experiments, and demonstrate the flexibility of aLIME with qualitative examples from a variety of domains and tasks.

arXiv:1611.07579 [pdf, other]: Title: Programs as Black-Box Explanations

Authors: Sameer Singh, Marco Tulio Ribeiro, Carlos Guestrin
Recent work in model-agnostic explanations of black-box machine learning has demonstrated that interpretability of complex models does not have to come at the cost of accuracy or model flexibility. However, it is not clear what kind of explanations, such as linear models, decision trees, and rule lists, are the appropriate family to consider, and different tasks and models may benefit from different kinds of explanations. Instead of picking a single family of representations, in this work we propose to use "programs" as model-agnostic explanations. We show that small programs can be expressive yet intuitive as explanations, and generalize over a number of existing interpretable families. We propose a prototype program induction method based on simulated annealing that approximates the local behavior of black-box classifiers around a specific prediction using random perturbations. Finally, we present preliminary application on small datasets and show that the generated explanations are intuitive and accurate for a number of classifiers.

arXiv:1611.06996 [pdf, ps, other]: Title: Spatial contrasting for deep unsupervised learning

Authors: Elad Hoffer, Itay Hubara, Nir Ailon
Convolutional networks have marked their place over the last few years as the best performing model for various visual tasks. They are, however, most suited for supervised learning from large amounts of labeled data. Previous attempts have been made to use unlabeled data to improve model performance by applying unsupervised techniques. These attempts require different architectures and training methods. In this work we present a novel approach for unsupervised training of Convolutional networks that is based on contrasting between spatial regions within images. This criterion can be employed within conventional neural networks and trained using standard techniques such as SGD and back-propagation, thus complementing supervised methods.

arXiv:1611.06174 [pdf, other]: Title: Stratified Knowledge Bases as Interpretable Probabilistic Models (Extended Abstract)

Authors: Ondrej Kuzelka, Jesse Davis, Steven Schockaert
In this paper, we advocate the use of stratified logical theories for representing probabilistic models. We argue that such encodings can be more interpretable than those obtained in existing frameworks such as Markov logic networks. Among others, this allows for the use of domain experts to improve learned models by directly removing, adding, or modifying logical formulas.

arXiv:1611.08083 [pdf, other]: Title: Survey of Expressivity in Deep Neural Networks

Authors: Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, Jascha Sohl-Dickstein
We survey results on neural network expressivity described in "On the Expressive Power of Deep Neural Networks". The paper motivates and develops three natural measures of expressiveness, which all display an exponential dependence on the depth of the network. In fact, all of these measures are related to a fourth quantity, trajectory length. This quantity grows exponentially in the depth of the network, and is responsible for the depth sensitivity observed. These results translate to consequences for networks during and after training. They suggest that parameters earlier in a network have greater influence on its expressive power -- in particular, given a layer, its influence on expressivity is determined by the remaining depth of the network after that layer. This is verified with experiments on MNIST and CIFAR-10. We also explore the effect of training on the input-output map, and find that it trades off between the stability and expressivity.

arXiv:1611.07115 [pdf, other]: Title: Tree Space Prototypes: Another Look at Making Tree Ensembles Interpretable

Authors: Hui Fen Tan, Giles Hooker, Martin T. Wells
Ensembles of decision trees have good prediction accuracy but suffer from a lack of interpretability. We propose a new approach for interpreting tree ensembles by finding prototypes in tree space, utilizing the naturally-learned similarity measure from the tree ensemble. Demonstrating the method on random forests, we show that the method benefits from unique aspects of tree ensembles by leveraging tree structure to sequentially find prototypes. The method provides good prediction accuracy when found prototypes are used in nearest-prototype classifiers, while using fewer prototypes than competitor methods. We are investigating the sensitivity of the method to different prototype-finding procedures and demonstrating it on higher-dimensional data.

arXiv:1611.07429 [pdf, other]: Title: TreeView: Peeking into Deep Neural Networks Via Feature-Space Partitioning

Authors: Jayaraman J. Thiagarajan, Bhavya Kailkhura, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy
With the advent of highly predictive but opaque deep learning models, it has become more important than ever to understand and explain the predictions of such models. Existing approaches define interpretability as the inverse of complexity and achieve interpretability at the cost of accuracy. This introduces a risk of producing interpretable but misleading explanations. As humans, we are prone to engage in this kind of behavior \cite{mythos}. In this paper, we take a step in the direction of tackling the problem of interpretability without compromising the model accuracy. We propose to build a Treeview representation of the complex model via hierarchical partitioning of the feature space, which reveals the iterative rejection of unlikely class labels until the correct association is predicted.

Honglak Lee -- University of Michigan || Website

Title: Interpretable Deep Learning with Disentangled Representations

@9:00am

Over the recent years, deep learning has emerged as a powerful method for learning feature representations from complex input data, and it has been greatly successful in computer vision, speech recognition, and language modeling. While many deep learning algorithms focus on a discriminative task and extract only task-relevant features that are invariant to other factors, complex sensory data is often generated from intricate interaction between underlying factors of variations (for example, pose, morphology and viewpoints for 3d object images).

In this work, we tackle the problem of learning deep representations that disentangle underlying factors of variation and allow for complex visual reasoning and inference, as well as better interpretability. We present several successful instances of deep architectures and their learning methods in supervised and weakly-supervised settings. Further, I will talk about visual analogy making with disentangled representations, as well as a connection between disentangling and unsupervised learning. In the second part of the talk, I will describe my work on learning deep representations from multiple heterogeneous input modalities, which provides connections between disentangling, multimodal learning, joint embedding, and conditional generation (e.g., generating images from text descriptions). Finally, I will show how disentangled representations can be useful for predicting future in temporal sequence data (e.g., videos) and some reinforcement learning tasks.

Finale Doshi-Velez -- Harvard || Website

Title: Why Interpretability: A Taxonomy of Interpretability and Implications for Principled Evaluation
@9:30am
With a growing interest in interpretability, there is an increasing need to characterize what exactly we mean by it and how to sensibly compare the interpretability of different approaches. In this talk, I suggest that our current desire for "interpretability" is as vague as asking for "good predictions" -- a desire that. while entirely reasonable, must be formalized into concrete needs such as high average test performance (perhaps held-out likelihood is a good metric) or some kind of robust performance (perhaps sensitivity or specificity are more appropriate metrics). This objective of this talk is to start a conversation to do the same for interpretability: I will describe distinct, concrete objectives that all fall under the umbrella term of interpretability and how each objective suggests natural evaluation procedures. I will also describe highlight important open questions in the evaluation of interpretable models.
Joint work with Been Kim, and the product of discussions with countless collaborators and colleagues.

Rich Caruana -- Google || Website

Title: Intelligible Machine Learning for HealthCare
@11:00am

In machine learning often a tradeoff must be made between accuracy and intelligibility: the most accurate models usually are not very intelligible (e.g., deep neural nets, boosted trees, and random forests), and the most intelligible models usually are less accurate (e.g., linear/logistic regression). This tradeoff often limits the accuracy of models that can be applied in mission-critical applications such as healthcare where being able to understand, validate, edit, and ultimately trust a learned model is important. We have developed a learning method based on generalized additive models (GAMs) that is often as accurate as full complexity models, but remains as intelligible as linear/logistic regression models. In the talk I’ll present two case studies where these high-performance generalized additive models (GA2Ms) are applied to healthcare problems yielding intelligible models with state-of-the-art accuracy. In the pneumonia risk prediction case study, the intelligible model uncovers surprising patterns in the data that previously prevented complex learned models from being deployed, but because it is intelligible and modular allows these patterns to easily be recognized and removed. In the 30-day hospital readmission case study, we show that the same methods scale to large datasets containing hundreds of thousands of patients and thousands of attributes while remaining intelligible and providing accuracy comparable to the best (unintelligible) machine learning methods.

Maya Gupta -- Google || Website

Title: The Power of Monotonicity for Practical Machine Learning

@2:30pm

What prior knowledge do humans have about machine learning problems that we can take advantage of as regularizers? One common intuition is that certain inputs should have a positive (only) effect on the output, for example, the price of a house should only increase as its size goes up, if all else is the same. Incorporating such monotonic priors into our machine learning algorithms can dramatically increase their interpretability and debuggability. We'll discuss state-of-the-art algorithms to learn flexible monotonic functions, and share some stories about why monotonicity is such an important regularizer for practical problems where train and test samples are not IID, especially when learning from clicks.

Jonathan Pillow -- Princeton || Website

Title: Finding interpretable sparse structure in fMRI data with dependent relevance determination priors
@3:30pm
In many problem settings, parameters are not merely sparse, but dependent in such a way that non-zero coefficients tend to cluster together. We refer to this form of dependency as region sparsity". Classical sparse regression methods, such as the lasso and automatic relevance determination (ARD), which models parameters as independent a priori, and therefore do not exploit such dependencies. Here we introduce a hierarchical model for smooth, region-sparse weight vectors and tensors in a linear regression setting. Our approach represents a hierarchical extension of the relevance determination framework, where we add a transformed Gaussian process to model the dependencies between the prior variances of regression weights. We combine this with a structured model of the prior variances of Fourier coefficients, which eliminates unnecessary high frequencies. The resulting prior encourages weights to be region-sparse in two different bases simultaneously. We develop Laplace approximation and Monte Carlo Markov Chain (MCMC) sampling to provide efficient inference for the posterior, and show substantial improvements over existing methods for both simulated and real fMRI datasets.

Saleema Amershi -- Microsoft Research || Website

Title: Better Machine Learning Through Data

@4:30pm
Machine learning is the product of both an algorithm and data. While machine learning research tends to focus on algorithmic advances, taking the data as given, machine learning practice is quite the opposite. Most of the influence practitioners have in using machine learning to build predictive models comes through interacting with data, including crafting the data used for training and examining results on new data to inform future iterations. In this talk, I will present tools and techniques we have been developing in the Machine Teaching Group at Microsoft Research to support the model building process. I will then discuss some of the open challenges and opportunities in improving the practice of machine learning.