## Saturday, November 28, 2015

### Saturday Morning Video: Use of hyperspectral cameras based on CMOS area scan sensors

I picked this up from Vladimir's ever-great blog. Recall the IMEC technology for hyperspectral imaging with CMOS, described in Do Not Mess with CMOS? Well, it is now on sale. Better yet, you can even win a starter kit by submitting an idea here.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Friday, November 27, 2015

### Slides and Videos: Randomized Numerical Linear Algebra reading group, Berkeley, Spring 2015

Laura Grigori co-organized, with Jim Demmel, Ming Gu, and Michael Mahoney, the Randomized Numerical Linear Algebra reading group at Berkeley this past spring. Here are the slides and videos of the presentations that occurred then. From her page:

Archived video of the lectures may be seen here
• Feb 17 - M. Mahoney: Randomized Algorithms for Matrices and Data, slides
• Feb 24 - J. Demmel: notes of the lecture, video of the lecture
• Mar 3 - J. Demmel: second part of the lecture from Feb 24, notes of the lecture, video of the lecture
• Mar 10 - M. Gu: Subspace iteration randomization and singular value problems, slides, video of the lecture, arxiv.org
• Mar 17 - Eric Hallman, based on the paper Sketching as a Tool for Numerical Linear Algebra, Woodruff, video of the lecture
• Mar 24 - spring break
• Mar 31 - Becca Roelofs, video of the lecture, based on the paper Blendenpik: Supercharging LAPACK's least-squares solver, Avron et al.
• Apr 7 - Arturo Fernandez, slides based on the paper Relative-Error CUR Matrix Decompositions, Drineas et al., and Shivaram Venkataraman, High Performance Linear Algebra using Spark, video of the lecture, lecture starts at 3:59
• Apr 14 - Aydin Buluc, based on the paper Low Rank Approximation and Regression in Input Sparsity Time, Clarkson and Woodruff, and Chris Melgaard, video of the lecture
• Apr 21 - Laura Grigori, CA rank revealing QR and LU factorizations, and Pieter Ghysels, Construction of hierarchically semi-separable matrices using randomized sampling and application in a sparse direct solver, video of the lecture, lecture starts at ~2:44
• Apr 28 - Yudong Chen, Fast projected gradient descent algorithms for low-rank estimation, video of the lecture, lecture starts at 4:59

Abstract: Fitting a rank-r matrix to noisy data is in general NP-hard. A popular approach is by convex relaxations via nuclear/trace norm minimization. This approach is shown to provide strong (often order-wise unimprovable) statistical guarantees in terms of error bounds and sample complexity. Computationally, while nuclear norm minimization can be solved in polynomial time in principle by semidefinite programming, its time complexity is often too high for large matrices. In this talk, we consider an alternative approach via projected gradient descent over the space of n-by-r matrices, which scales well to large instances. Moreover, we develop a unified framework characterizing the convergence of projected gradient descent for a broad range of non-convex low-rank estimation problems. Our results apply to the problems of matrix sensing, matrix completion, robust PCA, sparse PCA, densest subgraph detection and others, for which we match the best known statistical guarantees provided by convex relaxation methods.
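
To make the idea in the abstract concrete, here is a minimal sketch of projected gradient descent over n-by-r factors, not the algorithm from the talk. It assumes a fully observed symmetric target `M`, the loss `||U U^T - M||_F^2`, and a projection that caps the Euclidean norm of each row of `U`. The constraint set, step size, iteration count and all function names are illustrative assumptions.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def loss(u, m):
    """Squared Frobenius error ||U U^T - M||_F^2."""
    uut = matmul(u, transpose(u))
    n = len(m)
    return sum((uut[i][j] - m[i][j]) ** 2 for i in range(n) for j in range(n))


def project_rows(u, radius):
    """Project onto {U : every row has norm <= radius} by rescaling long rows.

    This constraint set is an assumption made for illustration only.
    """
    out = []
    for row in u:
        norm = sum(x * x for x in row) ** 0.5
        scale = min(1.0, radius / norm) if norm > 0 else 1.0
        out.append([x * scale for x in row])
    return out


def projected_gradient_descent(m, rank, radius=2.0, step=0.01, iters=500, seed=0):
    """Minimize ||U U^T - M||_F^2 over n-by-rank matrices U (hypothetical settings)."""
    rng = random.Random(seed)
    n = len(m)
    # Small random initialization of the n-by-rank factor.
    u = [[0.1 * rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n)]
    for _ in range(iters):
        uut = matmul(u, transpose(u))
        resid = [[uut[i][j] - m[i][j] for j in range(n)] for i in range(n)]
        # For symmetric M the gradient of the loss is 4 * (U U^T - M) U.
        grad = matmul(resid, u)
        u = [[u[i][k] - step * 4.0 * grad[i][k] for k in range(rank)] for i in range(n)]
        u = project_rows(u, radius)
    return u


# Build a rank-2 target from a random 6-by-2 factor and try to recover it.
data_rng = random.Random(1)
u_true = [[data_rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(6)]
m = matmul(u_true, transpose(u_true))

u_hat = projected_gradient_descent(m, rank=2)
initial_loss = loss([[0.0, 0.0] for _ in range(6)], m)
final_loss = loss(u_hat, m)
print("loss at U = 0:", initial_loss)
print("loss after projected gradient descent:", final_loss)
```

Each iteration only touches the n-by-r factor, which is why this style of method scales better than semidefinite programming over the full matrix.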


### CSJobs: Postdocs / PhDs / internships opportunities, Saclay, France

Jerome let me know of a few postdoc/PhD/internship opportunities in his lab, here they are:

## Post-doc positions


## Wednesday, November 25, 2015

### Gradual DropIn of Layers to Train Very Deep Neural Networks

Leslie just mentioned to me his and his colleague's foray into deep learning. Here it is:

We introduce the concept of dynamically growing a neural network during training. In particular, an untrainable deep network starts as a trainable shallow network and newly added layers are slowly, organically added during training, thereby increasing the network's depth. This is accomplished by a new layer, which we call DropIn. The DropIn layer starts by passing the output from a previous layer (effectively skipping over the newly added layers), then increasingly including units from the new layers for both feedforward and backpropagation. We show that deep networks, which are untrainable with conventional methods, will converge with DropIn layers interspersed in the architecture. In addition, we demonstrate that DropIn provides regularization during training in an analogous way as dropout. Experiments are described with the MNIST dataset and various expanded LeNet architectures, CIFAR-10 dataset with its architecture expanded from 3 to 11 layers, and on the ImageNet dataset with the AlexNet architecture expanded to 13 layers and the VGG 16-layer architecture.
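
A rough sketch of the forward pass the abstract describes, assuming (this is not stated in the abstract) that the skipped input and the new layers' output have the same width and that the inclusion probability follows a linear ramp. The names `dropin` and `include_probability` are hypothetical, and the backpropagation side is not shown.

```python
import random


def dropin(skip_input, new_output, p_include, rng):
    """Per unit, pass the new layers' output with probability p_include,
    otherwise pass the skipped input from the previous layer."""
    if len(skip_input) != len(new_output):
        raise ValueError("this sketch assumes both paths have the same width")
    return [new if rng.random() < p_include else old
            for old, new in zip(skip_input, new_output)]


def include_probability(step, ramp_steps):
    """Hypothetical linear schedule: 0 at the start, 1 once ramp_steps is reached."""
    return min(1.0, step / ramp_steps)


rng = random.Random(0)
previous_layer = [1.0, 2.0, 3.0, 4.0]    # output of the last already-trained layer
new_layers = [10.0, 20.0, 30.0, 40.0]    # output of the newly added layers

for step in (0, 500, 1000):
    p = include_probability(step, ramp_steps=1000)
    print(step, p, dropin(previous_layer, new_layers, p, rng))
```

At `p_include = 0` the new layers are skipped entirely, and at `p_include = 1` the network behaves as the full deeper network. This mirrors the gradual growth described above.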


## Tuesday, November 24, 2015

### Blind and reference-free fluorescence lifetime estimation via consumer time-of-flight sensors

Ayush just sent me the following:

Hi Igor:

I have been an avid follower of your blog since time immemorial. ;)

I am not sure if the content of this email is interesting for your blog but given your recent discussion on computational imaging, I would like to bring to your notice a recent paper of ours that appeared in the latest volume of Optica.

The theme of this paper is to show that Kinect-like Time-of-Flight sensors can be used for purposes that go beyond conventional depth imaging.

The particular application we had in mind was Fluorescence Lifetime Imaging. Conventional apparatus costs about $1000K; however, by re-purposing ToF sensors, we can convert a Kinect into a lifetime estimation device.

The project website is here: http://fli-tof.info

A more user-friendly coverage on MIT News is here: http://news.mit.edu/2015/biomedical-imaging-lower-cost-1123

In case you think that this work is of interest to your audience, may I please request you to keep this email private? Now that we have a functional proof-of-concept, we are discussing more theoretical aspects of the work, which is in the pipeline.

Ayush

Awesomeness (of the paper) and flattery always work, Ayush !

Fluorescence lifetime imaging (FLI) is a popular method for extracting useful information that is otherwise unavailable from a conventional intensity image. Usually, however, it requires expensive equipment, is often limited to either distinctly frequency- or time-domain modalities, and demands calibration measurements and precise knowledge of the illumination signal. Here, we present a generalized time-based, cost-effective method for estimating lifetimes by repurposing a consumer-grade time-of-flight sensor. By developing mathematical theory that unifies time- and frequency-domain approaches, we can interpret a time-based signal as a combination of multiple frequency measurements. We show that we can estimate lifetimes without knowledge of the illumination signal and without any calibration. We experimentally demonstrate this blind, reference-free method using a quantum dot solution and discuss the method’s implementation in FLI applications.

### Scampi: a robust approximate message-passing framework for compressive imaging - implementation -

Scampi: a robust approximate message-passing framework for compressive imaging by Jean Barbier, Eric Tramel, Florent Krzakala

Reconstruction of images from noisy linear measurements is a core problem in image processing, for which convex optimization methods based on total variation (TV) minimization have been the long-standing state-of-the-art. We present an alternative probabilistic reconstruction procedure based on approximate message-passing, Scampi, which operates in the compressive regime, where the inverse imaging problem is underdetermined. While the proposed method is related to the recently proposed GrAMPA algorithm of Borgerding, Schniter, and Rangan, we further develop the probabilistic approach to compressive imaging by introducing an expectation-maximization learning of model parameters, making Scampi robust to model uncertainties. Additionally, our numerical experiments indicate that Scampi can provide reconstruction performance superior to both GrAMPA as well as convex approaches to TV reconstruction. Finally, through exhaustive best-case experiments, we show that in many cases the maximal performance of both Scampi and convex TV can be quite close, even though the approaches are a priori distinct. The theoretical reasons for this correspondence remain an open question. Nevertheless, the proposed algorithm remains more practical, as it requires far less parameter tuning to perform optimally.

The implementation is here: https://github.com/jeanbarbier/scampi

## Monday, November 23, 2015

### Machine Learning Summer School 2016, University of Cadiz, Spain, May 11th to 21st 2016

Joan just sent me the following:

Dear Igor,

I hope you are doing well. I just wanted to let you know that we are organizing the upcoming machine learning summer school next summer. I was wondering if these are the sort of announcements you typically post on nuit blanche... here is the official announcement....

Best regards,
Joan

Dear Colleagues,

We are happy to announce that applications are now open for the upcoming

MACHINE LEARNING SUMMER SCHOOL at the University of Cadiz, Spain, May 11th to 21st 2016
http://learning.mpi-sws.org/mlss2016

#### Overview

The machine learning summer school provides graduate students, researchers and industry professionals with an intense learning experience on theory and applications of modern machine learning. Over the course of ten days, a panel of internationally renowned experts will offer lectures and tutorials covering a diverse range of theoretical and applied topics. This time the MLSS is co-located just after AISTATS 2016, in the medieval town of Cadiz (south of Spain).

Confirmed Speakers and Topics so far:

• Arthur Gretton (UCL Gatsby), Kernel Methods
• Arthur Szlam (Facebook AI Research), Deep Learning
• Francis Bach (ENS), Optimization
• Jonas Peters (MPI-IS), Causality
• Le Song (Georgia Institute of Technology), Machine Learning for Networks
• Matthias Bethge (Center Integrative Neuroscience, U Tubingen), Machine Learning for Neuroscience
• Nando de Freitas (Oxford University, DeepMind), Deep Learning
• Neil Lawrence (University of Sheffield), Gaussian Processes
• Nicolas Le Roux (Criteo), Large Scale Machine Learning
• Pieter Abbeel (UC Berkeley), Deep Reinforcement Learning
• Samory Kpotufe (Princeton), Learning Theory
• Sebastien Bubeck (Microsoft Research), Bandits
• Stefanie Jegelka (MIT), Submodularity
• Stephane Mallat (ENS), Mathematics of Convolutional Networks
• Tamara Broderick (MIT), Nonparametrics and Bayesian Statistics
• Michel Besserve (MPI-IS), Practical on Machine Learning for Neuroscience
• John Schulman (UC Berkeley), Practical on Deep Reinforcement Learning
• Isabel Valera (MPI-SWS), Practical on Machine Learning for Networks
• Durk Kingma (U Amsterdam), Practical on Deep Learning

#### Application process

Applications are invited from graduate students, postdoctoral researchers and industry professionals looking to use, or already using, machine learning methods in their work. This includes researchers in applied fields as well as students of machine learning itself. Prior experience is not strictly required, but helpful. A small number of travel stipends will be available.

Applicants will be asked to submit a CV, a cover letter of up to 2000 characters, and a short letter of recommendation from one referee of their choice. We are also seeking to give participants a chance to discuss their own work with their peers and the speakers.
Each applicant is thus invited to provide the title of a poster they would like to present at the school. For more information visit http://learning.mpi-sws.org/mlss2016/application/

#### Important Dates

• Monday November 23 2015: application system opens
• Sunday Jan 31 2016: DEADLINE FOR APPLICATIONS
• Sunday Feb 28 2016: notification of acceptance

#### Organizers

Manuel Gomez Rodriguez (MPI-SWS)
Joan Bruna (UC Berkeley)

Inquiries should be directed to mlss2016@mpi-sws.org

### PCMI Summer School, "The Mathematics of Data", June 30 – July 20, 2016, Utah

Michael sent me the following last week:

Hi Igor,

Hope all is well with you.

I wanted to let you know about a summer school, the "Mathematics of Data", that Anna Gilbert, John Duchi, and I are running at the Park City Mathematics Institute next summer, June 30 to July 20.

Attached is a pdf of the program announcement, and here is a link with more information:

I imagine that this would be of interest to many of your followers, so would it be possible for you to advertise this program?

Thanks

Thanks Michael !
As far as I can tell, out of the few programs, two are organized by Michael, Anna and John. They are:

Here is the program for the Graduate Summer School:

The Mathematics of Data

The Graduate Summer School bridges the gap between a general graduate education in mathematics and the specific preparation necessary to do research on problems of current interest. In general, these students will have completed their first year, and in some cases, may already be working on a thesis. While a majority of the participants will be graduate students, some postdoctoral scholars and researchers may also be interested in attending.

The main activity of the Graduate Summer School will be a set of intensive short lectures offered by leaders in the field, designed to introduce students to exciting, current research in mathematics. These lectures will not duplicate standard courses available elsewhere. Each course will consist of lectures with problem sessions. Course assistants will be available for each lecture series. The participants of the Graduate Summer School meet three times each day for lectures, with one or two problem sessions scheduled each day as well.

In order to derive insight from data, one needs to perform computations, and one needs to perform statistical inference. Both of these tasks raise important and fundamental mathematical questions, especially when one considers realistic sparsity and noise properties, realistic size scales, realistic temporal properties, etc. These questions are often considered outside traditional mathematics departments, and they present challenges to the theoretical foundations of related methodological areas such as computer science and statistics. This requires revisiting traditional and novel areas of applied mathematics to determine which subset of those areas can be used to help establish the theoretical foundations of modern large-scale data analysis.
Topics will include Randomized Linear Algebra, Topological Data Analysis, Theoretical Computer Science, Theoretical Statistics, Functional Analysis, Scientific computing, and Optimization. The goal of the program is to present these ideas in perspective, covering the necessary background and leading up to the recent progress and open problems.

Student preparation: We seek motivated students interested in the mathematical, e.g., algorithmic and statistical, aspects of modern large-scale data analysis, including theoretically-inclined students from computer science, statistics, applied mathematics, and related areas. Though familiarity with some of the topics listed above would be helpful, the formal prerequisites are limited to the content of standard introductory courses in linear algebra, probability, and optimization.

The 26th Annual PCMI Summer Session will be held June 30 – July 20, 2016. Click HERE to apply to the Graduate Summer School program.

#### 2016 Organizers

John Duchi, Stanford University; Anna Gilbert, University of Michigan; and Michael Mahoney, University of California, Berkeley

#### 2016 Graduate Summer School Lecturers

• Petros Drineas, Rensselaer Polytechnic Institute: RandNLA: Randomization in Numerical Linear Algebra

The introduction of randomization in the design and analysis of algorithms for matrix computations (such as matrix multiplication, least-squares regression, the Singular Value Decomposition (SVD), etc.) over the past 15 years provided a new paradigm and a complementary perspective to traditional numerical linear algebra approaches. These novel approaches were motivated by technological developments in many areas of scientific research that permit the automatic generation of large data sets, which are often modeled as matrices.
We will outline how such approaches can be used to approximately solve problems ranging from matrix multiplication and the Singular Value Decomposition (SVD) of matrices to the Column Subset Selection Problem and the CX decomposition. Application of the proposed algorithms to data analysis tasks (with a particular focus on population genetics) will also be discussed.

• Cynthia Dwork, Microsoft Research: Course description coming soon!

• Robert Ghrist, University of Pennsylvania: Topological Data Analysis

This course will cover the background, techniques, and applications of Topological Data Analysis. Beginning with an introduction to the classical tools of algebraic topology, we will progress through applications to point clouds, persistence, networks, and more, with far-ranging applications. No background in topology will be assumed.

• Piotr Indyk, Massachusetts Institute of Technology: Recent Developments in the Sparse Fourier Transform

The discrete Fourier transform (DFT) is a fundamental component of numerous computational techniques in signal processing and scientific computing. The most popular means of computing the DFT is the fast Fourier transform (FFT). However, with the emergence of big data, the “fast” in FFT is often no longer fast enough. In addition, in many applications it is hard to acquire a sufficient amount of data to compute the desired Fourier transform in the first place. The Sparse Fourier Transform (SFT) is based on the insight that many real-world signals are sparse, i.e., most of the frequencies have negligible contribution to the overall signal. SFT exploits this insight by computing a compressed Fourier transform in time proportional to the data sparsity, not the data size. Furthermore, it uses only a subset of the signal. The goal of this talk is to survey recent developments in this area and explain the basic techniques with examples and applications. Further resources are available at: http://groups.csail.mit.edu/netmit/sFFT/.
• Mauro Maggioni, Duke University: Course description coming soon!

• Gunnar Martinsson, University of Colorado: Course description coming soon!

• Kunal Talwar, Microsoft Research: Course description coming soon!

• Roman Vershynin, University of Michigan: Course description coming soon!

• Stephen J. Wright, University of Wisconsin: Course description coming soon!

## Friday, November 20, 2015

### Random Features Roundup

Here is this week's roundup of preprints using Random Features (or avoiding them, in the Related section):

Deep Mean Maps by Junier B. Oliva, Dougal J. Sutherland, Barnabás Póczos, Jeff Schneider

The use of distributions and high-level features from deep architecture has become commonplace in modern computer vision. Both of these methodologies have separately achieved a great deal of success in many computer vision tasks. However, there has been little work attempting to leverage the power of these two methodologies jointly. To this end, this paper presents the Deep Mean Maps (DMMs) framework, a novel family of methods to non-parametrically represent distributions of features in convolutional neural network models. DMMs are able to both classify images using the distribution of top-level features, and to tune the top-level features for performing this task. We show how to implement DMMs using a special mean map layer composed of typical CNN operations, making both forward and backward propagation simple. We illustrate the efficacy of DMMs at analyzing distributional patterns in image data in a synthetic data experiment.
We also show that extending existing deep architectures with DMMs improves the performance of existing CNNs on several challenging real-world datasets.

Large-Scale Approximate Kernel Canonical Correlation Analysis by Weiran Wang, Karen Livescu

Kernel Canonical correlation analysis (KCCA) is a fundamental method with broad applicability in statistics and machine learning. Although there exists a closed-form solution to the KCCA objective by solving an $N\times N$ eigenvalue system where $N$ is the training set size, the computational requirements of this approach in both memory and time prohibit its usage in the large scale. Various approximation techniques have been developed for KCCA. A recently proposed approach is to first transform original inputs to an $M$-dimensional feature space using random kitchen sinks so that inner product in the feature space approximates the kernel function, and then apply linear CCA to the transformed inputs. In challenging applications, however, the dimensionality $M$ of the feature space may need to be very large in order to reveal the nonlinear correlations, and then it becomes non-trivial to solve linear CCA for data matrices of very high dimensionality. We propose to use the recently proposed stochastic optimization algorithm for linear CCA and its neural-network extension to further alleviate the computation requirements of approximate KCCA. This approach allows us to run approximate KCCA on a speech dataset with $1.4$ million training samples and random feature space of dimensionality $M=100000$ on a normal workstation.

Scalable Gaussian Processes for Characterizing Multidimensional Change Surfaces by William Herlands, Andrew Wilson, Hannes Nickisch, Seth Flaxman, Daniel Neill, Wilbert van Panhuis, Eric Xing

We present a scalable Gaussian process model for identifying and characterizing smooth multidimensional changepoints, and automatically learning changes in expressive covariance structure.
We use Random Kitchen Sink features to flexibly define a change surface in combination with expressive spectral mixture kernels to capture the complex statistical structure. Finally, through the use of novel methods for additive non-separable kernels, we can scale the model to large datasets. We demonstrate the model on numerical and real world data, including a large spatio-temporal disease dataset where we identify previously unknown heterogeneous changes in space and time.

Data Stream Classification using Random Feature Functions and Novel Method Combinations by Diego Marrón, Jesse Read, Albert Bifet, Nacho Navarro

Big Data streams are becoming faster, bigger, and more commonplace. In this scenario, Hoeffding Trees are an established method for classification. Several extensions exist, including high-performing ensemble setups such as online and leveraging bagging. Also, $k$-nearest neighbors is a popular choice, with most extensions dealing with the inherent performance limitations over a potentially-infinite stream. At the same time, gradient descent methods are becoming increasingly popular, owing in part to the successes of deep learning. Although deep neural networks can learn incrementally, they have so far proved too sensitive to hyper-parameter options and initial conditions to be considered an effective 'off-the-shelf' data-streams solution. In this work, we look at combinations of Hoeffding-trees, nearest neighbour, and gradient descent methods with a streaming preprocessing approach in the form of a random feature functions filter for additional predictive power. We further extend the investigation to implementing methods on GPUs, which we test on some large real-world datasets, and show the benefits of using GPUs for data-stream learning due to their high scalability.
Our empirical evaluation yields positive results for the novel approaches that we experiment with, highlighting important issues and shedding light on promising future directions in approaches to data-stream classification.

The Variational Fair Autoencoder by Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel

We investigate the problem of learning representations that are invariant to certain nuisance or sensitive factors of variation in the data while retaining as much of the remaining information as possible. Our model is based on a variational autoencoding architecture with priors that encourage independence between sensitive and latent factors of variation. Any subsequent processing, such as classification, can then be performed on this purged latent representation. To remove any remaining dependencies we incorporate an additional penalty term based on the "Maximum Mean Discrepancy" (MMD) measure. We discuss how these architectures can be efficiently trained on data and show in experiments that this method is more effective than previous work in removing unwanted sources of variation while maintaining informative latent representations.

Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages by Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess, S. M. Ali Eslami, Balaji Lakshminarayanan, Dino Sejdinovic, Zoltán Szabó

We propose an efficient nonparametric strategy for learning a message operator in expectation propagation (EP), which takes as input the set of incoming messages to a factor node, and produces an outgoing message as output. This learned operator replaces the multivariate integral required in classical EP, which may not have an analytic expression. We use kernel-based regression, which is trained on a set of probability distributions representing the incoming messages, and the associated outgoing messages.
The kernel approach has two main advantages: first, it is fast, as it is implemented using a novel two-layer random feature representation of the input message distributions; second, it has principled uncertainty estimates, and can be cheaply updated online, meaning it can request and incorporate new training data when it encounters inputs on which it is uncertain. In experiments, our approach is able to solve learning problems where a single message operator is required for multiple, substantially different data sets (logistic regression for a variety of classification problems), where the ability to accurately assess uncertainty and to efficiently and robustly update the message operator are essential.

Related:

Thoughts on Massively Scalable Gaussian Processes by Andrew Gordon Wilson, Christoph Dann, Hannes Nickisch

We introduce a framework and early results for massively scalable Gaussian processes (MSGP), significantly extending the KISS-GP approach of Wilson and Nickisch (2015). The MSGP framework enables the use of Gaussian processes (GPs) on billions of datapoints, without requiring distributed inference, or severe assumptions. In particular, MSGP reduces the standard $O(n^3)$ complexity of GP learning and inference to $O(n)$, and the standard $O(n^2)$ complexity per test point prediction to $O(1)$. MSGP involves 1) decomposing covariance matrices as Kronecker products of Toeplitz matrices approximated by circulant matrices.
This multi-level circulant approximation allows one to unify the orthogonal computational benefits of fast Kronecker and Toeplitz approaches, and is significantly faster than either approach in isolation; 2) local kernel interpolation and inducing points to allow for arbitrarily located data inputs, and $O(1)$ test time predictions; 3) exploiting block-Toeplitz Toeplitz-block structure (BTTB), which enables fast inference and learning when multidimensional Kronecker structure is not present; and 4) projections of the input space to flexibly model correlated inputs and high dimensional data. The ability to handle many ($m \approx n$) inducing points allows for near-exact accuracy and large scale kernel learning.

Efficient Multiscale Gaussian Process Regression using Hierarchical Clustering by Z. Zhang, K. Duraisamy, N. A. Gumerov

Standard Gaussian Process (GP) regression, a powerful machine learning tool, is computationally expensive when it is applied to large datasets, and potentially inaccurate when data points are sparsely distributed in a high-dimensional feature space. To address these challenges, a new multiscale, sparsified GP algorithm is formulated, with the goal of application to large scientific computing datasets. In this approach, the data is partitioned into clusters and the cluster centers are used to define a reduced training set, resulting in an improvement over standard GPs in terms of training and evaluation costs. Further, a hierarchical technique is used to adaptively map the local covariance representation to the underlying sparsity of the feature space, leading to improved prediction accuracy when the data distribution is highly non-uniform. A theoretical investigation of the computational complexity of the algorithm is presented. The efficacy of this method is then demonstrated on simple analytical functions and on data from a direct numerical simulation of turbulent combustion.

A Hierarchical Spectral Method for Extreme Classification

Paul Mineiro, Nikos Karampatziakis

Extreme classification problems are multiclass and multilabel classification problems where the number of outputs is so large that straightforward strategies are neither statistically nor computationally viable. One strategy for dealing with the computational burden is via a tree decomposition of the output space. While this typically leads to training and inference that scales sublinearly with the number of outputs, it also results in reduced statistical performance. In this work, we identify two shortcomings of tree decomposition methods, and describe two heuristic mitigations. We compose these with a novel eigenvalue technique for constructing the tree which is essentially hierarchical orthonormal partial least squares. The end result is a computationally efficient algorithm that provides good statistical performance on several extreme data sets.

Image Credit: NASA/JPL-Caltech
This image was taken by Navcam: Left B (NAV_LEFT_B) onboard NASA's Mars rover Curiosity on Sol 1168 (2015-11-19 08:40:38 UTC).
## Wednesday, November 18, 2015

### Paris Machine Learning Meetup #3 Season 3: Zero Resource Speech Challenge, Epidemium, Smartsubs, TensorFlow, ML Use

Here is the program; more information will be added as we get closer to tonight's meetup at Criteo (a big thank you to them for hosting and for the networking event afterwards):

+ Nicolas Le Roux, Criteo presentation
+ Emmanuel Dupoux, ENS Ulm, A report on the Zero Resource Speech Challenge (INTERSPEECH 2015)

Speech recognition technologies rely on supervised learning techniques which typically require thousands of hours of speech annotated by experts in order to train acoustic and language models. Here, I'll present the results of the Zero Resource Speech Challenge (INTERSPEECH 2015) which explores a radically alternative approach: infant-inspired unsupervised discovery of linguistic structure based on raw speech.

Epidemium is a participatory scientific research program, open to all, dedicated to understanding cancer through Open Big Data. Initiated by the Roche laboratory and the community laboratory La Paillasse, the program will take the form of a data challenge: the Challenge4Cancer.

+ Rebiha Rahba, "Identifying 'Trending Topics', or how to use Machine Learning to identify the subjects making the news"

Presentation of a Machine Learning study, from the problem statement through to the creation of a ready-to-use solution. The solution was built entirely on Azure Machine Learning. The talk will show how this product makes this type of study possible.

+ Imad Soltani, "Tensorflow, some trials and the open source deep learning framework ecosystem"
+ Félix Revert, "Machine Learning in foreign language education?"

That is the whole idea behind SmartSubs: smart subtitles. SmartSubs assesses a person's level of English and then identifies all the vocabulary that this person does not know.
This vocabulary appears in text that can be:
• movie or TV series subtitles
• newspaper articles
• ebooks

How do we detect the vocabulary that is difficult for each person? Using machine learning: we test on a small sample of people and extend our results to all our users.

First project: smart subtitles http://youtu.be/kmEG7wdL9RY
Second project: smart reading on the internet

## Tuesday, November 17, 2015

### Random sampling of bandlimited signals on graphs

Random sampling of bandlimited signals on graphs by Gilles Puy, Nicolas Tremblay, Rémi Gribonval, Pierre Vandergheynst

We study the problem of sampling k-bandlimited signals on graphs. We propose two sampling strategies that consist in selecting a small subset of nodes at random. The first strategy is non-adaptive, i.e., independent of the graph structure, and its performance depends on a parameter called the graph coherence. On the contrary, the second strategy is adaptive but yields optimal results. Indeed, no more than O(k log(k)) measurements are sufficient to ensure an accurate and stable recovery of all k-bandlimited signals. This second strategy is based on a careful choice of the sampling distribution, which can be estimated quickly. Then, we propose a computationally efficient decoder to reconstruct k-bandlimited signals from their samples. We prove that it yields accurate reconstructions and that it is also stable to noise. Finally, we conduct several experiments to test these techniques.
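To make the idea of sampling a k-bandlimited graph signal concrete, here is a small sketch under loudly stated assumptions. It uses a toy path graph, computes the first k Laplacian eigenvectors exactly, samples nodes with probability proportional to the squared row norms of that eigenvector matrix (an assumed choice of distribution; the abstract does not spell it out and says only that the distribution can be estimated quickly), and reconstructs with a plain reweighted least-squares fit over the bandlimited subspace. This is a simplification for illustration, not the decoder proposed in the paper, and the sizes `n`, `k`, `m` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 100, 5, 40  # nodes, bandwidth, number of samples (illustrative)

# Combinatorial Laplacian of a path graph (hypothetical toy graph).
A = np.zeros((n, n))
i = np.arange(n - 1)
A[i, i + 1] = 1.0
A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# eigh returns eigenvalues in ascending order, so the first k columns
# span the k-bandlimited signals.
_, U = np.linalg.eigh(L)
U_k = U[:, :k]

# A random k-bandlimited signal.
x = U_k @ rng.standard_normal(k)

# Assumed sampling distribution: squared row norms of U_k, normalized.
p = np.sum(U_k ** 2, axis=1)
p = p / p.sum()
nodes = rng.choice(n, size=m, replace=True, p=p)
y = x[nodes]  # noiseless samples

# Reweighted least squares restricted to span(U_k).
w = 1.0 / np.sqrt(p[nodes])
coeffs, *_ = np.linalg.lstsq(w[:, None] * U_k[nodes], w * y, rcond=None)
x_rec = U_k @ coeffs

rel_err = np.linalg.norm(x_rec - x) / np.linalg.norm(x)
```

With noiseless samples and enough distinct nodes, the fit recovers the signal up to numerical precision; the reweighting by `1 / sqrt(p)` only starts to matter once noise is added.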
### Sparse Learning for Large-scale and High-dimensional Data: A Randomized Convex-concave Optimization Approach

Sparse Learning for Large-scale and High-dimensional Data: A Randomized Convex-concave Optimization Approach by Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou

In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer. Under the assumption that there exists a (approximately) sparse solution with high classification accuracy, we argue that the dual solution is also sparse or approximately sparse. The fact that both primal and dual solutions are sparse motivates us to develop a randomized approach for a general convex-concave optimization problem. Specifically, the proposed approach combines the strength of random projection with that of sparse learning: it utilizes random projection to reduce the dimensionality, and introduces $\ell_1$-norm regularization to alleviate the approximation error caused by random projection. Theoretical analysis shows that under favored conditions, the randomized algorithm can accurately recover the optimal solutions to the convex-concave optimization problem (i.e., recover both the primal and dual solutions). Furthermore, the solutions returned by our algorithm are guaranteed to be approximately sparse.

Image Credit: NASA/JPL-Caltech/Space Science Institute
W00094976.jpg was taken on November 15, 2015 and received on Earth November 16, 2015. The camera was pointing toward UNK, and the image was taken using the CB2 and CL2 filters.


## Sunday, November 15, 2015

### Sunday Morning Insight: The Hard Questions

I wanted to talk about something else today, but the events that occurred on Friday have made me ponder the following questions:
• Are there simple means of interdicting the use of high powered weapons in crowded areas or cities ?
• Are there cheap and rapid means of detecting high explosives from a distance ?
• Are there cheap and rapid means of interdicting the use of high explosives ?
• Are there better technical means of enhancing monitoring capabilities in group hostage situations ?
• Are there better ways of enabling citizen responses to emergencies without giving information ?
• Are there efficient and privacy conscious ways of monitoring the activities of a few thousand people deemed potential threats without overburdening security forces ?
All answers should have little impact on societies (financial, privacy). Some of these questions may look like they are impossible to answer affirmatively but we need to ask them nonetheless and we need to have places where these questions can be asked in a naive fashion. This type of exercise is still undertaken by scientists in the shape of the JASONs in the US and maybe we should have something similar here.
Current organizations and structures that should deal with these issues seem out of touch with science-based approaches. That needs to change.


## Saturday, November 14, 2015

### Saturday Morning Videos: Videos and slides for Speech and Audio in the Northeast (SANE) 2015

Whatever your source of information today, it is likely to be a never-ending source of sadness and distress. We will not forget. It is important, however, that we have some alternatives to the information-poor media circus unraveling all around us. Here is one alternative below. Other alternatives include all the other Saturday Morning Videos entries.

Before the events, Jonathan Le Roux sent me the following:

Hi Igor,

I thought Nuit Blanche readers might be interested to know that they can now binge-watch the videos and slides for the Speech and Audio in the Northeast (SANE) 2015 workshop that Hank Liao and I organized at Google NY last month.

The talks were focused on the latest advances in deep learning methods for speech and audio, with applications from automatic speech recognition to audio source separation and audio event detection:
• Ron Weiss (Google) - "Training neural network acoustic models on (multichannel) waveforms"
• Tuomas Virtanen (Tampere University of Technology) - "Sound event detection in realistic environments using multilabel deep neural networks"
• John Hershey (MERL) - "Deep clustering: discriminative embeddings for single-channel separation of multiple sources"
• Pablo Sprechmann (NYU) - "Deep learning for solving inverse problems"
• Michael Mandel (Brooklyn College, CUNY) - "Multichannel Spatial Clustering for Far Field Speech Recognition in Mismatched Conditions"
• Paris Smaragdis (UIUC) - "NMF? Neural Nets? It’s all the same..."

Links to the talks and slides are on the workshop website (http://www.saneworkshop.org).
Here is a YouTube playlist with all the talks:

Best,
Jonathan

Thanks, Jonathan! From the page:

#### Thursday, October 22   [Watch all talks on YouTube]

• 8:30-9:10 Registration and Breakfast
• 9:10-9:15 Welcome [YouTube] [Slides]
• 9:15-10:00 Ron Weiss (Google) [YouTube] [Slides] "Training neural network acoustic models on (multichannel) waveforms"
• 10:00-10:45 Tuomas Virtanen (Tampere University of Technology) [YouTube] [Slides] "Sound event detection in realistic environments using multilabel deep neural networks"
• 10:45-11:15 Coffee Break
• 11:15-12:00 John Hershey (MERL) [YouTube] [Slides] "Deep clustering: discriminative embeddings for single-channel separation of multiple sources"
• 12:00-12:45 Pablo Sprechmann (NYU) [YouTube] [Slides] "Deep learning for solving inverse problems"
• 12:45-1:00 Poster Setup
• 1:00-1:45 Lunch (Cafeteria)
• 1:45-3:30 Poster Session
• 3:30-4:15 Rohit Prasad (Amazon) "Spoken Language Understanding for Amazon Echo"
• 4:15-5:00 Michael Mandel (Brooklyn College, CUNY) [YouTube] [Slides] "The 2015 Jelinek Workshop on Speech and Language Technology"
• 5:00-5:45 Paris Smaragdis (UIUC) [YouTube] [Slides] "NMF? Neural Nets? It’s all the same..."
• 5:45-6:00 Closing remarks
• 6:00-... Drinks somewhere nearby
