## Saturday, April 30, 2016

### Saturday Morning Video: "Can the brain do back-propagation?" - Geoffrey Hinton of Google & University of Toronto

Geoffrey Hinton wonders if the brain can do backpropagation:
h/t Andrej

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

### Saturday Morning Videos: Mars, Falcon landing and "A Beautiful Planet"

An animation tracking Curiosity’s path on Mars, along Bridger Basin while looking at the Bob Marshall Ridge. Composed in Blender from 41 Navcam greyscale/xyz image pairs taken by the rover at sols 1094 to 1108. Conversion done with the Navcam importer which is available at: github.com/phaseIV/Blender-Navcam-Importer Following Curiosity from Captain Video on Vimeo.
360 View, First Stage Landing on Droneship for Falcon 9

Window to Earth: NASA Partners with IMAX for 'A Beautiful Planet'

## Friday, April 29, 2016

### Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex / Memory and Information Processing in Recurrent Neural Networks

Here are two interesting papers on Recurrent Neural Networks. Of note is this excerpt from the second preprint:

However, here we see that even without the sparse input assumption and L1 optimization in the output (a computationally expensive optimization used in compressive sensing) the network can achieve capacity greater than its degrees of freedom N.
Enjoy !

Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex by Qianli Liao, Tomaso Poggio

We discuss relations between Residual Networks (ResNet), Recurrent Neural Networks (RNNs) and the primate visual cortex. We begin with the observation that a shallow RNN is exactly equivalent to a very deep ResNet with weight sharing among the layers. A direct implementation of such a RNN, although having orders of magnitude fewer parameters, leads to a performance similar to the corresponding ResNet. We propose 1) a generalization of both RNN and ResNet architectures and 2) the conjecture that a class of moderately deep RNNs is a biologically-plausible model of the ventral stream in visual cortex. We demonstrate the effectiveness of the architectures by testing them on the CIFAR-10 dataset.
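The equivalence stated in the abstract can be checked numerically. Below is a minimal sketch, assuming a residual block of the form `h + relu(W h)` and an RNN whose input only sets the initial state; the function names and the block form are illustrative assumptions, not the architecture used in the paper.

```python
import numpy as np

def deep_resnet(x, weights):
    """A ResNet as a stack of residual blocks, one weight matrix per layer."""
    h = x
    for W in weights:
        h = h + np.maximum(W @ h, 0.0)  # identity skip plus ReLU branch
    return h

def shallow_rnn(x, W, steps):
    """A one-layer RNN with an identity skip, unrolled for `steps` time steps.
    Assumption: x only sets the initial state, no input is fed in later."""
    h = x
    for _ in range(steps):
        h = h + np.maximum(W @ h, 0.0)
    return h

rng = np.random.default_rng(0)
dim, depth = 8, 5
W = rng.normal(scale=0.3, size=(dim, dim))
x = rng.normal(size=dim)

# Sharing the same W across all layers turns the ResNet into the unrolled RNN.
resnet_out = deep_resnet(x, [W] * depth)
rnn_out = shallow_rnn(x, W, steps=depth)
outputs_match = np.allclose(resnet_out, rnn_out)

# Parameter counts: an untied ResNet stores depth * dim^2 weights,
# while the RNN stores dim^2.
untied_params = depth * dim * dim
rnn_params = dim * dim
```

With tied weights the two forward passes are the same computation, which is why the RNN needs `depth` times fewer parameters than the untied network in this toy setting.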

Memory and Information Processing in Recurrent Neural Networks by Alireza Goudarzi, Sarah Marzen, Peter Banda, Guy Feldman, Christof Teuscher, Darko Stefanovic
Recurrent neural networks (RNN) are simple dynamical systems whose computational power has been attributed to their short-term memory. Short-term memory of RNNs has been previously studied analytically only for the case of orthogonal networks, and only under annealed approximation, and uncorrelated input. Here for the first time, we present an exact solution to the memory capacity and the task-solving performance as a function of the structure of a given network instance, enabling direct determination of the function--structure relation in RNNs. We calculate the memory capacity for arbitrary networks with exponentially correlated input and further relate it to the performance of the system on signal processing tasks in a supervised learning setup. We compute the expected error and the worst-case error bound as a function of the spectra of the network and the correlation structure of its inputs and outputs. Our results give an explanation for learning and generalization of task solving using short-term memory, which is crucial for building alternative computer architectures using physical phenomena based on the short-term memory principle.


## Thursday, April 28, 2016

### Video: Sparse Identification of Nonlinear Dynamics (SINDy)

Hi Igor,

I am attaching a link to a youtube video abstract of our recent paper on sparse identification of nonlinear dynamics (SINDy) in PNAS.  Hope you enjoy, and please feel free to share with anyone who may be interested.

Also, I saw that you mentioned our algorithm on your blog — thanks very much!!  It is awesome to hear the others like the work.

Paper [open access]:  http://www.pnas.org/content/113/15/3932.abstract

Best Regards,
Steve
Thanks  Steve ! Here is the video:


### Agnostic Estimation of Mean and Covariance

The reason we have a zoo of matrix factorizations stems in part from the need to deal with different kinds of adversarial noise. From the paper:

The Achilles heel of algorithms for generative models is the assumption that data is exactly from the model. This is crucial for known guarantees, and relaxations of it are few and specialized, e.g., in ICA, data could be noisy, but the noise itself is assumed to be Gaussian. Assumptions about rank and sparsity are made in a technique that is now called Robust PCA [CSPW11, CLMW11]. There have been attempts [Kwa08, MT+11] at achieving robustness by L1 minimization, but they don’t give any error bounds on the output produced. A natural, important and wide open problem is estimating the parameters of generative models in the presence of arbitrary, i.e., malicious noise, a setting usually referred to as agnostic learning. The simplest version of this problem is to estimate a single Gaussian in the presence of malicious noise. Alternatively, this can be posed as the problem of finding a best-fit Gaussian to data or agnostically learning a single Gaussian.

Agnostic Estimation of Mean and Covariance  by Kevin A. Lai, Anup B. Rao, Santosh Vempala

We consider the problem of estimating the mean and covariance of a distribution from iid samples in $\mathbb{R}^n$, in the presence of an $\eta$ fraction of malicious noise; this is in contrast to much recent work where the noise itself is assumed to be from a distribution of known type. The agnostic problem includes many interesting special cases, e.g., learning the parameters of a single Gaussian (or finding the best-fit Gaussian) when $\eta$ fraction of data is adversarially corrupted, agnostically learning a mixture of Gaussians, agnostic ICA, etc. We present polynomial-time algorithms to estimate the mean and covariance with error guarantees in terms of information-theoretic lower bounds. As a corollary, we also obtain an agnostic algorithm for Singular Value Decomposition.

and previously the Recursive Fourier PCA algorithm

Max vs Min: Tensor Decomposition and ICA with nearly Linear Sample Complexity by Santosh S. Vempala, Ying Xiao

We present a simple, general technique for reducing the sample complexity of matrix and tensor decomposition algorithms applied to distributions. We use the technique to give a polynomial-time algorithm for standard ICA with sample complexity nearly linear in the dimension, thereby improving substantially on previous bounds. The analysis is based on properties of random polynomials, namely the spacings of an ensemble of polynomials. Our technique also applies to other applications of tensor decompositions, including spherical Gaussian mixture models.
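To see why an $\eta$ fraction of malicious samples is a problem for mean estimation, here is a small sketch comparing the sample mean with the coordinate-wise median on corrupted Gaussian data. The median is only a classical baseline used for illustration; it is not the algorithm of the agnostic estimation paper, and the corruption pattern (all bad points placed at the same far-away location) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, dim, eta = 1000, 5, 0.1

# Clean data: iid standard Gaussian, so the true mean is the zero vector.
data = rng.normal(size=(n_samples, dim))

# Hypothetical adversary: replace an eta fraction of the samples
# with a single far-away point.
n_bad = int(eta * n_samples)
data[:n_bad] = 100.0

# Error of each estimator, measured as distance to the true mean (zero).
mean_err = np.linalg.norm(data.mean(axis=0))
median_err = np.linalg.norm(np.median(data, axis=0))
```

The sample mean is dragged toward the corrupted points by an amount proportional to their distance, while the coordinate-wise median moves only by a quantile shift that depends on `eta`.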


## Wednesday, April 27, 2016

### A Tutorial on Libra: R package for the Linearized Bregman Algorithm in High Dimensional Statistics - implementation -

Here is an implementation in R of an l1 solver. All compressive sensing solver implementations show up under the CS and implementation tags.

A Tutorial on Libra: R package for the Linearized Bregman Algorithm in High Dimensional Statistics by Jiechao Xiong, Feng Ruan, Yuan Yao

The R package, Libra, stands for the LInearized BRegman Algorithm in high dimensional statistics. The Linearized Bregman Algorithm is a simple iterative procedure to generate sparse regularization paths of model estimation, which are firstly discovered in applied mathematics for image restoration and particularly suitable for parallel implementation in large scale problems. The limit of such an algorithm is a sparsity-restricted gradient descent flow, called the Inverse Scale Space, evolving along a parsimonious path of sparse models from the null model to overfitting ones. In sparse linear regression, the dynamics with early stopping regularization can provably meet the unbiased Oracle estimator under nearly the same condition as LASSO, while the latter is biased. Despite their successful applications, statistical consistency theory of such dynamical algorithms remains largely open except for some recent progress on linear regression. In this tutorial, algorithmic implementations in the package are discussed for several widely used sparse models in statistics, including linear regression, logistic regression, and several graphical models (Gaussian, Ising, and Potts). Besides the simulation examples, various application cases are demonstrated, with real world datasets from diabetes, publications of COPSS award winners, as well as social networks of two Chinese classic novels, Journey to the West and Dream of the Red Chamber.

The package is here: https://cran.r-project.org/web/packages/Libra/index.html
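For readers who want to see the iteration itself, here is a minimal NumPy sketch of a Linearized Bregman path for sparse linear regression. It is written in Python rather than R and does not call the Libra package; the function names, the default `kappa`, and the step-size rule are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(z, lam=1.0):
    """Shrinkage operator: sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def linearized_bregman_path(X, y, kappa=10.0, alpha=None, n_iter=500):
    """Return the sequence of estimates beta_1, ..., beta_n_iter.

    z_{k+1}    = z_k + alpha * X^T (y - X beta_k) / n
    beta_{k+1} = kappa * soft_threshold(z_{k+1}, 1)
    """
    n, p = X.shape
    if alpha is None:
        # Assumed step-size rule: keep alpha * kappa * ||X^T X / n||_2 = 1.
        spectral = np.linalg.norm(X, 2) ** 2 / n
        alpha = 1.0 / (kappa * spectral)
    z = np.zeros(p)
    beta = np.zeros(p)
    path = np.zeros((n_iter, p))
    for k in range(n_iter):
        z = z + alpha * X.T @ (y - X @ beta) / n
        beta = kappa * soft_threshold(z, 1.0)
        path[k] = beta
    return path

# Synthetic example: 3 active coefficients out of 20.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [4.0, -3.0, 2.0]
y = X @ beta_true + 0.1 * rng.normal(size=n)

path = linearized_bregman_path(X, y)
support_sizes = np.count_nonzero(path, axis=1)  # model size along the path
```

Each row of `path` is one model on the regularization path: early rows are the null model, later rows add variables, and the iteration index plays the role of the regularization parameter, so early stopping selects a sparse model.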


### An information theoretic formulation of the Dictionary Learning and Sparse Coding Problems on Statistical Manifolds

An information theoretic formulation of the Dictionary Learning and Sparse Coding Problems on Statistical Manifolds by Rudrasis Chakraborty, Monami Banerjee, Victoria Crawford, Baba C. Vemuri

In this work, we propose a novel information theoretic framework for dictionary learning (DL) and sparse coding (SC) on a statistical manifold (the manifold of probability distributions). Unlike the traditional DL and SC framework, our new formulation *does not explicitly incorporate any sparsity inducing norm in the cost function but yet yields SCs*. Moreover, we extend this framework to the manifold of symmetric positive definite matrices, $\mathcal{P}_n$. Our algorithm approximates the data points, which are probability distributions, by the weighted Kullback-Leibler center (KL-center) of the dictionary atoms. The KL-center is the minimizer of the maximum KL-divergence between the unknown center and members of the set whose center is being sought. Further, *we proved that this KL-center is a sparse combination of the dictionary atoms*. Since the data reside on a statistical manifold, the data fidelity term cannot be as simple as in the case of the vector-space data. We therefore employ the geodesic distance between the data and a sparse approximation of the data element. This cost function is minimized using an accelerated gradient descent algorithm. An extensive set of experimental results show the effectiveness of our proposed framework. We present several experiments involving a variety of classification problems in Computer Vision applications. Further, we demonstrate the performance of our algorithm by comparing it to several state-of-the-art methods both in terms of classification accuracy and sparsity.


## Tuesday, April 26, 2016

### Paris Machine Learning Meetup #12 Season 3: ML Hardware

Mozilla France is hosting us. Mathworks is sponsoring the networking event afterwards. Thank you to them.

Hardware is becoming key in getting computations performed at scale and using specific techniques such as Deep Learning. Today's Paris Machine Learning meetup is focused on the type of hardware and software combination that is currently and may eventually run this Machine Learning/Deep Learning/AI infrastructure.
Streaming video is below:

Right now, these are the presentations we should have tonight. Undoubtedly, we should have more meetups on this topic in the future.
Arjun Bansal, Nervana Systems, Nervana and the Future of Computing
"Nervana delivers fast, scalable AI on demand using cloud computing, deep learning and custom hardware. These trends have recently been identified as the basis for the future of computing in a recent issue of The Economist. In this talk, I will provide an overview of Nervana’s technology."

Marc Wolff, Amine El Helou, Mathworks, Un algorithme distribué de forêt aléatoire pour du risque de crédit / A Random Forest distributed algorithm for credit risk computations.
The Random Forest algorithm consists of training several decision trees, generally in parallel. A well-known limitation of this method is the need to load, or even replicate, the entire dataset in memory. With large data volumes, this can be problematic. One possible alternative is to develop a distributed decision-tree algorithm that relies on SPMD (Single Program Multiple Data) parallelism and on the MPI (Message Passing Interface) API. This approach makes the best use of all the available processing power (multi-core machine or compute cluster) and operates on data distributed in memory. We will present an implementation of this type and its application to a credit risk analysis for a banking group (Groupe de Risque Opérationnel du Crédit Agricole).

Olivier Guillaune,  Any computer for my machine learning?
"When the learning phase begins to take too long, what are the current hardware solutions to accelerate my calculations, and which components should I assemble to build a reliable, powerful calculation server at the best cost?"

Igor Carron, LightOn.io, Approximating kernels at the speed of light

We will talk about how we use optics to perform some specific operations of interest in Machine Learning. This talk is based on this preprint.

André Reinald, Mozilla

As an introduction, we will explain that the link between machine learning and Mozilla is personal data, to which we pay close attention and which we would like to see handled with a certain ethic. We will cover three topics:
1. a look back at the attempts in "content services",
2. projects in connected objects (limited to home automation for now),
3. the peerstorage.org project


## Monday, April 25, 2016

### Sketching and Neural Networks

Two subjects we used to not see in the same field. Interesting!

Sketching and Neural Networks by Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar

High-dimensional sparse data present computational and statistical challenges for supervised learning. We propose compact linear sketches for reducing the dimensionality of the input, followed by a single layer neural network. We show that any sparse polynomial function can be computed, on nearly all sparse binary vectors, by a single layer neural network that takes a compact sketch of the vector as input. Consequently, when a set of sparse binary vectors is approximately separable using a sparse polynomial, there exists a single-layer neural network that takes a short sketch as input and correctly classifies nearly all the points. Previous work has proposed using sketches to reduce dimensionality while preserving the hypothesis class. However, the sketch size has an exponential dependence on the degree in the case of polynomial classifiers. In stark contrast, our approach of using improper learning, using a larger hypothesis class allows the sketch size to have a logarithmic dependence on the degree. Even in the linear case, our approach allows us to improve on the pesky $O({1}/{{\gamma}^2})$ dependence of random projections, on the margin $\gamma$. We empirically show that our approach leads to more compact neural networks than related methods such as feature hashing at equal or better performance.
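To make the idea of a compact linear sketch of a sparse binary vector concrete, here is a small count-min style sketch in NumPy. This is a generic hashing construction used for illustration, not necessarily the sketch analyzed in the paper; the sizes `d`, `m`, `t` and the function names are assumptions.

```python
import numpy as np

def make_hashes(d, m, t, seed=0):
    """t random hash functions from {0..d-1} to {0..m-1}; hashes[j, i] = h_j(i)."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, m, size=(t, d))

def sketch(support, hashes, m):
    """Linear sketch of the binary vector whose nonzeros are at `support`.
    Row j counts how many active coordinates hash to each of the m buckets."""
    t = hashes.shape[0]
    y = np.zeros((t, m), dtype=np.int64)
    for j in range(t):
        np.add.at(y[j], hashes[j, support], 1)
    return y

def decode(y, hashes):
    """Coordinate i is declared active if all of its t buckets are non-empty."""
    t, d = hashes.shape
    rows = np.arange(t)[:, None]
    return y[rows, hashes].min(axis=0) > 0

d, m, t, k = 10_000, 128, 5, 10
hashes = make_hashes(d, m, t)
rng = np.random.default_rng(1)
support = rng.choice(d, size=k, replace=False)

y = sketch(support, hashes, m)      # t * m = 640 numbers instead of d = 10,000
recovered = decode(y, hashes)

# Linearity: the sketch of a sum of vectors is the sum of their sketches.
is_linear = np.array_equal(
    sketch(support[:4], hashes, m) + sketch(support[4:], hashes, m), y
)
```

Every truly active coordinate is always recovered, since each of its buckets has count at least one; false positives are possible but become rare as `m` and `t` grow. A single-layer network would then take the flattened `y` as its input instead of the original `d`-dimensional vector.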


## Thursday, April 21, 2016

### Compact all-CMOS spatiotemporal compressive sensing video camera with pixel-wise coded exposure

Some new Compressive Sensing hardware, woohoo !

Compact all-CMOS spatiotemporal compressive sensing video camera with pixel-wise coded exposure by Jie Zhang, Tao Xiong, Trac Tran, Sang Chin, and Ralph Etienne-Cummings
Abstract
We present a low power all-CMOS implementation of temporal compressive sensing with pixel-wise coded exposure. This image sensor can increase video pixel resolution and frame rate simultaneously while reducing data readout speed. Compared to previous architectures, this system modulates pixel exposure at the individual photo-diode electronically without external optical components. Thus, the system provides reduction in size and power compared to previous optics based implementations. The prototype image sensor (127 × 90 pixels) can reconstruct 100 fps videos from coded images sampled at 5 fps. With 20× reduction in readout speed, our CMOS image sensor only consumes 14μW to provide 100 fps videos.
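The measurement side of pixel-wise coded exposure can be simulated in a few lines. The sketch below assumes each pixel is exposed during one randomly placed window of fixed length inside a block of 20 sub-frames (100 fps / 5 fps); the window length, the random placement, and the function names are assumptions, and the reconstruction step is not shown.

```python
import numpy as np

def coded_exposure_masks(n_sub, height, width, exposure_len, rng):
    """Binary masks of shape (n_sub, height, width): each pixel is 'on'
    for one contiguous window of `exposure_len` sub-frames."""
    starts = rng.integers(0, n_sub - exposure_len + 1, size=(height, width))
    t = np.arange(n_sub)[:, None, None]
    return ((t >= starts) & (t < starts + exposure_len)).astype(np.float64)

def capture(video, masks):
    """One coded image: each pixel integrates light only while its mask is on."""
    return (masks * video).sum(axis=0)

rng = np.random.default_rng(0)
n_sub, height, width = 20, 90, 127       # 20 sub-frames per coded image
video = rng.random((n_sub, height, width))  # stand-in for a 100 fps scene

masks = coded_exposure_masks(n_sub, height, width, exposure_len=3, rng=rng)
coded = capture(video, masks)

# Readout reduction: 20 high-speed frames are summarized by 1 coded image.
readout_reduction = video.shape[0] / 1
```

Recovering the 20 sub-frames from `coded` is the compressive sensing part: it requires a sparsity prior on the video (for example a learned dictionary) and a sparse solver, which this sketch leaves out.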
