Tuesday, February 28, 2017

Deep Learning to Hash: HashNet and DHN

With deep learning now being used to learn to hash, the Great Convergence continues. Here are two recent examples:

HashNet: Deep Learning to Hash by Continuation by Zhangjie Cao, Mingsheng Long, Jianmin Wang, Philip S. Yu
Learning to hash has been widely applied to approximate nearest neighbor search for large-scale multimedia retrieval, due to its computation efficiency and retrieval quality. Deep learning to hash, which improves retrieval quality by end-to-end representation learning and hash encoding, has received increasing attention recently. Subject to the vanishing gradient difficulty in the optimization with binary activations, existing deep learning to hash methods need to first learn continuous representations and then generate binary hash codes in a separated binarization step, which suffer from substantial loss of retrieval quality. This paper presents HashNet, a novel deep architecture for deep learning to hash by continuation method, which learns exactly binary hash codes from imbalanced similarity data where the number of similar pairs is much smaller than the number of dissimilar pairs. The key idea is to attack the vanishing gradient problem in optimizing deep networks with non-smooth binary activations by continuation method, in which we begin from learning an easier network with smoothed activation function and let it evolve during the training, until it eventually goes back to being the original, difficult to optimize, deep network with the sign activation function. Comprehensive empirical evidence shows that HashNet can generate exactly binary hash codes and yield state-of-the-art multimedia retrieval performance on standard benchmarks.
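
As a rough illustration of the continuation idea described in the abstract, here is a minimal numpy sketch: the non-smooth sign activation is replaced by tanh(beta * z), and beta is increased during training until the smoothed codes are effectively binary. The beta schedule and values below are made up for illustration, not taken from the paper.

```python
import numpy as np

def smoothed_sign(z, beta):
    """Smoothed surrogate for the sign activation: tanh(beta * z).
    As beta grows, tanh(beta * z) approaches sign(z) pointwise."""
    return np.tanh(beta * z)

# Toy continuation schedule (illustrative values only): the same
# pre-activations are pushed closer and closer to exact binary codes
# as beta increases.
z = np.array([-1.2, -0.3, 0.05, 0.8, 2.0])
for beta in [1, 5, 25, 125]:
    print(f"beta={beta:4d}  codes={np.round(smoothed_sign(z, beta), 3)}")

# In the limit, the relaxed codes coincide with the exact binary codes.
print("sign(z) =", np.sign(z))
```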

Deep Hashing Network for Efficient Similarity Retrieval by Han Zhu, Mingsheng Long, Jianmin Wang, Yue Cao
Due to the storage and retrieval efficiency, hashing has been widely deployed to approximate nearest neighbor search for large-scale multimedia retrieval. Supervised hashing, which improves the quality of hash coding by exploiting the semantic similarity on data pairs, has received increasing attention recently. For most existing supervised hashing methods for image retrieval, an image is first represented as a vector of hand-crafted or machine-learned features, followed by another separate quantization step that generates binary codes. However, suboptimal hash coding may be produced, because the quantization error is not statistically minimized and the feature representation is not optimally compatible with the binary coding. In this paper, we propose a novel Deep Hashing Network (DHN) architecture for supervised hashing, in which we jointly learn good image representation tailored to hash coding and formally control the quantization error. The DHN model constitutes four key components: (1) a sub-network with multiple convolution-pooling layers to capture image representations; (2) a fully-connected hashing layer to generate compact binary hash codes; (3) a pairwise cross-entropy loss layer for similarity-preserving learning; and (4) a pairwise quantization loss for controlling hashing quality. Extensive experiments on standard image retrieval datasets show the proposed DHN model yields substantial boosts over latest state-of-the-art hashing methods.
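
Here is a schematic numpy sketch of the kind of objective the abstract describes: a pairwise cross-entropy term on code inner products plus a quantization penalty pushing relaxed codes toward {-1, +1}. The exact loss forms and the weight lam are illustrative assumptions, not DHN's precise formulation.

```python
import numpy as np

def dhn_style_loss(u, s, lam=0.1):
    """u: (n, k) relaxed hash codes (e.g. tanh outputs in [-1, 1]).
       s: (n, n) binary similarity matrix (1 = similar pair, 0 = dissimilar).
       Pairwise cross-entropy on code inner products + quantization penalty."""
    inner = u @ u.T                               # <u_i, u_j> for all pairs
    # log(1 + exp(<u_i, u_j>)) - s_ij * <u_i, u_j>: similar pairs are pushed
    # toward large inner products, dissimilar pairs toward small ones.
    ce = np.logaddexp(0.0, inner) - s * inner
    ce = ce[np.triu_indices(len(u), k=1)].mean()  # average over distinct pairs
    # Quantization penalty: push each relaxed code entry toward {-1, +1}.
    quant = np.abs(np.abs(u) - 1.0).mean()
    return ce + lam * quant

rng = np.random.default_rng(0)
u = np.tanh(rng.normal(size=(6, 8)))              # 6 items, 8-bit relaxed codes
s = np.triu((rng.random((6, 6)) < 0.3), 1).astype(float)
s = s + s.T                                       # symmetric toy similarity labels
print("toy loss:", dhn_style_loss(u, s))
```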


The Power of Sparsity in Convolutional Neural Networks




The Power of Sparsity in Convolutional Neural Networks by Soravit Changpinyo, Mark Sandler, Andrey Zhmoginov

Deep convolutional networks are well-known for their high computational and memory demands. Given limited resources, how does one design a network that balances its size, training time, and prediction accuracy? A surprisingly effective approach to trade accuracy for size and speed is to simply reduce the number of channels in each convolutional layer by a fixed fraction and retrain the network. In many cases this leads to significantly smaller networks with only minimal changes to accuracy. In this paper, we take a step further by empirically examining a strategy for deactivating connections between filters in convolutional layers in a way that allows us to harvest savings both in run-time and memory for many network architectures. More specifically, we generalize 2D convolution to use a channel-wise sparse connection structure and show that this leads to significantly better results than the baseline approach for large networks including VGG and Inception V3.
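
A minimal numpy sketch of the channel-wise sparse connection structure mentioned in the abstract: a fixed binary mask over (output filter, input channel) pairs is applied to an ordinary convolution kernel, deactivating whole filter-to-channel connections. The shapes and the keep probability below are arbitrary.

```python
import numpy as np

def channel_sparse_mask(c_out, c_in, keep_prob, rng):
    """Binary mask over filter-to-channel connections: entry (o, i) = 0
    deactivates the connection between input channel i and output filter o
    for the whole spatial kernel."""
    return (rng.random((c_out, c_in)) < keep_prob).astype(np.float32)

rng = np.random.default_rng(0)
c_out, c_in, k = 64, 32, 3
weights = rng.normal(size=(c_out, c_in, k, k)).astype(np.float32)

mask = channel_sparse_mask(c_out, c_in, keep_prob=0.25, rng=rng)
sparse_weights = weights * mask[:, :, None, None]  # broadcast over the k x k kernel

active = int(mask.sum()) * k * k
print(f"active parameters: {active} / {weights.size} "
      f"({100.0 * active / weights.size:.1f}%)")
```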


Monday, February 27, 2017

Tonight: Paris Machine Learning, Hors Série #7 Season 4: Machine Learning for Arts, Gene Kogan



The video stream of the event is here:


Tonight we will have a special hors série (Hors Série #7, Season 4) on Machine Learning for Arts with Gene Kogan. To register, please go here. The event will be hosted and sponsored by our good friends at Mobiskill.

Abstract:
This talk will examine the most recent wave of artistic, creative, and humorous projects applying machine learning in various domains, producing troves of machine-hallucinated text, images, sounds, and video, and demonstrating an affinity for imitating human style and sensibility.
These experimental works attempt to show the capacity of these machines for producing aesthetically and culturally meaningful art, while also challenging them to illuminate their most obscure and counterintuitive properties.
Additionally, a series of in-progress educational materials by the speaker will be shown, including demos, code samples, artistic works, and explanatory writings about the topic.




A Random Matrix Approach to Neural Networks




A Random Matrix Approach to Neural Networks by Cosme Louart, Zhenyu Liao, Romain Couillet

This article studies the Gram random matrix model $G = \frac{1}{T}\Sigma^T\Sigma$, $\Sigma = \sigma(WX)$, classically found in random neural networks, where $X = [x_1, \ldots, x_T] \in \mathbb{R}^{p \times T}$ is a (data) matrix of bounded norm, $W \in \mathbb{R}^{n \times p}$ is a matrix of independent zero-mean unit-variance entries, and $\sigma: \mathbb{R} \to \mathbb{R}$ is a Lipschitz continuous (activation) function, with $\sigma(WX)$ understood entry-wise. We prove that, as $n, p, T$ grow large at the same rate, the resolvent $Q = (G + \gamma I_T)^{-1}$, for $\gamma > 0$, has a similar behavior to that met in sample covariance matrix models, involving notably the moment $\Phi = \frac{T}{n}E[G]$, which provides in passing a deterministic equivalent for the empirical spectral measure of $G$. This result, established by means of concentration of measure arguments, enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insights into the underlying mechanisms at play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters.
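
For readers who want to play with the model, here is a small numpy sketch that builds the Gram matrix $G = \frac{1}{T}\Sigma^T\Sigma$ with $\Sigma = \sigma(WX)$ and its resolvent $Q = (G + \gamma I_T)^{-1}$, using ReLU as an example Lipschitz activation and arbitrary dimensions. It does not reproduce the paper's deterministic equivalent; for that, see the authors' code linked below.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, T = 400, 200, 600                       # all large and of comparable size
gamma = 1.0

X = rng.normal(size=(p, T)) / np.sqrt(p)      # data matrix of bounded norm
W = rng.normal(size=(n, p))                   # i.i.d. zero-mean, unit-variance entries
Sigma = np.maximum(W @ X, 0.0)                # sigma = ReLU, applied entry-wise

G = (Sigma.T @ Sigma) / T                     # Gram matrix G = (1/T) Sigma^T Sigma
Q = np.linalg.inv(G + gamma * np.eye(T))      # resolvent Q = (G + gamma I_T)^{-1}

eigs = np.linalg.eigvalsh(G)
print(f"empirical spectrum of G supported on [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"normalized trace of the resolvent, (1/T) tr Q: {np.trace(Q) / T:.4f}")
```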


Reproducibility: the Python 3 code used to produce the results of Section 4 is available at https://github.com/Zhenyu-LIAO/RMT4ELM


Saturday, February 25, 2017

Saturday Morning Videos: Interactive Learning workshop, Simons Institute, Berkeley



The Simons Institute at Berkeley sponsored a workshop on Interactive Learning the week before last. The videos are in. Thank you to the organizers Nina Balcan, Emma Brunskill, Robert Nowak, and Andrea Thomaz. Here is the introduction to the workshop:
Interactive learning is a modern machine learning paradigm of significant practical and  theoretical interest, where the algorithm and the domain expert engage in a two-way dialog to facilitate more accurate learning from less data compared to the classical approach of passively observing labeled data. This workshop will explore several topics related to interactive learning broadly defined, including active learning, in which the learner chooses which examples it wants labeled; explanation-based learning, in which the human doesn't merely tell the machine whether its predictions are right or wrong, but provides reasons in a form that is meaningful to both parties; crowdsourcing, in which labels and other information are solicited from a gallery of amateurs; teaching and learning from demonstrations, in which a party that knows the concept being learned provides helpful examples or demonstrations; and connections and applications to recommender systems, automated tutoring and robotics. Key questions we will explore include what are the right learning models in each case, what are the demands on the learner and the human interlocutor, and what kinds of concepts and other structures can be learned. A main goal of the workshop is to foster connections between theory/algorithms and practice/applications.
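
As a toy illustration of the label savings active learning can bring (the learner choosing which examples it wants labeled, as described above), here is a small numpy sketch of the classic 1-D threshold example: querying the midpoint of the current uncertainty interval locates the decision boundary with on the order of log(n) labels instead of n. The pool, threshold, and oracle function below are made up for illustration.

```python
import numpy as np

# Toy pool-based active learning for a 1-D threshold classifier.  A passive
# learner would need labels for (almost) every point; an active learner that
# always queries the midpoint of its current uncertainty interval needs only
# on the order of log2(n) labels.
rng = np.random.default_rng(0)
x = np.sort(rng.random(1000))                      # unlabeled pool
true_threshold = 0.37
oracle = lambda xi: int(xi >= true_threshold)      # stand-in for the human labeler

lo, hi = 0, len(x) - 1                             # bracket the unknown boundary
queries = 0
while hi - lo > 1:
    mid = (lo + hi) // 2
    queries += 1
    if oracle(x[mid]) == 1:                        # boundary is at or before mid
        hi = mid
    else:                                          # boundary is after mid
        lo = mid

print(f"boundary located between x={x[lo]:.4f} and x={x[hi]:.4f} "
      f"after {queries} label queries (pool size {len(x)})")
```
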
Videos:









Friday, February 24, 2017

The ICLR2017 program is out



ICLR2017 just released its program (the OpenReview site for the Workshop track is open here).
Monday April 24, 2017
Morning Session

8.45 - 9.00 Opening Remarks
9.00 - 9.40 Invited talk 1: Eero Simoncelli
9.40 - 10.00 Contributed talk 1: End-to-end Optimized Image Compression
10.00 - 10.20 Contributed talk 2: Amortised MAP Inference for Image Super-resolution
10.20 - 10.30 Coffee Break
10.30 - 12.30 Poster Session 1
12.30 - 14.30 Lunch provided by ICLR
Afternoon Session

14.30 - 15.10 Invited talk 2: Benjamin Recht
15.10 - 15.30 Contributed Talk 3: Understanding deep learning requires rethinking generalization - BEST PAPER AWARD
16.10 - 16.30 Coffee Break
16.30 - 18.30 Poster Session 2
Tuesday April 25, 2017
Morning Session

9.00 - 9.40 Invited talk 1: Chloe Azencott
9.40 - 10.00 Contributed talk 1: Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data - BEST PAPER AWARD
10.00 - 10.20 Contributed talk 2: Learning Graphical State Transitions
10.20 - 10.30 Coffee Break
10.30 - 12.30 Poster Session 1
12.30 - 14.30 Lunch provided by ICLR
Afternoon Session

14.30 - 15.10 Invited talk 2: Riccardo Zecchina
15.10 - 15.30 Contributed Talk 3: Learning to Act by Predicting the Future
16.10 - 16.30 Coffee Break
16.30 - 18.30 Poster Session 2
19.00 - 21.00 Gala dinner offered by ICLR
Wednesday April 26, 2017
Morning Session

9.00 - 9.40 Invited talk 1: Regina Barzilay
9.40 - 10.00 Contributed talk 1: Learning End-to-End Goal-Oriented Dialog
10.00 - 10.30 Coffee Break
10.30 - 12.30 Poster Session 1
12.30 - 14.30 Lunch provided by ICLR
Afternoon Session

14.30 - 15.10 Invited talk 2: Alex Graves
15.10 - 15.30 Contributed Talk 3: Making Neural Programming Architectures Generalize via Recursion - BEST PAPER AWARD
15.50 - 16.10 Contributed Talk 5: Optimization as a Model for Few-Shot Learning
16.10 - 16.30 Coffee Break
16.30 - 18.30 Poster Session 2






Photo credit: BaptisteMPM, Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=37629070



The Rare Eclipse Problem on Tiles: Quantised Embeddings of Disjoint Convex Sets

Here is some analysis of the quantised compressive classification problem.



The Rare Eclipse Problem on Tiles: Quantised Embeddings of Disjoint Convex Sets by Valerio Cambareri, Chunlei Xu, Laurent Jacques

Quantised random embeddings are an efficient dimensionality reduction technique which preserves the distances of low-complexity signals up to some controllable additive and multiplicative distortions. In this work, we instead focus on verifying when this technique preserves the separability of two disjoint closed convex sets, i.e., in a quantised view of the "rare eclipse problem" introduced by Bandeira et al. in 2014. This separability would ensure exact classification of signals in such sets from the signatures output by this non-linear dimensionality reduction. We here present a result relating the embedding's dimension, its quantiser resolution and the sets' separation, as well as some numerically testable conditions to illustrate it. Experimental evidence is then provided in the special case of two ℓ2-balls, tracing the phase transition curves that ensure these sets' separability in the embedded domain.  
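
Here is a small numerical sketch in the spirit of the ℓ2-ball experiment described above: points from two disjoint balls are passed through a dithered, uniformly quantised Gaussian embedding, and the smallest distance between the two embedded clouds is used as a crude proxy for their separability. The dimensions, quantiser resolution and ball separation below are arbitrary, and the quantiser form is an assumption in the spirit of the authors' related work on quantised embeddings, not necessarily the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)
p, m, delta = 50, 200, 0.5      # ambient dim., embedding dim., quantiser resolution

def sample_ball(center, radius, n):
    """Uniform samples inside an l2-ball: uniform direction, radius ~ U**(1/p)."""
    d = rng.normal(size=(n, len(center)))
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    r = radius * rng.random(n) ** (1.0 / len(center))
    return center + d * r[:, None]

# Two disjoint l2-balls: unit radius, centers 4 apart (separation 2).
c = np.zeros(p)
c[0] = 4.0
A = sample_ball(np.zeros(p), 1.0, 150)
B = sample_ball(c, 1.0, 150)

# Quantised random embedding: Gaussian projection, random dither, uniform quantiser.
Phi = rng.normal(size=(m, p)) / np.sqrt(m)
u = rng.uniform(0.0, delta, size=m)
quantise = lambda X: delta * np.floor((X @ Phi.T + u) / delta)

qA, qB = quantise(A), quantise(B)
# If the embedded sets stay separated, the smallest cross distance between the
# two embedded clouds remains bounded away from zero.
cross = np.linalg.norm(qA[:, None, :] - qB[None, :, :], axis=2)
print("smallest distance between the two embedded clouds:", cross.min())
```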
 
 
 
 
 
 
 
