Monday, November 02, 2015

Circulants in Neural Networks, Randomized Skeleton, Quantization based Fast Inner Product Search, Fast Binary Embedding, Quantization based Fast Inner Product Search

Sanjiv Kumar and co-authors have recently released a number of interesting papers that are at the crossroad of many subjects areas mentioned on Nuit Blanche. Here a few of them starting with replacing the unstructured matrces used in neural networks with circulant projections or Kronecker products, an item oft discussed in compressive sensing and particularly compressive imaging. 

An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections by Yu Cheng, Felix X. Yu Rogerio S. Feris, Sanjiv Kumar, Alok Choudhary, Shih-Fu Chang

We explore the redundancy of parameters in deep neural networks by replacing the conventional linear projection in fully-connected layers with the circulant projection. The circulant structure substantially reduces memory footprint and enables the use of the Fast Fourier Transform to speed up the computation. Considering a fully-connected neural network layer with d input nodes, and d output nodes, this method improves the time complexity from O(d2) to O(dlogd)and space complexity from O(d2) to O(d). The space savings are particularly important for modern deep convolutional neural network architectures, where fully-connected layers typically contain more than 90% of the network parameters. We further show that the gradient computation and optimization of the circulant projections can be performed very efficiently. Our experiments on three standard datasets show that the proposed approach achieves this significant gain in storage and efficiency with minimal increase in error rate compared to neural networkswith unstructured projections
Fast Online Clustering with Randomized Skeleton Sets
Krzysztof Choromanski, Sanjiv Kumar, Xiaofeng Liu

We present a new fast online clustering algorithm that reliably recovers arbitrary-shaped data clusters in high throughout data streams. Unlike the existing state-of-the-art online clustering methods based on k-means or k-medoid, it does not make any restrictive generative assumptions. In addition, in contrast to existing nonparametric clustering techniques such as DBScan or DenStream, it gives provable theoretical guarantees. To achieve fast clustering, we propose to represent each cluster by a skeleton set which is updated continuously as new data is seen. A skeleton set consists of weighted samples from the data where weights encode local densities. The size of each skeleton set is adapted according to the cluster geometry. The proposed technique automatically detects the number of clusters and is robust to outliers. The algorithm works for the infinite data stream where more than one pass over the data is not feasible. We provide theoretical guarantees on the quality of the clustering and also demonstrate its advantage over the existing state-of-the-art on several datasets.

Fast Orthogonal Projection Based on Kronecker Product 
 Xu Zhang , Felix X. Yu , Ruiqi Guo , Sanjiv Kumar, Shengjin Wang , Shih-Fu Chang

We propose a family of structured matrices to speed up orthogonal projections for high-dimensional data commonly seen in computer vision applications. In this, a structured matrix is formed by the Kronecker product of a series of smaller orthogonal matrices. This achieves O ( d log d ) computational complexity and O (log d ) space complexity for d-dimensional data, a drastic improvement over the standard unstructured projections whose compu- tational and space complexities are both O ( d 2 ) . We also introduce an efficient learning procedure for optimizing such matrices in a data dependent fashion. We demonstrate the significant advantages of the proposed approach in solving the approximate nearest neighbor (ANN) image search problem with both binary embedding and quantization. Comprehensive experiments show that the proposed approach can achieve similar or better accuracy as the ex- isting state-of-the-art but with significantly less time and memory.

Felix X. Yu, Yunchao Gong, and Sanjiv Kumar 

Binary embedding of high-dimensional data requires long codes to pre- serve the discriminative power of the input space. Traditional binary coding meth- ods often suffer from very high computation and storage costs in such a scenario. To address this problem, we propose two solutions which improve over existing approaches. The first method, Bilinear Binary Embedding (BBE), converts high- dimensional data to compact similarity-preserving binary codes using compact bi- linear projections. Compared to methods that use unstructured matrices for projec- tion, it improves the time complexity from O ( d 2 ) to O ( d 1 : 5 ) , and the space com- plexity from O ( d 2 ) to O ( d ) where d is the input dimensionality. The second method, Circulant Binary Embedding (CBE), generates binary codes by projecting the data with a circulant matrix. The circulant structure enables the use of Fast Fourier Trans- formation to speed up the computation. This further improves the time complexity to O ( d log d ) . For both BBE and CBE, we propose to learn the projections in a data- dependent fashion. We show by extensive experiments that the proposed approaches give much better performance than the state-of-the-arts for fixed time, and provides much faster computation with no performance degradation for fixed number of bits. The BBE and CBE methods were previously presented in [5, 38]. In this book chap- ter, we present the two approaches in an unified framework, covering randomized binary embedding, learning-based binary embedding, and learning with dimension reductions. We also discuss the choice of algorithms 

Quantization based Fast Inner Product Search by Ruiqi Guo, Sanjiv Kumar, Krzysztof Choromanski, David Simcha

We propose a quantization based approach for fast approximate Maximum Inner Product Search (MIPS). Each database vector is quantized in multiple subspaces via a set of codebooks, learned directly by minimizing the inner product quantization error. Then, the inner product of a query to a database vector is approximated as the sum of inner products with the subspace quantizers. Different from recently proposed LSH approaches to MIPS, the database vectors and queries do not need to be augmented in a higher dimensional feature space. We also provide a theoretical analysis of the proposed approach, consisting of the concentration results under mild assumptions. Furthermore, if a small sample of example queries is given at the training time, we propose a modified codebook learning procedure which further improves the accuracy. Experimental results on a variety of datasets including those arising from deep neural networks show that the proposed approach significantly outperforms the existing state-of-the-art.
Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

No comments: