Nuit Blanche: Random Feature Roundup

Thursday, January 28, 2016

Here is the new Random Features roundup:

Incremental Semiparametric Inverse Dynamics Learning by Raffaello Camoriano, Silvio Traversaro, Lorenzo Rosasco, Giorgio Metta, Francesco Nori

This paper presents a novel approach for incremental semiparametric inverse dynamics learning. In particular, we consider the mixture of two approaches: parametric modeling based on rigid body dynamics equations and nonparametric modeling based on incremental kernel methods, with no prior information on the mechanical properties of the system. This yields an incremental semiparametric approach, leveraging the advantages of both the parametric and nonparametric models. We validate the proposed technique by learning the dynamics of one arm of the iCub humanoid robot.

Persistence weighted Gaussian kernel for topological data analysis by Genki Kusano, Kenji Fukumizu, Yasuaki Hiraoka

Topological data analysis is an emerging mathematical concept for characterizing shapes in multi-scale data. In this field, persistence diagrams are widely used as a descriptor of the input data, and can distinguish robust and noisy topological properties. Nowadays, it is highly desirable to develop a statistical framework on persistence diagrams to deal with practical data. This paper proposes a kernel method on persistence diagrams. A theoretical contribution of our method is that the proposed kernel allows one to control the effect of persistence, and, if necessary, noisy topological properties can be discounted in data analysis. Furthermore, the method provides a fast approximation technique. The method is applied to several problems, including practical data in physics, and the results show its advantage over the existing kernel method on persistence diagrams.

Learning to Filter with Predictive State Inference Machines by Wen Sun, Arun Venkatraman, Byron Boots, J. Andrew Bagnell

Latent state space models are one of the most fundamental and widely used tools for modeling dynamical systems. Traditional Maximum Likelihood Estimation (MLE) based approaches aim to maximize the likelihood objective, which is non-convex due to latent states. While non-convex optimization methods like EM can learn models that locally optimize the likelihood objective, using the locally optimal model for an inference task such as Bayesian filtering usually does not have performance guarantees. In this work, we propose a method that considers the inference procedure on the dynamical system as a composition of predictors. Instead of optimizing a given parametrization of latent states, we learn predictors for inference in predictive belief space, where we can use sufficient features of observations for supervision of our learning algorithm. We further show that our algorithm, the Predictive State Inference Machine, has theoretical performance guarantees on the inference task. Empirical verification across several dynamical system benchmarks, ranging from a simulated helicopter to recorded telemetry traces from a robot, showcases the abilities of training Inference Machines.

Randomized Kernel Approach for Named Entity Recognition in Tamil by N. Abinaya, M. Anand Kumar, K. P. Soman

In this paper, we present a new approach for Named Entity Recognition (NER) in the Tamil language using the Random Kitchen Sink algorithm. Named Entity Recognition is the process of identifying Named Entities (NEs) in text. It involves identifying entities and classifying them into predefined categories such as person, location, organization, etc. A lot of work has been done in the field of Named Entity Recognition for English and Indian languages using various machine learning approaches. In this work, we implement the NER system for Tamil using the Random Kitchen Sink algorithm, which is a statistical and supervised approach. The NER system is also implemented using Support Vector Machine (SVM) and Conditional Random Field (CRF). The overall performance of the NER system was evaluated as 86.61% for RKS, 81.62% for SVM and 87.21% for CRF. Additional results have been obtained for SVM and CRF by increasing the corpus size, and the performance is evaluated as 86.06% and 87.20% respectively.

Slides: Convergence Rates of Kernel Quadrature Rules by Francis Bach at the NIPS workshop on probabilistic integration, Dec. 2015

Learning the kernel matrix via predictive low-rank approximations by Martin Stražar, Tomaž Curk

Efficient and accurate low-rank approximations to multiple data sources are essential in the era of big data. The scaling of kernel-based learning algorithms to large datasets is limited by the quadratic complexity associated with computation and storage of the kernel matrix, which is assumed to be available in most recent multiple kernel learning algorithms. We propose a method to learn simultaneous low-rank approximations of a set of base kernels in regression tasks.
We present the Mklaren algorithm, which approximates multiple kernel matrices with least angle regression in the low-dimensional feature space. The idea is based on entirely geometrical concepts and does not assume access to full kernel matrices. The algorithm achieves linear complexity in the number of data points as well as kernels, while it accounts for the correlations between kernel matrices. When explicit feature space representation is available for kernels, we use the relation between primal and dual regression weights to gain model interpretation. Our approach outperforms contemporary kernel matrix approximation approaches when learning with multiple kernels on standard regression datasets, as well as improves selection of relevant kernels in comparison to multiple kernel learning methods.

Fast projections of spatial rich model feature for digital image steganalysis by Pengfei Wang, Zhihui Wei, Liang Xiao

Spatial rich model (SRM) is a classic steganalysis method, which collects high-order co-occurrences from truncated noise residuals as features to capture the local-range dependencies of an image. Increasing the truncation threshold and the co-occurrence order leads to a higher-dimensional feature, which can exploit more statistical bins and capture dependencies across larger neighborhoods, but it suffers from the curse of dimensionality. In this paper, we propose a fast projection method to increase the statistical robustness of the higher-dimensional SRM feature while decreasing its dimensionality. The proposed projection method is applicable to co-occurrence-based steganalysis features. The detection performance and the computational complexity of the proposed method are investigated on three content-adaptive steganographic algorithms in the spatial domain.

Low-Rank Kernel Space Representations in Prototype Learning by Kerstin Bunte, Marika Kaden, Frank-Michael Schleif

In supervised learning, feature vectors are often implicitly mapped to a high-dimensional space using the kernel trick, with quadratic costs for the learning algorithm. The recently proposed random Fourier features provide an explicit mapping such that classical algorithms with often linear complexity can be applied. Yet random Fourier feature approaches remain complex techniques that are difficult to interpret. Using Matrix Relevance Learning, the linear mapping of the data for a better class separation can be learned by adapting a parametric Euclidean distance. Further, a low-rank representation of the input data can be obtained. We apply this technique to random Fourier feature encoded data to obtain a discriminative mapping of the kernel space. This explicit approach is compared with a differentiable kernel vector quantizer on the same but implicit kernel representation. Using multiple benchmark problems, we demonstrate that a parametric distance on an RBF encoding yields better classification results and permits access to interpretable prediction models with visualization abilities.
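
As a side note on the explicit mapping mentioned in this abstract, here is a minimal sketch of plain random Fourier features for the RBF kernel exp(-gamma * ||x - y||^2). It is not the Matrix Relevance Learning method of the paper, only the standard feature map it builds on. The names make_rff_map, rbf_kernel and dot, as well as the parameter choices, are assumptions made for illustration.

```python
import math
import random


def make_rff_map(dim, n_features, gamma, seed=0):
    """Build z(x) so that dot(z(x), z(y)) approximates exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # For this RBF kernel the random frequencies follow N(0, 2 * gamma * I).
    sigma = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, sigma) for _ in range(dim)]
               for _ in range(n_features)]
    # Random phases are drawn uniformly from [0, 2*pi).
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def z(x):
        # One cosine feature per (frequency, phase) pair.
        return [scale * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + p)
                for w, p in zip(weights, phases)]

    return z


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel value, used here only for comparison."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))
```

Once the data are encoded with z, any linear method can be run on the encoded vectors; the approximation error shrinks roughly like one over the square root of the number of features.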

Fast and Accurate Refined Nyström based Kernel SVM by Zhe Li and Tianbao Yang and Lijun Zhang and Rong Jin

In this paper, we focus on improving the performance of the Nyström based kernel SVM. Although the Nyström approximation has been studied extensively and its application to kernel classification has been exhibited in several studies, there still exists a potentially large gap between the performance of a classifier learned with the Nyström approximation and that of one learned with the original kernel. In this work, we make novel contributions to bridge the gap without increasing the training costs too much by proposing a refined Nyström based kernel classifier. We adopt a two-step approach: in the first step we learn a sufficiently good dual solution, and in the second step we use the obtained dual solution to construct a new set of bases for the Nyström approximation to re-train a refined classifier. Our approach towards learning a good dual solution is based on a sparse-regularized dual formulation with the Nyström approximation, which can be solved with the same time complexity as solving the standard formulation. We justify our approach by establishing a theoretical guarantee on the error of the learned dual solution in the first step with respect to the optimal dual solution under appropriate conditions. The experimental results demonstrate that (i) the dual solution obtained by our approach in the first step is closer to the optimal solution and yields improved prediction performance; and (ii) the second step, which uses the obtained dual solution to re-train the model, further improves the performance.
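
For readers unfamiliar with the baseline this abstract starts from, here is a small sketch of the standard Nyström approximation K ≈ C W^{-1} C^T, where C holds the kernel values between all points and a few landmark points and W is the kernel matrix among the landmarks. It is not the refined two-step method of the paper. The function names, the RBF kernel choice, the jitter term and the tiny Gauss-Jordan solver are all assumptions made to keep the example self-contained.

```python
import math


def rbf(x, y, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def solve(A, B):
    """Solve A X = B (B is a matrix) by Gauss-Jordan with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def nystrom(points, landmark_idx, gamma=0.5, jitter=1e-10):
    """Return the n x n Nystrom approximation C W^{-1} C^T of the RBF kernel."""
    landmarks = [points[i] for i in landmark_idx]
    m, n = len(landmarks), len(points)
    # C: kernel between every point and every landmark (n x m).
    C = [[rbf(p, l, gamma) for l in landmarks] for p in points]
    # W: kernel among landmarks (m x m), with a small jitter for stability.
    W = [[rbf(a, b, gamma) + (jitter if i == j else 0.0)
          for j, b in enumerate(landmarks)]
         for i, a in enumerate(landmarks)]
    Ct = [list(col) for col in zip(*C)]   # m x n
    X = solve(W, Ct)                      # X = W^{-1} C^T, m x n
    return [[sum(C[i][k] * X[k][j] for k in range(m)) for j in range(n)]
            for i in range(n)]
```

Only the n x m block C and the m x m block W are ever computed from the kernel, which is where the savings over the full n x n matrix come from. The quality of the result depends on which landmarks are chosen, and that choice is the part the paper sets out to improve.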

ADMM based scalable machine learning on Spark by S. Dhar, Congrui Yi, N. Ramakrishnan, M. Shah

Most machine learning algorithms involve solving a convex optimization problem. Traditional in-memory convex optimization solvers do not scale well with the increase in data. This paper identifies a generic convex problem for most machine learning algorithms and solves it using the Alternating Direction Method of Multipliers (ADMM). Finally, such an ADMM problem transforms into an iterative system of linear equations, which can be easily solved at scale in a distributed fashion. We implement this framework in Apache Spark and compare it with the widely used Machine Learning LIBrary (MLLIB) in Apache Spark 1.3.
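
To illustrate how ADMM reduces to repeated linear solves, here is a single-machine sketch for the lasso problem, minimizing 0.5 * ||A x - b||^2 + lam * ||x||_1. It is the textbook ADMM splitting, not the generic formulation or the Spark implementation from the paper. The function names, the fixed iteration count and the small Gaussian elimination routine are assumptions chosen for illustration.

```python
import math


def solve(A, b):
    """Solve A x = b (b is a vector) by Gaussian elimination with pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]


def soft_threshold(v, k):
    """Proximal operator of k * |v|."""
    return math.copysign(max(abs(v) - k, 0.0), v)


def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    n = len(A[0])
    # The same matrix A^T A + rho * I is used at every iteration.
    lhs = [[sum(row[i] * row[j] for row in A) + (rho if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    Atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(n)]
    z = [0.0] * n
    u = [0.0] * n
    for _ in range(n_iter):
        # x-update: a linear system (re-solved here; a real solver would
        # factor lhs once and reuse the factorization).
        rhs = [Atb[i] + rho * (z[i] - u[i]) for i in range(n)]
        x = solve(lhs, rhs)
        # z-update: elementwise soft thresholding.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # u-update: running sum of the constraint residual x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z
```

The only expensive step is the linear solve in the x-update. The matrix A^T A and the vector A^T b are sums over data rows, so they can be accumulated across partitions of the data, which is one way such a scheme lends itself to distributed computation.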

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there!