While reading up on some RandNLA papers, I stumbled upon the Randomized Numerical Linear Algebra for Large Scale Data Analysis page, which has been added to the collection of Highly Technical Reference Pages. Here is what the main page says:

The Sketching Linear Algebra Kernel is a library for matrix computations suitable for general statistical data analysis and optimization applications. Many tasks in machine learning and statistics ultimately end up being problems involving matrices: whether you're matching lenders and loans in the microfinance space, or finding the key players in the bitcoin market, or inferring where tweets came from, you'll want to have a toolkit for low-rank matrix approximation, least-squares and robust regression, eigenvector analysis, CUR and non-negative matrix factorizations, and other matrix computations.

Sketching is a way to compress matrices that preserves key matrix properties; it can be used to speed up many matrix computations. Sketching takes a given matrix A and produces a sketch matrix B that has fewer rows and/or columns than A. For a good sketch B, if we solve a problem with input B, the solution will also be pretty good for input A. For some problems, sketches can also be used to get faster ways to find high-precision solutions to the original problem. In other cases, sketches can be used to summarize the data by identifying the most important rows or columns.

A simple example of sketching is just sampling the rows (and/or columns) of the matrix, where each row (and/or column) is equally likely to be sampled. This uniform sampling is quick and easy, but doesn't always yield good sketches; however, there are sophisticated sampling methods that do yield good sketches.

The goal of this project is to build a sketching-based open-source software stack for NLA and its applications, as shown:

- Applications: Matrix Completion, Nonlinear RLS, SVM, PCA, Robust Regression, and other applications
- Python: Python-based data analytics scripting layer
- PythonBinding: C++ to Python bindings
- NLA: Numerical Linear Algebra primitives (least-squares regression, low-rank approximation, randomized estimators)
- Sketch: Sketching kernels (JL, FJL, Gaussian, Sign, Sparse Embedding)
- Third-Party Libraries: MPI, Elemental, BLAS, CombBLAS, FFTW, Boost
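To make the sketching idea above concrete, here is a minimal NumPy illustration (plain NumPy, not the library's own API) of sketched least squares with a Gaussian sketch, one of the kernels listed in the Sketch layer: compressing a tall problem from n rows down to k rows before solving.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall least-squares problem: minimize ||A x - b|| with n >> d.
n, d, k = 20000, 20, 400          # k = sketch size, with d < k << n
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Gaussian sketch: S has k rows, so S @ A has k rows instead of n.
S = rng.standard_normal((k, n)) / np.sqrt(k)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

# Exact solution for comparison.
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

rel_err = np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact)
print(rel_err)  # small: the sketched solution is close to the exact one
```

The sketched solve works on a 400 x 20 system instead of a 20000 x 20 one; the trade-off is a modest loss of accuracy controlled by the sketch size k.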

Here are the recent papers from this RandNLA effort:

#### Random Laplace Feature Maps for Semigroup Kernels on Histograms

Jiyan Yang, Vikas Sindhwani, Quanfu Fan, Haim Avron, Michael Mahoney, *IEEE Conference on Computer Vision and Pattern Recognition (CVPR)*, 2014

#### Efficient Dimensionality Reduction for Canonical Correlation Analysis

Haim Avron, Christos Boutsidis, Sivan Toledo, Anastasios Zouzias, *SIAM Journal on Scientific Computing, to appear*, 2014

Preliminary version appeared in the Proceedings of the 30th International Conference on Machine Learning (ICML), 2013

#### Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels

Jiyan Yang*, Vikas Sindhwani*, Haim Avron*, Michael Mahoney, *Proceedings of the 31st International Conference on Machine Learning (ICML)*, 2014

(*) Equal contributors.

#### Kernel Methods Match Deep Neural Networks on TIMIT

Po-Sen Huang, Haim Avron, Tara Sainath, Vikas Sindhwani, Bhuvana Ramabhadran, *IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)*, 2014

Best Student Paper Award

#### Optimal CUR Matrix Decompositions

Christos Boutsidis, David Woodruff, *ACM Symposium on Theory of Computing (STOC)*, 2014


#### Faster SVD-truncated Regularized Least-squares

Christos Boutsidis, Malik Magdon-Ismail, *Technical Report*, 2014

#### A Note on Sparse Least-squares Regression

C. Boutsidis and M. Magdon-Ismail, *Information Processing Letters, to appear*, 2014

#### Approximate Spectral Clustering via Randomized Sketching

A. Gittens, A. Kambadur, C. Boutsidis, *Technical Report, updated Feb 15*, 2014

#### Revisiting Asynchronous Linear Solvers: Provable Convergence Rate Through Randomization

Haim Avron, Alex Druinsky, Anshul Gupta, *Proceedings of the 28th IEEE International Parallel & Distributed Processing Symposium (IPDPS)*, 2014

#### Random Projections for Linear Support Vector Machines

S. Paul, C. Boutsidis, M. Magdon-Ismail, P. Drineas, *ACM Transactions on Knowledge Discovery from Data, to appear*, 2014

Also of interest is *libSkylark: Sketching based Matrix computations for Machine Learning* on GitHub. From the page:

The Sketching based Matrix computations for Machine Learning is a library for matrix computations suitable for general statistical data analysis and optimization applications. Many tasks in machine learning and statistics ultimately end up being problems involving matrices: whether you're finding the key players in the bitcoin market, or inferring where tweets came from, or figuring out what's in sewage, you'll want to have a toolkit for least-squares and robust regression, eigenvector analysis, non-negative matrix factorization, and other matrix computations.

Sketching is a way to compress matrices that preserves key matrix properties; it can be used to speed up many matrix computations. Sketching takes a given matrix A and produces a sketch matrix B that has fewer rows and/or columns than A. For a good sketch B, if we solve a problem with input B, the solution will also be pretty good for input A. For some problems, sketches can also be used to get faster ways to find high-precision solutions to the original problem. In other cases, sketches can be used to summarize the data by identifying the most important rows or columns.

A simple example of sketching is just sampling the rows (and/or columns) of the matrix, where each row (and/or column) is equally likely to be sampled. This uniform sampling is quick and easy, but doesn't always yield good sketches; however, there are sophisticated sampling methods that do yield good sketches.

The goal of this project is to build a sketching-based open-source software stack for NLA and its applications, as shown:

- Applications: Matrix Completion, Nonlinear RLS, SVM, PCA, Robust Regression, and other applications
- Python: Python-based data analytics scripting layer
- PythonBinding: C++ to Python bindings
- NLA: Numerical Linear Algebra primitives (least-squares regression, low-rank approximation, randomized estimators)
- Sketch: Sketching kernels (JL, FJL, Gaussian, Sign, Sparse Embedding)
- Third-Party Libraries: MPI, Elemental, BLAS, CombBLAS, FFTW, Boost
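The Sketch layer's Sparse Embedding transform is notable because it can be applied in time proportional to the number of nonzeros of A. As a rough illustration (plain NumPy, not libSkylark's API), a CountSketch-style sparse embedding adds each row of A, with a random sign, into one of k buckets:

```python
import numpy as np

rng = np.random.default_rng(1)

n, d, k = 10000, 10, 300
A = rng.standard_normal((n, d))

# Sparse embedding (CountSketch): each of the n rows of A is added,
# with a random sign, into one of k buckets. Equivalent to S @ A where
# S is k x n with exactly one nonzero (+1 or -1) per column.
buckets = rng.integers(0, k, size=n)
signs = rng.choice([-1.0, 1.0], size=n)
SA = np.zeros((k, d))
np.add.at(SA, buckets, signs[:, None] * A)

# The sketch approximately preserves the column norms of A.
ratio = np.linalg.norm(SA, axis=0) / np.linalg.norm(A, axis=0)
print(ratio)  # entries should be close to 1
```

Because only one bucket is touched per row, the sketch costs one pass over the data, versus the dense matrix multiply a Gaussian sketch requires.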

Image: taken by Navcam: Left B (NAV_LEFT_B) onboard NASA's Mars rover Curiosity on Sol 679 (2014-07-04 20:21:29 UTC).

**Join the CompressiveSensing subreddit or the Google+ Community and post there!**
