While reading up on some RandNLA papers, I stumbled upon the Randomized Numerical Linear Algebra for Large Scale Data Analysis page, which has been added to the collection of Highly Technical Reference Pages. Here is what the main page says:
The Sketching Linear Algebra Kernel is a library for matrix computations suitable for general statistical data analysis and optimization applications.

Many tasks in machine learning and statistics ultimately end up being problems involving matrices: whether you're matching lenders and loans in the microfinance space, or finding the key players in the bitcoin market, or inferring where tweets came from, you'll want to have a toolkit for low-rank matrix approximation, least-squares and robust regression, eigenvector analysis, CUR and non-negative matrix factorizations, and other matrix computations.

Sketching is a way to compress matrices that preserves key matrix properties; it can be used to speed up many matrix computations. Sketching takes a given matrix A and produces a sketch matrix B that has fewer rows and/or columns than A. For a good sketch B, if we solve a problem with input B, the solution will also be pretty good for input A. For some problems, sketches can also be used to get faster ways to find high-precision solutions to the original problem. In other cases, sketches can be used to summarize the data by identifying the most important rows or columns.

A simple example of sketching is just sampling the rows (and/or columns) of the matrix, where each row (and/or column) is equally likely to be sampled. This uniform sampling is quick and easy, but doesn't always yield good sketches; however, there are sophisticated sampling methods that do yield good sketches.

The goal of this project is to build a sketching-based open-source software stack for NLA and its applications, as shown:
Applications: Matrix Completion, Nonlinear RLS, SVM, PCA, Robust Regression, other applications
Python: Python-based data analytics scripting layer
PythonBinding: C++ to Python bindings
NLA: Numerical Linear Algebra primitives (least-squares regression, low-rank approximation, randomized estimators)
Sketch: Sketching kernels (JL, FJL, Gaussian, Sign, Sparse Embedding)
Third-Party Libraries: MPI, Elemental, BLAS, CombBLAS, FFTW, Boost
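To make the sketch-and-solve idea above concrete, here is a toy illustration of least-squares regression with a Gaussian sketch, one of the kernels listed in the Sketch layer. This is plain NumPy with illustrative sizes, not libSkylark's actual API:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 10000, 20, 400          # tall problem; sketch size k is much larger than d, much smaller than n

A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Gaussian sketching operator: B = S @ A has only k rows instead of n.
S = rng.standard_normal((k, n)) / np.sqrt(k)
B, c = S @ A, S @ b

x_exact = np.linalg.lstsq(A, b, rcond=None)[0]   # solve the full problem
x_sketch = np.linalg.lstsq(B, c, rcond=None)[0]  # solve the much smaller sketched problem

# The sketched solution should be close to the exact one.
print(np.linalg.norm(x_sketch - x_exact))
```

The point of the sketch is that the small k-by-d solve is far cheaper than the n-by-d one, while the solution stays "pretty good for input A" in the sense described above.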
Here are the recent papers from this RandNLA effort:
Random Laplace Feature Maps for Semigroup Kernels on Histograms
Jiyan Yang, Vikas Sindhwani, Quanfu Fan, Haim Avron, Michael Mahoney
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
Efficient Dimensionality Reduction for Canonical Correlation Analysis
Haim Avron, Christos Boutsidis, Sivan Toledo, Anastasios Zouzias
SIAM Journal on Scientific Computing, to appear, 2014
Preliminary version appeared in the Proceedings of the 30th International Conference on Machine Learning (ICML), 2013
Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels
Jiyan Yang*, Vikas Sindhwani*, Haim Avron*, Michael Mahoney
Proceedings of the 31st International Conference on Machine Learning (ICML), 2014
(*) Equal contributors.
Kernel Methods Match Deep Neural Networks on TIMIT
Po-Sen Huang, Haim Avron, Tara Sainath, Vikas Sindhwani, Bhuvana Ramabhadran
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014
Best Student Paper Award
Optimal CUR Matrix Decompositions
Christos Boutsidis, David Woodruff
ACM Symposium on Theory of Computing (STOC), 2014
Faster SVD-truncated Regularized Least-squares
Christos Boutsidis, Malik Magdon-Ismail
Technical Report, 2014
A Note on Sparse Least-squares Regression
C. Boutsidis and M. Magdon-Ismail
Information Processing Letters, to appear, 2014
Approximate Spectral Clustering via Randomized Sketching
A. Gittens, A. Kambadur, C. Boutsidis
Technical Report, updated Feb 15, 2014
Revisiting Asynchronous Linear Solvers: Provable Convergence Rate Through Randomization
Haim Avron, Alex Druinsky, Anshul Gupta
Proceedings of the 28th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2014
Random Projections for Linear Support Vector Machines
S. Paul, C. Boutsidis, M. Magdon-Ismail, P. Drineas
ACM Transactions on Knowledge Discovery from Data, to appear, 2014
Also of interest is libSkylark: Sketching based Matrix computations for Machine Learning on GitHub. From the page:
Introduction

The Sketching based Matrix computations for Machine Learning is a library for matrix computations suitable for general statistical data analysis and optimization applications.

Many tasks in machine learning and statistics ultimately end up being problems involving matrices: whether you're finding the key players in the bitcoin market, or inferring where tweets came from, or figuring out what's in sewage, you'll want to have a toolkit for least-squares and robust regression, eigenvector analysis, non-negative matrix factorization, and other matrix computations.

Sketching is a way to compress matrices that preserves key matrix properties; it can be used to speed up many matrix computations. Sketching takes a given matrix A and produces a sketch matrix B that has fewer rows and/or columns than A. For a good sketch B, if we solve a problem with input B, the solution will also be pretty good for input A. For some problems, sketches can also be used to get faster ways to find high-precision solutions to the original problem. In other cases, sketches can be used to summarize the data by identifying the most important rows or columns.

A simple example of sketching is just sampling the rows (and/or columns) of the matrix, where each row (and/or column) is equally likely to be sampled. This uniform sampling is quick and easy, but doesn't always yield good sketches; however, there are sophisticated sampling methods that do yield good sketches.

The goal of this project is to build a sketching-based open-source software stack for NLA and its applications, as shown:
Applications: Matrix Completion, Nonlinear RLS, SVM, PCA, Robust Regression, other applications
Python: Python-based data analytics scripting layer
PythonBinding: C++ to Python bindings
NLA: Numerical Linear Algebra primitives (least-squares regression, low-rank approximation, randomized estimators)
Sketch: Sketching kernels (JL, FJL, Gaussian, Sign, Sparse Embedding)
Third-Party Libraries: MPI, Elemental, BLAS, CombBLAS, FFTW, Boost
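The "Sparse Embedding" transform named in the Sketch layer above (also known as CountSketch) can be illustrated in a few lines of NumPy. This is a hedged, self-contained sketch of the idea with made-up sizes, not libSkylark's interface:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 5000, 10, 200

A = rng.standard_normal((n, d))

# Each row of A is hashed into one of k buckets and multiplied by a random sign.
# Applying the transform costs O(nnz(A)); no dense k-by-n sketching matrix is formed.
buckets = rng.integers(0, k, size=n)
signs = rng.choice([-1.0, 1.0], size=n)

B = np.zeros((k, d))
np.add.at(B, buckets, signs[:, None] * A)  # accumulate signed rows into their buckets

# Column norms are preserved in expectation: E ||B e_j||^2 = ||A e_j||^2.
print(np.linalg.norm(A, axis=0))
print(np.linalg.norm(B, axis=0))
```

The attraction of the sparse embedding over a dense Gaussian sketch is exactly this input-sparsity cost: the sketch is computed in one pass over the nonzeros of A.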
This image was taken by Navcam: Left B (NAV_LEFT_B) onboard NASA's Mars rover Curiosity on Sol 679 (2014-07-04 20:21:29 UTC).
Join the CompressiveSensing subreddit or the Google+ Community and post there!
Liked this entry? Subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle, and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on LinkedIn.