Here is the new Random Features roundup:
Incremental Semiparametric Inverse Dynamics Learning by
Raffaello Camoriano,
Silvio Traversaro,
Lorenzo Rosasco,
Giorgio Metta,
Francesco Nori
This paper presents a novel approach for incremental semiparametric inverse
dynamics learning. In particular, we consider a mixture of two approaches:
parametric modeling based on rigid body dynamics equations and nonparametric
modeling based on incremental kernel methods, with no prior information on the
mechanical properties of the system. This yields an incremental
semiparametric approach that leverages the advantages of both the parametric and
nonparametric models. We validate the proposed technique by learning the dynamics
of one arm of the iCub humanoid robot.
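The parametric-plus-nonparametric combination can be illustrated with a toy batch sketch (the paper's method is incremental, and the function and variable names here are illustrative, not from the paper): fit the rigid-body-dynamics regressor by least squares, then model the residual with kernel ridge regression.

```python
import numpy as np

def fit_semiparametric(Phi, y, K, lam=1e-3):
    """Semiparametric fit: a parametric least-squares model on the
    rigid-body-dynamics regressor Phi, plus a nonparametric kernel
    ridge model (Gram matrix K) on the residuals."""
    # Parametric part: estimate the inertial parameters pi.
    pi, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    residual = y - Phi @ pi
    # Nonparametric part: kernel ridge regression captures whatever
    # the rigid-body model cannot explain (friction, flexibility, ...).
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), residual)
    return pi, alpha
```

At prediction time the two parts are summed: the rigid-body term plus the kernel expansion over the training points.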
Persistence weighted Gaussian kernel for topological data analysis by
Genki Kusano,
Kenji Fukumizu,
Yasuaki Hiraoka
Topological data analysis is an emerging mathematical concept for
characterizing shapes in multi-scale data. In this field, persistence diagrams
are widely used as a descriptor of the input data, and can distinguish robust
and noisy topological properties. A statistical framework on persistence
diagrams is therefore highly desirable for dealing with practical data. This
paper proposes a kernel method on persistence diagrams. A theoretical
contribution of our method is that the proposed kernel allows one to control
the effect of persistence, and, if necessary, noisy topological properties can
be discounted in data analysis. Furthermore, the method provides a fast
approximation technique. The method is applied to several problems, including
practical data from physics, and the results show its advantage over the
existing kernel method on persistence diagrams.
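The core idea, weighting each diagram point by its persistence so that near-diagonal (likely noisy) points are discounted, can be sketched as follows; the arctan weight matches the form described for this kernel, while the function and parameter names are illustrative:

```python
import numpy as np

def persistence(pt):
    # Persistence of a diagram point (birth, death).
    birth, death = pt
    return death - birth

def pwg_kernel(D1, D2, sigma=1.0, C=1.0, p=1.0):
    """Persistence weighted Gaussian kernel between two persistence
    diagrams, each a list/array of (birth, death) points.

    Each point is weighted by arctan(C * persistence^p), so points
    close to the diagonal contribute little to the kernel value."""
    D1, D2 = np.asarray(D1, float), np.asarray(D2, float)
    k = 0.0
    for x in D1:
        wx = np.arctan(C * persistence(x) ** p)
        for y in D2:
            wy = np.arctan(C * persistence(y) ** p)
            k += wx * wy * np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))
    return k
```

A diagram whose points all lie near the diagonal thus has a much smaller self-similarity than one with a long-lived topological feature.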
Learning to Filter with Predictive State Inference Machines by
Wen Sun,
Arun Venkatraman,
Byron Boots,
J. Andrew Bagnell
Latent state space models are one of the most fundamental and widely used
tools for modeling dynamical systems. Traditional Maximum Likelihood Estimation
(MLE) based approaches aim to maximize the likelihood objective, which is
non-convex due to latent states. While non-convex optimization methods like EM
can learn models that locally optimize the likelihood objective, using the
locally optimal model for an inference task such as Bayesian filtering usually
does not have performance guarantees. In this work, we propose a method that
considers the inference procedure on the dynamical system as a composition of
predictors. Instead of optimizing a given parametrization of latent states, we
learn predictors for inference in predictive belief space, where we can use
sufficient features of observations for supervision of our learning algorithm.
We further show that our algorithm, the Predictive State Inference Machine, has
theoretical performance guarantees on the inference task. Empirical
verification across several dynamical system benchmarks, ranging from a
simulated helicopter to recorded telemetry traces from a robot, showcases the
benefits of training Inference Machines.
Randomized Kernel Approach for Named Entity Recognition in Tamil by
N. Abinaya,
M. Anand Kumar,
K. P. Soman
In this paper, we present a new approach for Named Entity
Recognition (NER) in the Tamil language using the Random Kitchen Sink
algorithm. Named Entity Recognition is the process of identifying Named
Entities (NEs) in text and classifying them into predefined categories
such as person, location, organization, etc. A considerable amount of
work has been done in the field of Named Entity Recognition for English
and Indian languages using various
machine learning approaches. In this work, we implement the NER system
for Tamil using the Random Kitchen Sink algorithm, which is a statistical and
supervised approach. The NER system is also implemented using Support
Vector Machine (SVM) and Conditional Random Field (CRF). The overall
performance of the NER system was evaluated as 86.61% for RKS, 81.62%
for SVM, and 87.21% for CRF. Additional results were obtained for SVM
and CRF by increasing the corpus size, with performance evaluated as
86.06% and 87.20%, respectively.
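The Random Kitchen Sink idea underlying the NER system, replacing an implicit RBF kernel with explicit random Fourier features so that a fast linear classifier can be trained, can be sketched as follows (a generic illustration, not the paper's code):

```python
import numpy as np

def random_fourier_features(X, n_features=100, gamma=1.0, seed=0):
    """Explicit feature map Z such that Z @ Z.T approximates the RBF
    kernel matrix K with K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies drawn from the Fourier transform of the RBF kernel.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# A linear model trained on Z then behaves like a kernel machine on K.
X = np.random.default_rng(1).normal(size=(5, 3))
Z = random_fourier_features(X, n_features=5000, gamma=0.5)
K_approx = Z @ Z.T
K_exact = np.exp(-0.5 * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
```

The approximation error shrinks as the number of random features grows, which is what makes the linear training step a practical stand-in for the kernelized one.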
Related
Learning the kernel matrix via predictive low-rank approximations by
Martin Stražar,
Tomaž Curk
Efficient and accurate low-rank approximations to multiple data sources are
essential in the era of big data. The scaling of kernel-based learning
algorithms to large datasets is limited by the quadratic complexity associated
with computing and storing the kernel matrix, which most recent multiple
kernel learning algorithms assume to be available. We propose a
method to learn simultaneous low-rank approximations of a set of base kernels
in regression tasks.
We present the Mklaren algorithm, which approximates multiple kernel matrices
with least angle regression in the low-dimensional feature space. The idea is
based entirely on geometric concepts and does not assume access to full
kernel matrices. The algorithm achieves linear complexity in the number of data
points as well as kernels, while it accounts for the correlations between
kernel matrices. When an explicit feature space representation is available
for the kernels, we use the relation between primal and dual regression
weights to aid model interpretation. Our approach outperforms contemporary kernel matrix
approximation approaches when learning with multiple kernels on standard
regression datasets, as well as improves selection of relevant kernels in
comparison to multiple kernel learning methods.
Spatial rich model (SRM) is a classic steganalysis method, which
collects high-order co-occurrences from truncated noise residuals as
features to capture the local-range dependencies of an image. Increasing
the truncation threshold and the co-occurrence order leads to a
higher-dimensional feature, which can exploit more statistical bins and
capture dependencies across a larger-range neighborhood, but this will
suffer from the curse of dimensionality. In this paper, we propose a
fast projection method to increase the statistical robustness of the
higher-dimensional SRM feature while decreasing its dimensionality. The
proposed projection method is applicable to co-occurrence-based
steganalysis features. The detection performance and the computational
complexity of the proposed method are investigated on three
content-adaptive steganographic algorithms in the spatial domain.
In supervised learning, feature vectors are often implicitly mapped to a high-dimensional space using the kernel trick, at quadratic cost for the learning algorithm. The recently proposed random Fourier features provide an explicit mapping, so that classical algorithms with often linear complexity can be applied. Yet the random Fourier feature approach still yields complex models that are difficult to interpret. Using Matrix Relevance Learning, a linear mapping of the data for better class separation can be learned by adapting a parametric Euclidean distance. Further, a low-rank representation of the input data can be obtained. We apply this technique to random-Fourier-feature-encoded data to obtain a discriminative mapping of the kernel space. This explicit approach is compared with a differentiable kernel vector quantizer on the same but implicit kernel representation. Using multiple benchmark problems, we demonstrate that a parametric distance on an RBF encoding yields better classification results and gives access to interpretable prediction models with visualization abilities.
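The parametric distance at the heart of Matrix Relevance Learning can be sketched as follows (illustrative names; the training loop that adapts Omega by gradient descent is omitted):

```python
import numpy as np

def relevance_distance(x, y, Omega):
    """Parametric squared distance d(x, y) = (x - y)^T Lambda (x - y)
    with Lambda = Omega^T Omega, positive semi-definite by construction.
    A rectangular Omega of shape (r, d) with r < d yields a low-rank
    discriminative mapping x -> Omega @ x that can be visualized."""
    diff = Omega @ (x - y)
    return float(diff @ diff)
```

With Omega equal to the identity this reduces to the ordinary squared Euclidean distance; training adapts Omega so that distances in the mapped space separate the classes, and a rank-2 Omega gives a directly plottable projection.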
Fast and Accurate Refined Nyström based Kernel SVM by Zhe Li, Tianbao Yang, Lijun Zhang and Rong Jin
In this paper, we focus on improving the performance of the Nyström based kernel SVM. Although the Nyström approximation has been studied extensively and its application to kernel classification has been exhibited in several studies, there still exists a potentially large gap between the performance of a classifier learned with the Nyström approximation and that learned with the original kernel. In this work, we make novel contributions to bridge the gap without increasing the training cost too much, by proposing a refined Nyström based kernel classifier. We adopt a two-step approach: in the first step we learn a sufficiently good dual solution, and in the second step we use the obtained dual solution to construct a new set of bases for the Nyström approximation and re-train a refined classifier. Our approach to learning a good dual solution is based on a sparse-regularized dual formulation with the Nyström approximation, which can be solved with the same time complexity as solving the standard formulation. We justify our approach by establishing a theoretical guarantee on the error of the learned dual solution in the first step with respect to the optimal dual solution under appropriate conditions. The experimental results demonstrate that (i) the dual solution obtained by our approach in the first step is closer to the optimal solution and yields improved prediction performance; and (ii) the second step, using the obtained dual solution to re-train the model, further improves the performance.
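For reference, the standard (unrefined) Nyström feature map that the paper refines can be sketched as follows; the bases are a set of landmark points, and the paper's refinement step replaces these landmarks using the learned dual solution (illustrative code, not the authors'):

```python
import numpy as np

def nystrom_features(X, landmarks, gamma=1.0):
    """Nystrom feature map for the RBF kernel: with C = k(X, L) and
    W = k(L, L) over landmarks L, returns Z such that
    Z @ Z.T = C @ pinv(W) @ C.T approximates the full kernel matrix."""
    def rbf(A, B):
        d2 = np.sum((A[:, None] - B[None]) ** 2, axis=-1)
        return np.exp(-gamma * d2)
    W = rbf(landmarks, landmarks)
    C = rbf(X, landmarks)
    vals, vecs = np.linalg.eigh(W)      # small m x m eigenproblem
    keep = vals > 1e-10                 # drop numerically null directions
    return (C @ vecs[:, keep]) / np.sqrt(vals[keep])
```

A linear SVM trained on Z then approximates the kernel SVM; the approximation is exact when the landmarks include all training points, and the gap the paper addresses grows as the landmark set shrinks.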
ADMM based scalable machine learning on Spark by
S. Dhar,
Congrui Yi,
N. Ramakrishnan,
M. Shah
Most machine learning algorithms involve solving a convex optimization problem. Traditional in-memory convex optimization solvers do not scale well as data grows. This paper identifies a generic convex problem underlying most machine learning algorithms and solves it using the Alternating Direction Method of Multipliers (ADMM). Such an ADMM problem ultimately reduces to an iterative system of linear equations, which can easily be solved at scale in a distributed fashion. We implement this framework in Apache Spark and compare it with the widely used Machine Learning LIBrary (MLLIB) in Apache Spark 1.3.
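The reduction of an ADMM iteration to repeated linear solves can be illustrated with the classic lasso splitting (a generic textbook sketch, not the paper's Spark implementation):

```python
import numpy as np

def lasso_admm(A, b, lam=0.1, rho=1.0, n_iter=500):
    """Solve min_x 0.5*||Ax - b||^2 + lam*||x||_1 via ADMM.
    The x-update is a linear system whose matrix is fixed across
    iterations (and hence distributable/factorizable once); the
    z-update is an elementwise soft-threshold; u is the scaled dual."""
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    M = AtA + rho * np.eye(n)           # same matrix every iteration
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    for _ in range(n_iter):
        x = np.linalg.solve(M, Atb + rho * (z - u))
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)
        u = u + x - z
    return z
```

Because only the right-hand side of the x-update changes between iterations, the expensive linear-algebra work can be factored once and reused, which is the property that makes the distributed implementation practical.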
Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there!
Liked this entry? Subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle, and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on LinkedIn.