Sunday, July 31, 2016

Nuit Blanche in Review (July 2016)

Much happened this past month since the last Nuit Blanche in Review (June 2016). We submitted a proposal for a two-day workshop at NIPS on "Mapping Machine Learning to Hardware" (the list of proposed speakers is included in the post), we featured two proofs of the same result from two different groups (a third proof being on a different problem altogether), one implementation, a survey and a video on randomized methods, some exciting results on random projections on manifolds and in practice with deep learning, a continuing set of posts showing the Great Convergence in action, a few application-centered posts and much more... Enjoy. In the meantime, the image above shows the current state of reactor 2 at Fukushima Daiichi. That image uses muon tomography and a reconstruction algorithm we mentioned back in Imaging Damaged Reactors and Volcanoes. Without further ado.

Implementation
Proofs


Thesis:
Survey:
Hardware and related
Slides:
In-Depth
Random Projections
Random Features
Connection between compressive sensing and Deep Learning
AMP / Frank-Wolfe / Hashing / Deep Learning
Application centered

Videos:
Jobs:
LightOn:

Photo credit: TEPCO
Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Friday, July 29, 2016

Stochastic Frank-Wolfe Methods for Nonconvex Optimization

All previous blog entries around Frank-Wolfe can be found under the Frank-Wolfe tag.



We study Frank-Wolfe methods for nonconvex stochastic and finite-sum optimization problems. Frank-Wolfe methods (in the convex case) have gained tremendous recent interest in machine learning and optimization communities due to their projection-free property and their ability to exploit structured constraints. However, our understanding of these algorithms in the nonconvex setting is fairly limited. In this paper, we propose nonconvex stochastic Frank-Wolfe methods and analyze their convergence properties. For objective functions that decompose into a finite-sum, we leverage ideas from variance reduction techniques for convex optimization to obtain new variance reduced nonconvex Frank-Wolfe methods that have provably faster convergence than the classical Frank-Wolfe method. Finally, we show that the faster convergence rates of our variance reduced methods also translate into improved convergence rates for the stochastic setting.
h/t Atlas Wang 
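For readers who have not played with these solvers, here is a minimal sketch (mine, not the authors' code) of the classical Frank-Wolfe iteration on an ℓ1 ball: the point is that each step only calls a linear minimization oracle, never a projection. The objective, the `radius` parameter and the toy quadratic below are all illustrative assumptions; the paper's stochastic and variance-reduced variants replace the exact gradient with mini-batch estimates.

```python
import numpy as np

def frank_wolfe_l1(grad, x0, radius=1.0, iters=100):
    """Classical Frank-Wolfe over the l1 ball of a given radius.

    grad: callable returning the gradient of the objective at x.
    The linear minimization oracle (LMO) over the l1 ball is a signed,
    scaled basis vector, so no projection is ever needed.
    """
    x = x0.copy()
    for t in range(iters):
        g = grad(x)
        # LMO: argmin over {||s||_1 <= radius} of <g, s>
        i = np.argmax(np.abs(g))
        s = np.zeros_like(x)
        s[i] = -radius * np.sign(g[i])
        gamma = 2.0 / (t + 2.0)           # standard step size
        x = (1 - gamma) * x + gamma * s   # convex combination stays feasible
    return x

# toy quadratic: f(x) = 0.5 * ||Ax - b||^2
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
b = rng.standard_normal(20)
x_hat = frank_wolfe_l1(lambda x: A.T @ (A @ x - b), np.zeros(50), radius=2.0)
print("residual:", np.linalg.norm(A @ x_hat - b))
```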



Thursday, July 28, 2016

The non-convex Burer-Monteiro approach works on smooth semidefinite programs / On the low-rank approach for semidefinite programs arising in synchronization and community detection



Vlad just sent me the following:

Hi Igor,

You may find of interest my new paper (joint work with Nicolas Boumal and Afonso Bandeira) on theoretical guarantees for the Burer-Monteiro approach for smooth SDPs. We are able to show the following: consider a fixed set of m linear constraints for an SDP with compact domain, and the corresponding factorized BM formulation with rank constrained to be at most about sqrt(m). If the rank-constrained problem has a domain which is a smooth manifold, then, generically in the cost matrix C of the SDP, this BM rank-constrained problem is non-convex in only a benign way, in that second-order critical points are global minimizers. We also establish quantitative bounds on the quality of a candidate solution as a function of approximate second-order optimality, and provide examples of applications and a numerical study.

http://arxiv.org/pdf/1606.04970.pdf


There is also a related paper by the same authors in which we analyze regimes of planted MaxCut and community detection. In these specific settings we provide much stronger guarantees, in that the BM factorization has no spurious local optima even at rank 2.


http://arxiv.org/pdf/1602.04426v2.pdf


Best,

-Vlad



Vladislav Voroninski
http://www.mit.edu/~vvlad/
Thank you, Vlad!



Semidefinite programs (SDP's) can be solved in polynomial time by interior point methods, but scalability can be an issue. To address this shortcoming, over a decade ago, Burer and Monteiro proposed to solve SDP's with few equality constraints via rank-restricted, non-convex surrogates. Remarkably, for some applications, local optimization methods seem to converge to global optima of these non-convex surrogates reliably. Although some theory supports this empirical success, a complete explanation of it remains an open question. In this paper, we consider a class of SDP's which includes applications such as max-cut, community detection in the stochastic block model, robust PCA, phase retrieval and synchronization of rotations. We show that the low-rank Burer-Monteiro formulation of SDP's in that class almost never has any spurious local optima.


To address difficult optimization problems, convex relaxations based on semidefinite programming are now common place in many fields. Although solvable in polynomial time, large semidefinite programs tend to be computationally challenging. Over a decade ago, exploiting the fact that in many applications of interest the desired solutions are low rank, Burer and Monteiro proposed a heuristic to solve such semidefinite programs by restricting the search space to low-rank matrices. The accompanying theory does not explain the extent of the empirical success. We focus on Synchronization and Community Detection problems and provide theoretical guarantees shedding light on the remarkable efficiency of this heuristic.
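To make the Burer-Monteiro idea concrete, here is a small sketch (mine, not the authors') for the Max-Cut case: replace the psd matrix X by YY^T with unit-norm rows, run plain gradient descent with a renormalization retraction on the resulting product of spheres, and round. The graph, step size and rank p below are illustrative assumptions; the papers above explain why, generically, second-order critical points are global once the rank grows like the square root of the number of constraints.

```python
import numpy as np

def maxcut_bm(W, p=5, iters=2000, lr=0.01, seed=0):
    """Burer-Monteiro heuristic for the Max-Cut SDP relaxation
    min Tr(W X) s.t. diag(X) = 1, X psd, via X = Y Y^T with
    unit-norm rows of Y (a product of spheres, a smooth manifold)."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    Y = rng.standard_normal((n, p))
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)
    for _ in range(iters):
        G = 2 * W @ Y                                  # gradient of Tr(W Y Y^T)
        Y -= lr * G                                    # Euclidean descent step
        Y /= np.linalg.norm(Y, axis=1, keepdims=True)  # retract to the spheres
    return Y

def round_cut(Y, seed=0):
    """Goemans-Williamson style hyperplane rounding of the rows."""
    r = np.random.default_rng(seed).standard_normal(Y.shape[1])
    return np.sign(Y @ r)

# toy graph: symmetric 0/1 adjacency matrix
rng = np.random.default_rng(1)
A = (rng.random((40, 40)) < 0.2).astype(float)
W = np.triu(A, 1)
W = W + W.T
x = round_cut(maxcut_bm(W))
print("cut value:", 0.25 * np.sum(W * (1 - np.outer(x, x))))
```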


Wednesday, July 27, 2016

Streaming algorithms for identification of pathogens and antibiotic resistance potential from real-time MinION (TM) sequencing

About four years ago, I tried to predict the future for August 25, 2030. In order to do this, I first mentioned The Steamrollers, i.e. technologies that are exponential in nature. Nanopore sequencing was one of them. In the second installment, I mentioned different algorithms that could help in making sense of the data generated by these steamrollers (Predicting the Future: Randomness and Parsimony). Streaming was one of them. It is really no surprise: if, as in hyperspectral imaging or nanopore sequencing, you are producing a lot of data, your interest switches from the modeling aspect of things to how the data can be helpful now, and how fast. How does it change science? Well, you just need to read the following article:

The main contribution of this article is to demonstrate that despite the higher error rate, it is possible to return clinically actionable information, including species and strain identification, from as few as 500 reads. We achieved this by developing novel approaches that are less sensitive to base-calling errors and which use whatever subset of genome-wide information is observed up to a point in time, rather than a panel of pre-defined markers or genes. For example, the strain typing presence/absence approach relies only on being able to identify homology to genes and also allows for a level of incorrect gene annotation.



Streaming algorithms for identification of pathogens and antibiotic resistance potential from real-time MinION(TM) sequencing by Minh Duc Cao, Devika Ganesamoorthy, Alysha G. Elliott, Huihui Zhang, Matthew A. Cooper and Lachlan J.M. Coin
The recently introduced Oxford Nanopore MinION platform generates DNA sequence data in real-time. This has great potential to shorten the sample-to-results time and is likely to have benefits such as rapid diagnosis of bacterial infection and identification of drug resistance. However, there are few tools available for streaming analysis of real-time sequencing data. Here, we present a framework for streaming analysis of MinION real-time sequence data, together with probabilistic streaming algorithms for species typing, strain typing and antibiotic resistance profile identification. Using four culture isolate samples, as well as a mixed-species sample, we demonstrate that bacterial species and strain information can be obtained within 30 min of sequencing and using about 500 reads, initial drug-resistance profiles within two hours, and complete resistance profiles within 10 h. While strain identification with multi-locus sequence typing required more than 15x coverage to generate confident assignments, our novel gene-presence typing could detect the presence of a known strain with 0.5x coverage. We also show that our pipeline can process over 100 times more data than the current throughput of the MinION on a desktop computer.
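To give a flavor of the streaming, presence/absence idea, here is my own toy sketch (not the authors' pipeline; the gene-to-strain table is made up, and an upstream aligner is assumed to map each read to a gene identifier or None): evidence is updated read by read, and a call can be emitted at any point from whatever subset of genes has been seen so far.

```python
# Hypothetical reference table (made up for illustration):
# gene content of each known strain.
STRAIN_GENES = {
    "strain_A": {"g1", "g2", "g5", "g7"},
    "strain_B": {"g1", "g3", "g4", "g7"},
    "strain_C": {"g2", "g4", "g6", "g8"},
}

def stream_strain_typing(read_gene_hits):
    """Presence/absence strain typing, streaming style: update the set
    of observed genes read by read and emit the current best call from
    whatever subset of genome-wide information has been seen so far."""
    observed = set()
    for hit in read_gene_hits:       # one gene id (or None) per read,
        if hit is not None:          # produced by an upstream aligner
            observed.add(hit)
        scores = {s: len(observed & g) / len(observed | g)   # Jaccard score
                  for s, g in STRAIN_GENES.items()}
        yield max(scores, key=scores.get), scores

# usage: a stream of reads mostly hitting strain_A's genes
for call, scores in stream_strain_typing(["g1", None, "g5", "g7", "g2"]):
    print(call, {s: round(v, 2) for s, v in scores.items()})
```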







gvnn: Neural Network Library for Geometric Computer Vision

I learned about spatial transformers at a Deep Learning meetup in Paris from a presentation by Alban Desmaison (yes, I sometimes do attend other awesome meetups). Spatial transformers aim at "boosting the geometric invariance of CNNs": they use transformations from computer vision and camera models to help deep neural networks learn better or faster, by providing models behind generic invariances such as rotation, translation, etc. It is a little unsettling, since most defenders of deep neural architectures would rather add data than get help from vision models :-) Anyway, let's see this as another instance of the Great Convergence, where Machine Learning, computer graphics, the old computer vision paradigm and signal processing meet. Today, we have a library written in Torch that implements several such transformations for inclusion in neural networks.

Let me just add that while these transformers do help neural networks by reducing the number of network coefficients, the remaining coefficients probably summarize many of the geometric invariances that we collectively have not yet discovered (a sense I got from a presentation by Stephane Mallat a while back, when he mentioned the connection between deep neural networks and his scattering networks).
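For the curious, here is a minimal numpy sketch in the spirit of the original spatial transformer: grid generation from an affine matrix, followed by bilinear sampling. Everything here (the image, the 2x3 parameterization, the clipping choices) is my own illustration, not gvnn code; the point is that both steps are simple enough to differentiate, which is what lets such a layer sit inside a network and be trained end to end.

```python
import numpy as np

def affine_warp(img, theta):
    """Spatial-transformer style warp: build a sampling grid from a
    2x3 affine matrix theta, then bilinearly sample the input image."""
    H, W = img.shape
    # normalized target coordinates in [-1, 1]
    ys, xs = np.meshgrid(np.linspace(-1, 1, H), np.linspace(-1, 1, W),
                         indexing="ij")
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)])  # 3 x HW
    sx, sy = theta @ grid                                      # source coords
    # back to pixel coordinates
    sx = (sx + 1) * (W - 1) / 2
    sy = (sy + 1) * (H - 1) / 2
    x0 = np.clip(np.floor(sx).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, H - 2)
    wx, wy = sx - x0, sy - y0
    out = (img[y0, x0] * (1 - wx) * (1 - wy)
           + img[y0, x0 + 1] * wx * (1 - wy)
           + img[y0 + 1, x0] * (1 - wx) * wy
           + img[y0 + 1, x0 + 1] * wx * wy)
    return out.reshape(H, W)

# sanity check: the identity transform leaves the image unchanged
img = np.arange(16.0).reshape(4, 4)
theta = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(np.allclose(affine_warp(img, theta), img))
```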

Unrelated but related to the transformers endeavor: 





gvnn: Neural Network Library for Geometric Computer Vision by Ankur Handa, Michael Bloesch, Viorica Patraucean, Simon Stent, John McCormac, Andrew Davison

We introduce gvnn, a neural network library in Torch aimed towards bridging the gap between classic geometric computer vision and deep learning. Inspired by the recent success of Spatial Transformer Networks, we propose several new layers which are often used as parametric transformations on the data in geometric computer vision. These layers can be inserted within a neural network much in the spirit of the original spatial transformers and allow backpropagation to enable end-to-end learning of a network involving any domain knowledge in geometric computer vision. This opens up applications in learning invariance to 3D geometric transformation for place recognition, end-to-end visual odometry, depth estimation and unsupervised learning through warping with a parametric transformation for image reconstruction error.



Tuesday, July 26, 2016

Dual Purpose Hashing





Recent years have seen more and more demand for a unified framework to address multiple realistic image retrieval tasks concerning both category and attributes. Considering the scale of modern datasets, hashing is favorable for its low complexity. However, most existing hashing methods are designed to preserve one single kind of similarity, thus improper for dealing with the different tasks simultaneously. To overcome this limitation, we propose a new hashing method, named Dual Purpose Hashing (DPH), which jointly preserves the category and attribute similarities by exploiting the Convolutional Neural Network (CNN) models to hierarchically capture the correlations between category and attributes. Since images with both category and attribute labels are scarce, our method is designed to take the abundant partially labelled images on the Internet as training inputs. With such a framework, the binary codes of new-coming images can be readily obtained by quantizing the network outputs of a binary-like layer, and the attributes can be recovered from the codes easily. Experiments on two large-scale datasets show that our dual purpose hash codes can achieve comparable or even better performance than those state-of-the-art methods specifically designed for each individual retrieval task, while being more compact than the compared methods.
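As a rough sketch of the retrieval side only (the trained CNN with its joint category/attribute losses is the paper's actual contribution; the random features below merely stand in for the network's binary-like layer), here is how quantize-and-search by Hamming distance works:

```python
import numpy as np

rng = np.random.default_rng(1)

def binary_codes(features):
    """Quantize a 'binary-like' layer output to hash codes by sign.
    A trained network would supply the features; random ones stand in."""
    return (features > 0).astype(np.uint8)

def hamming_search(query_code, db_codes, k=5):
    """Return the k database entries closest in Hamming distance."""
    d = np.count_nonzero(db_codes != query_code, axis=1)
    order = np.argsort(d)
    return order[:k], d[order[:k]]

# stand-in "embeddings": 1000 items, 48-bit codes
db = rng.standard_normal((1000, 48))
codes = binary_codes(db)
idx, dist = hamming_search(binary_codes(db[0:1])[0], codes)
print(idx, dist)   # item 0 comes back first, at distance 0
```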






Monday, July 25, 2016

Onsager-corrected deep learning for sparse linear inverse problems

Thomas let me know on Twitter that the Great Convergence continues. Today we find out how one can take the iterative process of AMP and then learn the coefficients of that process, as in Deep Learning. It looks like the Learned AMP beats LISTA. Looking back at the COLT presentations featured earlier (Saturday videos), one wonders how these solvers change the rules of thumb on model depth and size. To be continued...


Deep learning has gained great popularity due to its widespread success on many inference problems. We consider the application of deep learning to the sparse linear inverse problem encountered in compressive sensing, where one seeks to recover a sparse signal from a small number of noisy linear measurements. In this paper, we propose a novel neural-network architecture that decouples prediction errors across layers in the same way that the approximate message passing (AMP) algorithm decouples them across iterations: through Onsager correction. Numerical experiments suggest that our "learned AMP" network significantly improves upon Gregor and LeCun's "learned ISTA" network in both accuracy and complexity.
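To see what gets unrolled, here is a minimal numpy sketch of classical AMP with a soft-threshold denoiser (my sketch, not the authors' code; the threshold multiplier 1.4 and the problem sizes are illustrative). The Onsager term in the residual update is the part that decouples errors across iterations; the paper's learned AMP turns these iterations into layers and learns the matrices and thresholds by backpropagation.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding denoiser."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def amp(y, A, iters=30):
    """AMP for y = A x + noise, sparse x, soft-threshold denoiser."""
    M, N = A.shape
    x = np.zeros(N)
    z = y.copy()
    for _ in range(iters):
        r = x + A.T @ z                     # pseudo-data
        tau = np.sqrt(np.mean(z ** 2))      # noise-level estimate
        x = soft(r, 1.4 * tau)              # denoise
        onsager = z * (np.count_nonzero(x) / M)
        z = y - A @ x + onsager             # Onsager-corrected residual
    return x

# sparse recovery demo: N = 500 unknowns, M = 250 measurements, 25 nonzeros
rng = np.random.default_rng(0)
N, M, k = 500, 250, 25
A = rng.standard_normal((M, N)) / np.sqrt(M)
x0 = np.zeros(N)
x0[rng.choice(N, k, replace=False)] = rng.standard_normal(k)
y = A @ x0 + 0.01 * rng.standard_normal(M)
x_hat = amp(y, A)
print("NMSE:", np.linalg.norm(x_hat - x0) / np.linalg.norm(x0))
```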




Saturday, July 23, 2016

Saturday Morning Video: On the Expressive Power of Deep Learning: A Tensor Analysis, Nadav Cohen, Or Sharir, Amnon Shashua @ COLT2016







The preprint on which it relies is:


On the Expressive Power of Deep Learning: A Tensor Analysis by Nadav Cohen, Or Sharir, Amnon Shashua

It has long been conjectured that hypotheses spaces suitable for data that is compositional in nature, such as text or images, may be more efficiently represented with deep hierarchical networks than with shallow ones. Despite the vast empirical evidence supporting this belief, theoretical justifications to date are limited. In particular, they do not account for the locality, sharing and pooling constructs of convolutional networks, the most successful deep learning architecture to date. In this work we derive a deep network architecture based on arithmetic circuits that inherently employs locality, sharing and pooling. An equivalence between the networks and hierarchical tensor factorizations is established. We show that a shallow network corresponds to CP (rank-1) decomposition, whereas a deep network corresponds to Hierarchical Tucker decomposition. Using tools from measure theory and matrix algebra, we prove that besides a negligible set, all functions that can be implemented by a deep network of polynomial size, require exponential size in order to be realized (or even approximated) by a shallow network. Since log-space computation transforms our networks into SimNets, the result applies directly to a deep learning architecture demonstrating promising empirical performance. The construction and theory developed in this paper shed new light on various practices and ideas employed by the deep learning community.
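The shallow-network/CP correspondence at the heart of the paper can be checked numerically in a few lines. In this hedged toy sketch (mine, with N = 3 inputs and made-up dimensions), the score of a one-hidden-layer arithmetic circuit coincides exactly with the contraction of a rank-R CP coefficient tensor:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, R = 3, 4, 5                     # inputs, feature dim per input, CP rank

a = rng.standard_normal(R)            # top-layer weights
W = rng.standard_normal((R, N, M))    # per-input weight vectors
phi = rng.standard_normal((N, M))     # feature representation of one sample

# shallow arithmetic circuit: sum over r of products over the N inputs
shallow = sum(a[r] * np.prod([W[r, i] @ phi[i] for i in range(N)])
              for r in range(R))

# the same score as a contraction of the rank-R CP coefficient tensor
T = sum(a[r] * np.einsum("i,j,k->ijk", W[r, 0], W[r, 1], W[r, 2])
        for r in range(R))
cp = np.einsum("ijk,i,j,k->", T, phi[0], phi[1], phi[2])

print(np.allclose(shallow, cp))   # True: shallow network == CP decomposition
```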
Other recent work by Nadav includes:

Convolutional Rectifier Networks as Generalized Tensor Decompositions by Nadav Cohen, Amnon Shashua

Convolutional rectifier networks, i.e. convolutional neural networks with rectified linear activation and max or average pooling, are the cornerstone of modern deep learning. However, despite their wide use and success, our theoretical understanding of the expressive properties that drive these networks is partial at best. On the other hand, we have a much firmer grasp of these issues in the world of arithmetic circuits. Specifically, it is known that convolutional arithmetic circuits possess the property of "complete depth efficiency", meaning that besides a negligible set, all functions that can be implemented by a deep network of polynomial size, require exponential size in order to be implemented (or even approximated) by a shallow network. In this paper we describe a construction based on generalized tensor decompositions, that transforms convolutional arithmetic circuits into convolutional rectifier networks. We then use mathematical tools available from the world of arithmetic circuits to prove new results. First, we show that convolutional rectifier networks are universal with max pooling but not with average pooling. Second, and more importantly, we show that depth efficiency is weaker with convolutional rectifier networks than it is with convolutional arithmetic circuits. This leads us to believe that developing effective methods for training convolutional arithmetic circuits, thereby fulfilling their expressive potential, may give rise to a deep learning architecture that is provably superior to convolutional rectifier networks but has so far been overlooked by practitioners.

Inductive Bias of Deep Convolutional Networks through Pooling Geometry by Nadav Cohen, Amnon Shashua
Our formal understanding of the inductive bias that drives the success of convolutional networks on computer vision tasks is limited. In particular, it is unclear what makes hypotheses spaces born from convolution and pooling operations so suitable for natural images. In this paper we study the ability of convolutional arithmetic circuits to model correlations among regions of their input. Correlations are formalized through the notion of separation rank, which for a given input partition, measures how far a function is from being separable. We show that a polynomially sized deep network supports exponentially high separation ranks for certain input partitions, while being limited to polynomial separation ranks for others. The network's pooling geometry effectively determines which input partitions are favored, thus serves as a means for controlling the inductive bias. Contiguous pooling windows as commonly employed in practice favor interleaved partitions over coarse ones, orienting the inductive bias towards the statistics of natural images. In addition to analyzing deep networks, we show that shallow ones support only linear separation ranks, and by this gain insight into the benefit of functions brought forth by depth - they are able to efficiently model strong correlation under favored partitions of the input.




Saturday Morning Video: Benefits of depth in neural networks, Matus Telgarsky @ COLT2016

 



Representation Benefits of Deep Feedforward Networks by Matus Telgarsky
This note provides a family of classification problems, indexed by a positive integer k, where all shallow networks with fewer than exponentially (in k) many nodes exhibit error at least 1/6, whereas a deep network with 2 nodes in each of 2k layers achieves zero error, as does a recurrent network with 3 distinct nodes iterated k times. The proof is elementary, and the networks are standard feedforward networks with ReLU (Rectified Linear Unit) nonlinearities.
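The construction behind this result is delightfully concrete. Here is a sketch of the standard example (under my reading of the note; sample count and depth are illustrative): two ReLU units compute a tent map, and composing k of them produces a function with exponentially many oscillations, which no small shallow network can match.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def tent(x):
    """Two ReLU units computing the tent map on [0, 1]:
    m(x) = 2x for x <= 1/2 and 2(1 - x) for x >= 1/2."""
    return 2 * relu(x) - 4 * relu(x - 0.5)

x = np.linspace(0.0, 1.0, 2001)
y = x
k = 4
for _ in range(k):       # a depth-k stack of 2-node layers
    y = tent(y)

# the k-fold composition has 2**(k-1) peaks, i.e. 2**k - 1 turning
# points, from only 2*k ReLU units
turns = np.sum(np.diff(np.sign(np.diff(y))) != 0)
print(turns)             # 15 for k = 4
```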
 

Saturday Morning Video: The Power of Depth for Feedforward Neural Networks by Ohad Shamir @ COLT2016


 
The attendant paper is here:


The Power of Depth for Feedforward Neural Networks by Ronen Eldan, Ohad Shamir

We show that there is a simple (approximately radial) function on $\mathbb{R}^d$, expressible by a small 3-layer feedforward neural network, which cannot be approximated by any 2-layer network to more than a certain constant accuracy, unless its width is exponential in the dimension. The result holds for virtually all known activation functions, including rectified linear units, sigmoids and thresholds, and formally demonstrates that depth -- even if increased by 1 -- can be exponentially more valuable than width for standard feedforward neural networks. Moreover, compared to related results in the context of Boolean functions, our result requires fewer assumptions, and the proof techniques and construction are very different.

 

Saturday Morning Video: Randomized Algorithms in Linear Algebra, Ravi Kannan @ COLT2016

 


As Sebastien pointed out, the COLT 2016 videos are out. Here is one: Ravi Kannan on Randomized Algorithms in Linear Algebra.








 

Friday, July 22, 2016

Fifty Shades of Ratings: How to Benefit from a Negative Feedback in Top-N Recommendations Tasks - implementation -



Ivan, the person behind the Tensor Train decomposition, just sent me the following:


Dear Igor,  
I have a new interesting paper to share: "Fifty Shades of Ratings: How to Benefit from a Negative Feedback in Top-N Recommendations Tasks"
http://arxiv.org/abs/1607.04228
(accepted at ACM RecSys 2016).
A framework is also available: https://github.com/Evfro/polara
The key idea is to introduce a tensor from user-item-rating triples, thus being able to recommend even from a negative feedback.

With best wishes,
Ivan.

Thanks, Ivan!


Fifty Shades of Ratings: How to Benefit from a Negative Feedback in Top-N Recommendations Tasks by Evgeny Frolov, Ivan Oseledets
Conventional collaborative filtering techniques treat a top-n recommendations problem as a task of generating a list of the most relevant items. This formulation, however, disregards an opposite - avoiding recommendations with completely irrelevant items. Due to that bias, standard algorithms, as well as commonly used evaluation metrics, become insensitive to negative feedback. In order to resolve this problem we propose to treat user feedback as a categorical variable and model it with users and items in a ternary way. We employ a third-order tensor factorization technique and implement a higher order folding-in method to support online recommendations. The method is equally sensitive to entire spectrum of user ratings and is able to accurately predict relevant items even from a negative only feedback. Our method may partially eliminate the need for complicated rating elicitation process as it provides means for personalized recommendations from the very beginning of an interaction with a recommender system. We also propose a modification of standard metrics which helps to reveal unwanted biases and account for sensitivity to a negative feedback. Our model achieves state-of-the-art quality in standard recommendation tasks while significantly outperforming other methods in the cold-start "no-positive-feedback" scenarios.
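Here is a hedged numpy sketch of the central idea as I read it (a toy stand-in, not the polara implementation; sizes, ranks and the rating scheme are made up): ratings become a third mode of a binary tensor, a truncated HOSVD gives item and rating factors, and a new user's feedback, even a purely negative one, is folded in by projecting onto the item x rating subspace.

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd_factors(T, ranks):
    """Truncated HOSVD: leading left singular vectors of each unfolding."""
    return [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
            for m, r in enumerate(ranks)]

rng = np.random.default_rng(0)
n_users, n_items, n_ratings = 30, 20, 5
T = np.zeros((n_users, n_items, n_ratings))
for u in range(n_users):                          # each user rates 6 items
    for i in rng.choice(n_items, size=6, replace=False):
        T[u, i, rng.integers(0, n_ratings)] = 1.0  # one-hot rating value

U, V, W = hosvd_factors(T, (10, 10, n_ratings))

# higher-order folding-in for a new user with a single *negative* rating
P = np.zeros((n_items, n_ratings))
P[3, 0] = 1.0                          # item 3 got the lowest rating
R = V @ (V.T @ P @ W) @ W.T            # projected relevance scores
print(R[:, -1].argsort()[::-1][:5])    # items predicted for the top rating
```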



Wednesday, July 20, 2016

Random projections of random manifolds

Looks like some bounds have been tightened. Funnily enough, in a matter of a week, I am either talking to or featuring someone from SpaceX, which just happens to be delivering, today or tomorrow, a nanopore sequencer to the International Space Station.

Small world!

Anyway, getting back to the paper.



Random projections of random manifolds by Subhaneil Lahiri, Peiran Gao, Surya Ganguli
A ubiquitous phenomenon is that interesting signals or data concentrate on low dimensional smooth manifolds inside a high dimensional ambient Euclidean space. Random projections are a simple and powerful tool for dimensionality reduction of such signals and data. Previous, seminal works have studied bounds on how the number of projections needed to preserve the geometry of these manifolds, at a given accuracy, scales with the intrinsic dimensionality, volume and curvature of the manifold. However, such works employ definitions of volume and curvature that are inherently difficult to compute. Therefore such theory cannot be easily tested against numerical simulations to quantitatively understand the tightness of the proven bounds. We instead study the typical distortions arising in random projections of an ensemble of smooth Gaussian random manifolds. In doing so, we find explicitly computable, approximate theoretical bounds on the number of projections required to preserve the geometric structure of these manifolds to a prescribed level of accuracy. Our bounds, while approximate, can only be violated with a probability that is exponentially small in the ambient dimension, and therefore they hold with high probability in most cases of practical interest. Moreover, unlike previous work, we test our theoretical bounds against numerical experiments on the actual geometric distortions that typically occur for random projections of random smooth manifolds. Through this comparison, we find our bounds are tighter than previous results by several orders of magnitude.
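Since the whole point of the paper is that its bounds can be checked numerically, here is a small experiment in the same spirit (my sketch, not the authors' code; the curve, dimensions and trial counts are arbitrary choices): embed a smooth curve in a high-dimensional space, project it with Gaussian matrices of growing dimension m, and watch the worst-case distortion of secant lengths shrink.

```python
import numpy as np

rng = np.random.default_rng(0)

# a smooth 1-D manifold (a closed curve) embedded in R^D
D, n = 1000, 300
t = np.linspace(0, 2 * np.pi, n, endpoint=False)
curve = np.stack([np.cos(k * t) / k for k in range(1, 11)] +
                 [np.sin(k * t) / k for k in range(1, 11)], axis=1)
X = np.zeros((n, D))
X[:, :curve.shape[1]] = curve

def pdists(Y):
    """Pairwise Euclidean distances between the rows of Y."""
    G = Y @ Y.T
    sq = np.diag(G)
    return np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * G, 0.0))

iu = np.triu_indices(n, k=1)
d0 = pdists(X)[iu]
for m in (5, 10, 20, 40, 80):
    worst = 0.0
    for _ in range(10):              # a few random projection draws
        P = rng.standard_normal((D, m)) / np.sqrt(m)
        d1 = pdists(X @ P)[iu]
        worst = max(worst, np.max(np.abs(d1 / d0 - 1.0)))
    print(m, round(worst, 3))        # worst secant distortion falls with m
```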



 

Tuesday, July 19, 2016

Our proposal for a NIPS workshop on "Mapping Machine Learning to Hardware" has been submitted !



You have probably noticed an acceleration of the offerings recently when it comes to computing hardware for Machine Learning. You may also have noticed that some algorithms now take into account the constraints of living in the binary world, or that new and innovative technologies are trying to get into that area without knowing for sure which primitives they are improving. You might even have wondered what sort of technology will bring us closer to better efficiencies while still being flexible in terms of algorithm design. Eventually, as an algorithm designer, you might have wondered about the sort of constraints your algorithm should be fulfilling. All these questions and many others are the reasons for this recent announcement on Nuit Blanche (A proposal for a NIPS Workshop or Symposium on "Mapping Machine Learning to Hardware"). We have been able to get commitments from a truly outstanding set of plenary speakers, so that they can enlighten us about what is going on when it comes to mapping Machine Learning to hardware and, more importantly, what the State of the Possible is:


Here is the final proposal we sent out last night to the gods of NIPS.

With the recent success of Deep Learning and related techniques, we are beginning to see new specialized hardware, or extensions to existing architectures, dedicated to making training and inference computations faster, more energy efficient, or both. These technologies use either traditional CMOS technology, on conventional von Neumann architectures such as CPUs or on accelerators such as DSPs, GPUs, FPGAs and ASICs, or novel and exotic technologies still in the research phase, such as memristors, quantum chips and optics/photonics. The overarching goal is to address a specific trade-off in mapping machine learning algorithms in general, and deep learning in particular, to a specific underlying hardware technology.

Conversely, there has been quite an effort on the empirical side at devising deep network architectures for efficient implementation on low complexity hardware via low-rank tensor factorizations, structured matrix approximations, lower bit-depth like binary coefficients, compression and pruning to name a few approaches. This also has implications on leveraging appropriate hardware technology for inferencing primarily with energy and latency as the primary design goals.

These efforts are finding some traction in the signal processing and sparse/compressive sensing communities, which seek to map the first layers of sensing hardware onto the first layers of models such as deep networks. This approach has the potential to change the way sensing hardware, image reconstruction, signal processing and image understanding will be performed in the future.

This workshop aims to tie these seemingly disparate themes of co-design, architecture, algorithms and signal processing and bring together researchers at the interface of machine learning, hardware implementation, sensing, physics and biology for discussing the state of the art and the state of the possible.

The goals of the workshop are
  • To present how machine learning computations and algorithms are mapped and co-designed with new hardware technologies and architectures
  • To stimulate discussions on how machine learning algorithms can be co-designed or co-optimized with the underlying hardware technology to best take advantage of the synergy
  • To evaluate the different trade-offs in accuracy, computational complexity, hardware cost, energy efficiency and application throughput currently investigated in these approaches
  • To understand how data acquisition frameworks may be transformed to better interface with machine learning algorithms
  • To evaluate the constraints put forth on recent deep learning architectures so as to reduce redundancy and enable a simpler mapping between computing hardware and models
  • To enable a wider discussion on how machine learning algorithms and their primitives may provide a path toward exotic technology insertion in industrial roadmaps


Besides the presentations made by the plenary speakers, there will be a poster session, lightning talks selected from the posters and a roundtable at the conclusion of the workshop. The workshop will be taped and presentations slides will be made available online. A white paper describing the talks and the discussions that went on during the workshop will be made available after the meeting.
Thanks to the following co-organizers, this proposal became a reality:
and a few other friends who transmitted the call inside their organizations. Thank you!

DeepBinaryMask: Learning a Binary Mask for Video Compressive Sensing

  The Great Convergence continues in compressive sensing hardware and machine learning:


DeepBinaryMask: Learning a Binary Mask for Video Compressive Sensing by Michael Iliadis, Leonidas Spinoulas, Aggelos K. Katsaggelos
In this paper, we propose a novel encoder-decoder neural network model referred to as DeepBinaryMask for video compressive sensing. In video compressive sensing one frame is acquired using a set of coded masks (sensing matrix) from which a number of video frames is reconstructed, equal to the number of coded masks. The proposed framework is an end-to-end model where the sensing matrix is trained along with the video reconstruction. The encoder learns the binary elements of the sensing matrix and the decoder is trained to recover the unknown video sequence. The reconstruction performance is found to improve when using the trained sensing mask from the network as compared to other mask designs such as random, across a wide variety of compressive sensing reconstruction algorithms. Finally, our analysis and discussion offers insights into understanding the characteristics of the trained mask designs that lead to the improved reconstruction quality.
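For context, the forward model in question is only a few lines. Here is a hedged sketch of temporal coded-aperture video CS as I understand it (not the authors' code; the block size and mask density are arbitrary): one 2-D snapshot encodes a whole block of frames through per-frame binary masks, and the paper's twist is to make those binary entries trainable together with the reconstruction network.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, W = 8, 32, 32                    # frames per measurement, frame size

video = rng.random((T, H, W))          # unknown video block
masks = (rng.random((T, H, W)) > 0.5).astype(np.float64)  # binary masks

# forward model: a single 2-D snapshot encodes T masked frames
y = np.sum(masks * video, axis=0)
print(y.shape)   # (32, 32): one measurement for 8 frames, 8x compression

# DeepBinaryMask's point: instead of fixing `masks` at random, make the
# binary entries trainable (the encoder) jointly with a decoder network
# that maps y back to the T frames.
```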
 
 
 
 
 
 

Monday, July 18, 2016

Single-shot diffuser-encoded light field imaging

Oh! New compressive sensing hardware, woohoo!

Single-shot diffuser-encoded light field imaging by Nicholas Antipa, Sylvia Necula, Ren Ng, Laura Waller

We capture 4D light field data in a single 2D sensor image by encoding spatio-angular information into a speckle field (caustic pattern) through a phase diffuser. Using wave-optics theory and a coherent phase retrieval method, we calibrate the system by measuring the diffuser surface height from through-focus images. Wave-optics theory further informs the design of system geometry such that a purely additive ray-optics model is valid. Light field reconstruction is done using nonlinear matrix inversion methods, including l_1 minimization. We demonstrate a prototype system and present empirical results of 4D light field reconstruction and computational refocusing from a single diffuser-encoded 2D image.


An approximate message passing approach for compressive hyperspectral imaging using a simultaneous low-rank and joint-sparsity prior

Ah! Using AMP in hyperspectral imaging with sparsity and low rank: this was much needed.
 


An approximate message passing approach for compressive hyperspectral imaging using a simultaneous low-rank and joint-sparsity prior by Yangqing Li, Saurabh Prasad, Wei Chen, Changchuan Yin, Zhu Han

This paper considers a compressive sensing (CS) approach for hyperspectral data acquisition, which results in a practical compression ratio substantially higher than the state-of-the-art. Applying a simultaneous low-rank and joint-sparse (L&S) model to the hyperspectral data, we propose a novel algorithm for joint reconstruction of hyperspectral data based on loopy belief propagation that enables the exploitation of both structured sparsity and amplitude correlations in the data. Experimental results with real hyperspectral datasets demonstrate that the proposed algorithm outperforms the state-of-the-art CS-based solutions with substantial reductions in reconstruction error.
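The paper embeds the L&S prior inside a belief-propagation loop; as a much more modest illustration, here is a sketch (mine, with made-up sizes and thresholds) of the two structural operations such a prior combines, applied to a matrix whose rows are pixels and whose columns are spectral bands:

```python
import numpy as np

def row_soft(X, t):
    """Joint-sparsity proximal step: shrink whole rows, so a pixel's
    coefficients across all spectral bands live or die together."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    scale = np.maximum(1 - t / np.maximum(norms, 1e-12), 0.0)
    return X * scale

def svd_shrink(X, t):
    """Low-rank proximal step: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt

# a matrix that is simultaneously low rank and row sparse
rng = np.random.default_rng(0)
rows, bands, rank, k = 200, 30, 3, 20
support = rng.choice(rows, k, replace=False)
X = np.zeros((rows, bands))
X[support] = rng.standard_normal((k, rank)) @ rng.standard_normal((rank, bands))
noisy = X + 0.1 * rng.standard_normal(X.shape)

denoised = svd_shrink(row_soft(noisy, 1.0), 0.5)
print(np.linalg.norm(denoised - X) / np.linalg.norm(X))  # small relative error
```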
 
 
 
 
 
