Conventional imaging uses a set of lenses to form an image on the sensor plane. This purely hardware-based approach uses neither signal processing nor the extra information carried by the time of arrival of photons at the sensor. Recently, modern compressive sensing techniques have been applied to lensless imaging. However, this computational approach leans as heavily as possible on signal processing (for example, the single pixel camera) and results in long acquisition times. Here we propose using compressive ultrafast sensing for lensless imaging. We use extremely fast sensors (picosecond time resolution) to time-tag photons as they arrive at an omnidirectional pixel. Each measurement thus produces a time series in which arrival time is a function of the photon's source location in the scene. This allows lensless imaging with significantly fewer measurements compared to regular single pixel imaging (33× fewer in our experiments). To achieve this goal, we developed a framework for using ultrafast pixels with compressive sensing, including an algorithm for ideal sensor placement and an algorithm for optimized active illumination patterns. We show that efficient lensless imaging is possible with ultrafast imaging and compressive sensing. This paves the way for novel imaging architectures, and for remote sensing in extreme situations where imaging with a lens is not possible.
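The reconstruction behind such schemes is classical sparse recovery. Here is a hedged illustration (not the authors' code: a generic Gaussian matrix stands in for the ultrafast time-resolved measurement operator, and ISTA stands in for whatever solver they use) of recovering a sparse scene from m < n measurements:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n-pixel scene, m compressive measurements (m << n), k sources.
n, m, k = 256, 64, 5

# Sparse scene: k bright sources on a dark background.
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.uniform(1.0, 2.0, k)

# Stand-in sensing matrix; in the paper each row would come from the
# time-resolved response of an ultrafast sensor, not from random Gaussians.
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true

# ISTA: iterative soft thresholding for min 0.5*||Ax - y||^2 + lam*||x||_1.
lam = 0.05
step = 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(n)
for _ in range(300):
    x = x - step * (A.T @ (A @ x - y))
    x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)

rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(rel_err)  # small: far fewer measurements than pixels suffice
```

The point of the paper is that time-of-arrival information makes each row of the effective sensing matrix far more informative, so m can shrink further than in the single pixel camera setting.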
Monday, October 31, 2016
Lensless Imaging with Compressive Ultrafast Sensing
Here is some very fast compressive sensing using ultrafast time-resolved sensing.
Wednesday, October 26, 2016
Practical Learning of Deep Gaussian Processes via Random Fourier Features
Practical Learning of Deep Gaussian Processes via Random Fourier Features by Kurt Cutajar, Edwin V. Bonilla, Pietro Michiardi, Maurizio Filippone
The composition of multiple Gaussian Processes as a Deep Gaussian Process (DGP) enables a deep probabilistic approach to flexibly quantify uncertainty and carry out model selection in various learning scenarios. In this work, we introduce a novel formulation of DGPs based on random Fourier features that we train using stochastic variational inference. Our proposal yields an efficient way of training DGP architectures without compromising on predictive performance. Through a series of experiments, we illustrate how our model compares favorably to other state-of-the-art inference methods for DGPs for both regression and classification tasks. We also demonstrate how an asynchronous implementation of stochastic gradient optimization can exploit the computational power of distributed systems for large-scale DGP learning.
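The random Fourier feature construction underlying the paper can be sketched as follows (a generic Rahimi-Recht approximation of an RBF kernel, not the authors' DGP layers; sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Random Fourier features approximating the RBF kernel k(x,y) = exp(-||x-y||^2/2):
# E[phi(x)^T phi(y)] equals the kernel, and the error shrinks as D grows.
d, D = 5, 2000                       # input dimension, number of random features
W = rng.standard_normal((d, D))      # spectral frequencies for a unit-bandwidth RBF
b = rng.uniform(0, 2 * np.pi, D)

def phi(X):
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = rng.standard_normal((10, d))
K_exact = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
K_approx = phi(X) @ phi(X).T
max_err = np.abs(K_exact - K_approx).max()
print(max_err)  # small, and shrinking as D grows
```

In the paper, layers built from such features replace exact GP layers, which is what makes stochastic variational training of the composition tractable.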
Tuesday, October 25, 2016
Thesis: Spectral Inference Methods on Sparse Graphs: Theory and Applications by Alaa Saade
Congratulations, Dr. Saade!
Spectral Inference Methods on Sparse Graphs: Theory and Applications by Alaa Saade
In an era of unprecedented deluge of (mostly unstructured) data, graphs are proving more and more useful, across the sciences, as a flexible abstraction to capture complex relationships between complex objects. One of the main challenges arising in the study of such networks is the inference of macroscopic, large-scale properties affecting a large number of objects, based solely on the microscopic interactions between their elementary constituents. Statistical physics, precisely created to recover the macroscopic laws of thermodynamics from an idealized model of interacting particles, provides significant insight to tackle such complex networks.
In this dissertation, we use methods derived from the statistical physics of disordered systems to design and study new algorithms for inference on graphs. Our focus is on spectral methods, based on certain eigenvectors of carefully chosen matrices, and sparse graphs, containing only a small amount of information. We develop an original theory of spectral inference based on a relaxation of various mean-field free energy optimizations. Our approach is therefore fully probabilistic, and contrasts with more traditional motivations based on the optimization of a cost function. We illustrate the efficiency of our approach on various problems, including community detection, randomized similarity-based clustering, and matrix completion.
Monday, October 24, 2016
A Greedy Blind Calibration Method for Compressed Sensing with Unknown Sensor Gains
A Greedy Blind Calibration Method for Compressed Sensing with Unknown Sensor Gains by Valerio Cambareri, Laurent Jacques
The realisation of sensing modalities based on the principles of compressed sensing is often hindered by discrepancies between the mathematical model of the sensing operator, which is necessary during signal recovery, and its actual physical implementation, whose values may differ significantly from the assumed model. In this paper we tackle the bilinear inverse problem of recovering a sparse input signal and some unknown, unstructured multiplicative factors affecting the sensors that capture each compressive measurement. Our methodology relies on collecting a few snapshots under new draws of the sensing operator, and applying a greedy algorithm based on projected gradient descent and the principles of iterative hard thresholding. We explore empirically the sample complexity requirements of this algorithm by testing its phase transition, and show in a practically relevant instance of compressive imaging that the exact solution can be obtained with only a few snapshots.
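The greedy machinery the paper builds on can be illustrated with plain iterative hard thresholding for the known-gain case (a sketch only: the paper's algorithm additionally estimates the unknown sensor gains across several snapshots, which this toy omits):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy sizes: n-dimensional k-sparse signal, m compressive measurements.
n, m, k = 200, 80, 8
x_true = np.zeros(n)
idx = rng.choice(n, k, replace=False)
x_true[idx] = rng.choice([-1.0, 1.0], k) * rng.uniform(0.5, 2.0, k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true

# Iterative hard thresholding: gradient step on 0.5*||y - Ax||^2,
# then keep only the k largest-magnitude entries.
step = 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(n)
for _ in range(500):
    g = x + step * (A.T @ (y - A @ x))
    x = np.zeros(n)
    keep = np.argsort(np.abs(g))[-k:]
    x[keep] = g[keep]

rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(rel_err)  # near zero once the support is identified
```

In the blind-calibration setting, each snapshot adds one projected-gradient block for the gains alongside this hard-thresholded block for the signal.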
Sunday, October 23, 2016
Sunday Morning Insight: We're the Barbarians
From Appendix A of "Rebooting the IT Revolution: A Call to Action"
In a recent blog entry (Predicting the Future: The Steamrollers and Machine Learning) I pointed out the current limits on the use of silicon for computing. Even though the predictions show the substantial impact of computing on power generation, there is only a scattered set of initiatives and technology developments looking into this issue.
This was reinforced when we, at LightOn, recently filled out a form to join Optics Valley, a non-profit group representing the interests of the optics industry here in France. Many of our answers fell into the "Other" category. That feeling was reinforced again last night when I watched the IEEE Rebooting Computing video, which features a set of initiatives that aim at solving this exact problem. But if you watch the short video, you'll probably notice that our technology also falls into the "Other" category.
Friday, October 21, 2016
Job: PhD Studentships, TU Delft / Online Optimization with Costly and Noisy Measurements using Random Fourier Expansions
Sander just sent me the following:
Dear Igor,
I have two vacancies for PhD students in
Applied Nonlinear Fourier Analysis for Fiber-Optic Communication / Water Wave Analysis
here at TU Delft that I hope might be of interest to some of your readers. More information can be found in the flyer at
http://www.dcsc.tudelft.nl/~swahls/pdf/PhD_Positions_NEUTRINO.pdf
It would be great if you could post them in your (fantastic!) blog.
Best, Sander
--
Dr.-Ing. Sander Wahls
Assistant Professor at TU Delft
http://www.dcsc.tudelft.nl/~swahls
So you'd think that Sander is just flattering me and the blog into getting a post out to hire PhD students, but you'd be wrong. He does very interesting work; check out this recent one:
Online Optimization with Costly and Noisy Measurements using Random Fourier Expansions by Laurens Bliek, Hans R. G. W. Verstraete, Michel Verhaegen, Sander Wahls
This paper analyzes DONE, an online optimization algorithm that iteratively minimizes an unknown function based on costly and noisy measurements. The algorithm maintains a surrogate of the unknown function in the form of a random Fourier expansion (RFE). The surrogate is updated whenever a new measurement is available, and then used to determine the next measurement point. The algorithm is comparable to Bayesian optimization algorithms, but its computational complexity per iteration does not depend on the number of measurements. We derive several theoretical results that provide insight on how the hyper-parameters of the algorithm should be chosen. The algorithm is compared to a Bayesian optimization algorithm for a benchmark problem and three applications, namely, optical coherence tomography, optical beam-forming network tuning, and robot arm control. It is found that the DONE algorithm is significantly faster than Bayesian optimization in the discussed problems, while achieving a similar or better performance.
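The surrogate idea can be sketched in one shot: fit a random Fourier expansion to noisy measurements of the unknown function, then minimize the cheap surrogate instead of the costly function. (This toy omits what makes DONE online: the recursive update of the expansion and the sequential choice of measurement points. Sizes and frequencies below are illustrative.)

```python
import numpy as np

rng = np.random.default_rng(4)

def f(x):
    return (x - 0.7) ** 2            # "unknown" 1-D function, costly and noisy to probe

# Random Fourier expansion: fixed random frequencies/phases, linear weights c.
D = 30
w = rng.standard_normal(D) * 3.0
b = rng.uniform(0, 2 * np.pi, D)

def phi(x):
    return np.cos(np.outer(x, w) + b)

# Collect noisy measurements, fit the RFE surrogate by ridge regression.
xs = rng.uniform(0, 1, 40)
ys = f(xs) + 0.01 * rng.standard_normal(40)
c = np.linalg.solve(phi(xs).T @ phi(xs) + 1e-3 * np.eye(D), phi(xs).T @ ys)

# Minimizing the surrogate is cheap: evaluate phi(x) @ c, not f.
grid = np.linspace(0, 1, 1001)
x_next = grid[np.argmin(phi(grid) @ c)]
print(x_next)  # close to the true minimizer 0.7
```

Because the surrogate is linear in c, its recursive update has constant cost per measurement, which is the advertised advantage over Bayesian optimization.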
Thursday, October 20, 2016
Paris Machine Learning, Hors série #3: Mathématiques et Data Science
Today we will have a special kind of meetup: an open-question roundtable with the people listed below. The subject will naturally be Mathematics and Data Science. It will most probably be in French (questions can obviously be asked in English):
Cédric Villani, Vincent Lefieux, Philippe Azoulay, Mathilde Mougeot, Julie Josse, Nicolas Le Roux
Questions can be asked on Twitter under the #MLParis tag.
The meetup is organized in part thanks to Quantmetry and in conjunction with the Mathématiques, Oxygène du Numérique event. The streaming video is below, but we should eventually have a more professional recording later.
Wednesday, October 19, 2016
Random Projections for Scaling Machine Learning in Hardware
Continuing our Mapping ML to Hardware series, here is a way to produce random projections that differs from the way we do it at LightOn.
Random Projections for Scaling Machine Learning in Hardware by Sean Fox, Stephen Tridgell, Craig Jin and Philip H.W. Leong
Random projections have recently emerged as a powerful technique for large scale dimensionality reduction in machine learning applications. Crucially, the randomness can be extracted from sparse probability distributions, enabling hardware implementations with little overhead. In this paper, we describe a Field-Programmable Gate Array (FPGA) implementation alongside a Kernel Adaptive Filter (KAF) that is capable of reducing computational resources by introducing a controlled error term, achieving higher modelling capacity for given hardware resources. Empirical results involving classification, regression and novelty detection show that a 40% net increase in available resources and improvements in prediction accuracy are achievable for projections which halve the input vector length, enabling us to scale up hardware implementations of KAF learning algorithms by at least a factor of 2. Execution time of our random projection core is shown to be an order of magnitude lower than that of a single-core central processing unit (CPU), and the system-level implementation on an FPGA-based network card achieves a 29x speedup over the CPU.
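The kind of hardware-friendly sparse random projection exploited here can be sketched in a few lines (an Achlioptas-style {-1, 0, +1} matrix; the sparsity level and sizes below are illustrative, not the ones used on the FPGA):

```python
import numpy as np

rng = np.random.default_rng(5)

# Sparse projection matrix: entries +-1 with probability 1/6 each, zero otherwise,
# so 2/3 of the matrix requires no arithmetic at all -- cheap in hardware --
# yet pairwise distances are preserved in the Johnson-Lindenstrauss sense.
d, p = 1024, 128                     # input length, projected length
R = rng.choice([-1.0, 0.0, 1.0], size=(p, d), p=[1 / 6, 2 / 3, 1 / 6])
R *= np.sqrt(3.0 / p)                # scaling that makes E[||Rv||^2] = ||v||^2

x = rng.standard_normal(d)
y = rng.standard_normal(d)
orig = np.linalg.norm(x - y)
proj = np.linalg.norm(R @ (x - y))
distortion = abs(proj - orig) / orig
print(distortion)  # typically a modest fraction for p = 128
```

Since the nonzero entries are +-1, the projection reduces to additions and subtractions, which is exactly what makes it attractive on an FPGA.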
Tuesday, October 18, 2016
Thesis: Fast Randomized Algorithms for Convex Optimization and Statistical Estimation by Mert Pilanci
Here are 230 pages of goodness from Dr. Pilanci!
Fast Randomized Algorithms for Convex Optimization and Statistical Estimation by Mert Pilanci
With the advent of massive datasets, statistical learning and information processing techniques are expected to enable exceptional possibilities for engineering, data intensive sciences and better decision making. Unfortunately, existing algorithms for mathematical optimization, which is the core component in these techniques, often prove ineffective for scaling to the extent of all available data. In recent years, randomized dimension reduction has proven to be a very powerful tool for approximate computations over large datasets. In this thesis, we consider random projection methods in the context of general convex optimization problems on massive datasets. We explore many applications in machine learning, statistics and decision making and analyze various forms of randomization in detail. The central contributions of this thesis are as follows:
(i) We develop random projection methods for convex optimization problems and establish fundamental trade-offs between the size of the projection and accuracy of solution in convex optimization.
(ii) We characterize information-theoretic limitations of methods that are based on random projection, which surprisingly shows that the most widely used form of random projection is, in fact, statistically sub-optimal.
(iii) We present novel methods, which iteratively refine the solutions to achieve statistical optimality and enable solving large scale optimization and statistical inference problems orders-of-magnitude faster than existing methods.
(iv) We develop new randomized methodologies for relaxing cardinality constraints in order to obtain checkable and more accurate approximations than the state of the art approaches.
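Contribution (i) can be illustrated with the simplest instance, sketch-and-solve least squares: project the tall problem down with a short random matrix and solve the small problem instead (sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)

# Tall least-squares problem: n samples, d features, n >> d.
n, d, m = 5000, 20, 200              # m is the sketch size
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Full solution (expensive for huge n).
x_full = np.linalg.lstsq(A, b, rcond=None)[0]

# Sketched solution: solve min ||S(Ax - b)|| with a random m x n sketch S.
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sketch = np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]

rel_err = np.linalg.norm(x_sketch - x_full) / np.linalg.norm(x_full)
print(rel_err)  # close to the full solution at a fraction of the cost
```

Points (ii) and (iii) of the thesis concern exactly this trade-off: how accuracy degrades with the sketch size m, and how iterative refinement recovers statistical optimality.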
Monday, October 17, 2016
Sketching Meets Random Projection in the Dual: A Provable Recovery Algorithm for Big and High-dimensional Data
Machine learning has gained great empirical success from the massive data sets collected from various domains. A major challenge is to utilize existing computational resources to build predictive and inferential models from such huge data sets while maintaining the statistical power of big data. One remedy for the big data challenge is to build distributed computer systems and design distributed learning algorithms to make big data learning possible; however, distributed systems may not always be available, and the cost of running a distributed system can be much higher than one can afford, which makes distributed learning unsuitable for all scenarios. An alternative remedy is to use state-of-the-art randomized optimization algorithms to accelerate the training process; for example, researchers have proposed optimization algorithms for the regularized empirical risk minimization problem with provable fast convergence and low computational cost per iteration (see (Johnson and Zhang, 2013; Shalev-Shwartz and Zhang, 2013; Defazio et al., 2014) for examples). However, the speed of these optimization methods still heavily depends on the condition number of the problem at hand, which can be undesirable for many real-world problems. Sketching (Woodruff, 2014), which approximates the solution by constructing a sketched, usually smaller-scale, problem from the original data, has become an emerging technique for big data analytics. With the sketching technique, we can find solutions which approximately solve various forms of the original large-scale problem, such as least squares regression, robust regression, low-rank approximation, and singular value decomposition, just to name a few.
For survey and recent advances about sketching, we refer the readers to (Halko et al., 2011; Mahoney, 2011; Lu et al., 2013; Alaoui and Mahoney, 2014; Woodruff, 2014; Raskutti and Mahoney, 2015; Yang et al., 2015a; Oymak et al., 2015; Oymak and Tropp, 2015; Drineas and Mahoney, 2016) and references therein.
However, one major drawback of sketching is that it is typically not suitable when a highly accurate solution is required: to obtain a solution with exponentially smaller approximation error, we often need to increase the sketching dimension exponentially as well.
The situation has improved with recent work on "iterative sketching", e.g. the iterative Hessian sketch (IHS) (Pilanci and Wainwright, 2016) and iterative dual random projection (IDRP) (Zhang et al., 2014). These methods refine their approximate solution by iteratively solving small-scale sketched problems. Among these innovations, the Hessian sketch (Pilanci and Wainwright, 2016) reduces the sample size of the original problem, while dual random projection (Zhang et al., 2014) reduces the dimension. As a consequence, when the sample size and feature dimension are both large, IHS and IDRP still need to solve relatively large-scale subproblems, as each can only sketch the problem from one perspective. In this paper, we make the following improvements upon previous work: we first propose an accelerated version of IHS which requires the same computational cost to solve the IHS subproblem at each sketching iteration, but provably fewer sketching iterations to reach a given accuracy; we then reveal the primal-dual connections between IHS (Pilanci and Wainwright, 2016) and IDRP (Zhang et al., 2014), which were independently proposed by two different groups of researchers. In particular, we show that these two methods are equivalent in the sense that dual random projection performs the Hessian sketch in the dual space. Finally, to alleviate the computational issues raised by big and high-dimensional learning problems, we propose a primal-dual sketching method that can simultaneously reduce the sample size and dimension of the sketched sub-problem, with provable convergence guarantees.
Here is the paper: Sketching Meets Random Projection in the Dual: A Provable Recovery Algorithm for Big and High-dimensional Data by Jialei Wang, Jason D. Lee, Mehrdad Mahdavi, Mladen Kolar, Nathan Srebro
Sketching techniques have become popular for scaling up machine learning algorithms by reducing the sample size or dimensionality of massive data sets, while still maintaining the statistical power of big data. In this paper, we study sketching from an optimization point of view: we first show that the iterative Hessian sketch is an optimization process with preconditioning, and develop an accelerated iterative Hessian sketch by searching along the conjugate direction; we then establish primal-dual connections between the Hessian sketch and dual random projection, and apply the preconditioned conjugate gradient approach on the dual problem, which leads to the accelerated iterative dual random projection methods. Finally, to tackle the challenges from both large sample size and high dimensionality, we propose the primal-dual sketch, which iteratively sketches the primal and dual formulations. We show that using a logarithmic number of calls to solvers of small-scale problems, the primal-dual sketch is able to recover the optimum of the original problem up to arbitrary precision. The proposed algorithms are validated via extensive experiments on synthetic and real data sets which complement our theoretical results.
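The iterative Hessian sketch discussed above can be sketched in a few lines (a toy unconstrained least-squares instance with illustrative sizes; the Pilanci-Wainwright paper treats general constrained problems). Each iteration sketches only the Hessian, keeps the exact full gradient, and refines the iterate, so accuracy improves geometrically instead of requiring an ever-larger sketch:

```python
import numpy as np

rng = np.random.default_rng(7)

n, d, m = 5000, 20, 200              # samples, features, sketch size
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
x_opt = np.linalg.lstsq(A, b, rcond=None)[0]

# Iterative Hessian sketch: at each step, solve a d x d system whose
# Hessian A^T S^T S A is built from a fresh m x n sketch S, while the
# gradient A^T (b - A x) uses the full data.
x = np.zeros(d)
for _ in range(10):
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    SA = S @ A
    x = x + np.linalg.solve(SA.T @ SA, A.T @ (b - A @ x))

rel_err = np.linalg.norm(x - x_opt) / np.linalg.norm(x_opt)
print(rel_err)  # tiny after a handful of iterations
```

This is the scheme the paper reinterprets as preconditioned optimization and then accelerates with conjugate directions.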
Job: Postdoc at Ecole Normale Supérieure (ENS Paris), France
Florent asked me to post this postdoc announcement here:
Laplace Junior Professor Chair in Data Science
Postdoctoral Program
The Ecole Normale Supérieure (ENS Paris) invites applications for a Junior Research and Teaching Laplace chair in data science at the postdoctoral level, funded by CFM (Capital Fund Management) and the ENS. The chair is named after Pierre-Simon, marquis de Laplace, who, among many accomplishments, was one of the early founders of statistical inference and data science.
The Laplace chair aims at recruiting outstanding candidates in all areas of data science, including theoretical statistics, machine learning, signal processing, computer science, applied mathematics and statistical physics, or working on applications to other sciences such as physics, biology, medicine or social sciences and economics.
Appointments will be for two years with a possible extension for a third year. Salary is competitive and the positions are endowed with travel resources.
The successful candidate will carry out research at ENS, with reduced teaching duties that will be adapted. Applications should consist of a single file and be sent before November 30th, 2016 by email to laplacechair2017@gmail.com; the file should include:
- A cover letter;
- A complete CV including a list of publications;
- A research statement (maximum 4 pages in A4 format) taking into account possible interactions with research groups/faculty within the different departments of ENS (Computer Science, Mathematics, Physics, Biology, etc.);
- Three letters of recommendation from senior scientists, to be sent directly by email to laplacechair2017@gmail.com.
More information about the scientific environment of this program can be found on the webpage of the Data Science Chair of the ENS at https://data-ens.github.io.
Short-listed candidates will be invited for an interview (video conference) in mid-January 2017.
Sunday, October 16, 2016
Sunday Morning Insight: Machine Learning in Paris this past week
The most fascinating experience we've had this past week was the meetup we organized at Vente-privée, one of the largest online retail operations in Europe. It was fascinating on many grounds. First, we got to discover the business from a presentation by their engineers, and then got a tour of the operations by Julien, the CTO. Their growth is so amazing that they now have to use machine learning to scale up, not just for their operations but also for their customers. Their CTO mentioned a few facts that got our attention: 100,000 packages delivered every day, 20 artists in residence who produce the jingles for their stores, more than 200 online stores created per month. The shooting of products requires fashion models, but at some point their operations were so large that all the fashion models in Paris (except for a few tops) were booked with them. This led them to create photorealistic rendered models for their stores and campaigns. They are about to open R&D labs at the Epita and 42 schools, and they have lots of very interesting problems. In a way, they reminded me a little of the situations described by Chris at the New York Times and Andrei at WalmartLabs a while back (see the presentations in the archives section of the meetup).

The meetup itself was somewhat different as well with regard to the presentations: Greg spoke to us about grabbing our interactions on social networks and using them to enhance our personalities; this is an open project and the site is here: people2vec. Arnaud did the very unusual thing of telling us how he got the best actionable dataset for his deep learning start-up (Regaind.io). Olivier, Ivan and Antoine detailed some of the ML work at Vente-privée, and Frederico talked to us about health data on the web. We also opened a small debate with François that got some reaction from the crowd. Eventually, Clementine also mentioned a Startup Weekend on AI.
All the presentations and the video of the streaming are here. This coming week, we should have a new 'Hors série' meetup organized with Quantmetry, with Cédric Villani, a Fields medalist, among other speakers. I am not quite sure what the format will be, but you can register here to attend. As usual, it's free.
The day after the meetup, LightOn got to pitch in the semi-finals of the Hello Tomorrow Challenge. The winner of this year's edition is a flying car.
Friday, October 14, 2016
MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization - implementation -
Ramki just sent me the following:
Dear Igor,
We have recently open-sourced our distributed NMF library at https://github.com/ramkikannan/nmflibrary, which implements different NMF algorithms such as Multiplicative Update (MU), Hierarchical Alternating Least Squares (HALS) and active-set-based ANLS/BPP on MPI and OpenMP. Some recent benchmarking of these algorithms on Oak Ridge Leadership Computing Facility (OLCF) supercomputers is presented at https://arxiv.org/abs/1609.09154. Kindly look at the paper for the topic modeling results on the entirety of Stack Overflow's 12 million questions and for experiments with graph clustering of the biggest sparse matrix, of size 118M, from the Florida Sparse Matrix Collection. Can you please post this information in your communities: the Matrix Factorization Jungle, the compressed sensing Google+ group and other interested communities? Let us know if you are looking for more information.
Regards,
Ramki
Thanks Ramki! Here are a presentation and the preprint Ramki mentions: MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization by Ramakrishnan Kannan, Grey Ballard, Haesun Park
Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors W and H, for a given input matrix A, such that A ≈ WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets.
The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms that iteratively solves alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices of sizes spanning from a few hundred million to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.
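For reference, the Multiplicative Update rule the framework distributes looks like this in a serial toy version (Lee-Seung updates on a small synthetic matrix; the paper's contribution is the MPI parallelization and communication analysis, which this sketch omits):

```python
import numpy as np

rng = np.random.default_rng(8)

# Exactly rank-r non-negative matrix A, to be factored as A ~ W H.
m, n, r = 60, 40, 5
A = rng.random((m, r)) @ rng.random((r, n))

W = rng.random((m, r))
H = rng.random((r, n))
eps = 1e-12                          # guards against division by zero
for _ in range(500):
    # Lee-Seung multiplicative updates: each step keeps W, H non-negative
    # and monotonically decreases ||A - WH||_F.
    H *= (W.T @ A) / (W.T @ W @ H + eps)
    W *= (A @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
print(err)  # small relative reconstruction error
```

The matrix products W.T @ A and A @ H.T are exactly the quantities whose interprocessor communication MPI-FAUN is designed to minimize.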
Thursday, October 13, 2016
Hybrid computing using a neural network with dynamic external memory
Hybrid computing using a neural network with dynamic external memory by Alex Graves, Greg Wayne, Malcolm Reynolds, Tim Harley, Ivo Danihelka, Agnieszka Grabska-Barwińska, Sergio Gómez Colmenarejo, Edward Grefenstette, Tiago Ramalho, John Agapiou, Adrià Puigdomènech Badia, Karl Moritz Hermann, Yori Zwols, Georg Ostrovski, Adam Cain, Helen King, Christopher Summerfield, Phil Blunsom, Koray Kavukcuoglu and Demis Hassabis
Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning, but are limited in their ability to represent variables and data structures and to store data over long timescales, owing to the lack of an external memory. Here we introduce a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory matrix, analogous to the random-access memory in a conventional computer. Like a conventional computer, it can use its memory to represent and manipulate complex data structures, but, like a neural network, it can learn to do so from data. When trained with supervised learning, we demonstrate that a DNC can successfully answer synthetic questions designed to emulate reasoning and inference problems in natural language. We show that it can learn tasks such as finding the shortest path between specified points and inferring the missing links in randomly generated graphs, and then generalize these tasks to specific graphs such as transport networks and family trees. When trained with reinforcement learning, a DNC can complete a moving blocks puzzle in which changing goals are specified by sequences of symbols. Taken together, our results demonstrate that DNCs have the capacity to solve complex, structured tasks that are inaccessible to neural networks without external read–write memory.
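The content-based addressing at the core of the DNC's memory access can be sketched as follows (a toy read operation only, not DeepMind's implementation: the full model adds write heads, usage tracking, and temporal link matrices):

```python
import numpy as np

# Content-based read: compare a key vector against every memory row by
# cosine similarity, turn the similarities into soft weights with a
# sharpness parameter beta, and return the weighted mix of memory rows.
def content_read(M, key, beta):
    sim = (M @ key) / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
    w = np.exp(beta * sim)
    w /= w.sum()
    return w @ M, w

M = np.array([[1.0, 0.0, 0.0],       # a tiny 3-slot external memory
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
read, w = content_read(M, key=np.array([0.9, 0.1, 0.0]), beta=20.0)
print(np.argmax(w))  # -> 0: the row most similar to the key dominates
```

Because the weighting is a softmax, the whole read is differentiable, which is what lets the controller network learn to use its memory by gradient descent.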
Wednesday, October 12, 2016
Paris Machine Learning Meetup #2 Season 4: Emotional AI, Regaind, Health Knowledge....
So today is Paris Machine Learning Meetup #2 of Season 4, and it'll be hosted and sponsored by Vente-privée, woohoo! Tonight we'll be talking about emotional AI, training sets for images in deep learning, health knowledge, ML at Vente-privée... and we will have a new feature: a debate of sorts. The streaming video is below and the presentations should be available before the meetup.
Here is the program so far:
- 18:30 Franck Bardol, Igor Carron, intro (newsletter October 2016)
- 18:45 Clementine Delphin, Startup Weekend Artificial Intelligence (slides pdf version, 2 mins)
- 18:55 Olivier Dolle, Ivan Vukic, Antoine Deblonde (Vente-privée), Machine Learning at Vente-privée,
- 19:15 Frederico Quintao, Health Knowledge Framework for the Web,
In this presentation, I will show a framework for the classification of medical knowledge on the web, and where some current ML approaches to learn public health metrics from query logs/docs fit in the framework. This is somehow tied to my work at Google, where I was the global eng lead of the Health Search team for almost 5 years.
- 19:35 Gregory Renard, Xbrain, "Emotional AI", http://www.people2vec.org/
- 20:00 Arnaud Laurenty, Regaind.io What I wish a ninja data scientist had told me before we started building our datasets :)
Building machine learning models starts with generating a high-quality dataset adapted to your task. Sometimes you get all the data from a service that is already in production, and you mostly need to analyze it and clean it thoroughly. Sometimes you have nothing at all, or you decide that you want to start everything from scratch. At Regaind, we have built an artificial intelligence that understands what photos matter to people in terms of content, action and aesthetic quality. You may try it out by having fun with a virtual photo coach at https://keegan.regaind.io :) Creating our datasets has been a painful, expensive and time-consuming experience. We've paid for 20,000 hours of manual labelling and we've made mistakes along the way. We've worked with employees on fixed-term contracts (CDD), with a crowdsourcing platform, and with offshore partners. In this totally unglamorous talk, we'll provide you with humble feedback on the whole process, hoping that our experience will reduce your pain: cost, speed, quality, timing, legal issues, best practices...
- 20:20 Debate with François Némo, "Pourquoi l'Europe est-elle absente de la guerre des plateformes ?" ("Why is Europe absent from the platform wars?") (pdf version)
Tuesday, October 11, 2016
Gaussian graphical models with skggm - implementation -
Jason just sent me the following:
Hi Igor,
I hope this message finds you well. I saw you recently ventured into the startup world, fun!
I'm writing to let you know about a new Python package that Manjari Narayan and I recently published, and I thought it might be relevant for the Nuit Blanche community. If so, we'd love for you to post about it. Links below.
tour: https://jasonlaska.github.io/skggm/tour
code: https://github.com/jasonlaska/skggm
cheers, Jason
Sure Jason! The page starts with:
Gaussian graphical models with skggm
Graphical models combine graph theory and probability theory to create networks that model complex probabilistic relationships. Inferring such networks is a statistical problem in areas such as systems biology, neuroscience, psychometrics, and finance.
The rest is here.
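To see the problem skggm addresses, here is a numpy-only toy (not skggm's API, whose estimators use penalized maximum likelihood): the sparsity pattern of the precision matrix encodes the conditional-independence graph, recovered here by naive inversion and thresholding of the partial correlations:

```python
import numpy as np

rng = np.random.default_rng(9)

# Known sparse precision matrix for a 3-node chain graph: edges (0,1) and
# (1,2), and conditional independence of nodes 0 and 2 given node 1.
Theta = np.array([[2.0, 0.6, 0.0],
                  [0.6, 2.0, 0.6],
                  [0.0, 0.6, 2.0]])
X = rng.multivariate_normal(np.zeros(3), np.linalg.inv(Theta), size=20000)

# Estimate the precision from data and read off the graph from the
# partial correlations (zero partial correlation <=> missing edge).
Theta_hat = np.linalg.inv(np.cov(X.T))
diag = np.diag(Theta_hat)
partial = -Theta_hat / np.sqrt(np.outer(diag, diag))
edges = np.abs(np.triu(partial, 1)) > 0.1
print(edges.astype(int))  # recovers the chain: edges (0,1) and (1,2) only
```

With many variables and few samples this naive inversion breaks down, which is exactly why sparsity-penalized estimators like those in skggm are needed.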
Jobs: Two Postdocs, Statistical Mechanics of Learning, Institute of Theoretical Physics (IPhT), CEA Saclay, France
Lenka just got an ERC grant. Congratulations! She asked me if I could run this announcement for two postdocs in her group. Yes, I certainly can; here is the announcement:
Opening of two postdoctoral positions on the project Statistical Mechanics of Learning
Scope of the project: Computers are now able to recognize people, to tell a dog from a cat, or to process speech so efficiently that they can answer complicated questions. This was still impossible only a decade ago. This progress is largely due to the development of artificial "deep-learned neural networks". Nowadays, "deep learning" is revolutionizing our lives, prompting an economic battle between internet giants and the creation of a myriad of start-ups, and inspiring many to dream about artificial intelligence in a way that previously appeared only in science fiction novels.
As attractive and performant as this is, however, many agree that deep learning is largely an empirical field that lacks a theoretical understanding of its capacity and limitations. The algorithms used to "train" these networks explore a very complex and non-convex energy landscape that eludes most of the present theoretical methodology in statistics. The behavior of the dynamics in such a complicated "glassy" landscape is, however, similar to the dynamics that have been studied for decades in the physics of disordered systems such as molecular and spin glasses.
In this project we pursue this analogy and use advanced methods from the physics of disordered systems to develop a statistical mechanics approach to deep neural networks. The goal is to bring a theoretical understanding of the principles behind the empirical success of deep neural networks. We use analytic and algorithmic methods (replica, cavity method, message passing) originating in the research of spin glasses, and the physics-based strategy of studying exactly solvable simplified models. We analyze their phase diagrams, associated phase transitions and related algorithmic implications (e.g. hard phases and new algorithms). On the way to our main goal of building a theory of deep learning, we encounter many fascinating problems of current statistics, machine learning, data and network science to which our approach contributes. We also pursue a mathematically rigorous establishment of the methodology. The project is firmly based in statistical physics but flies towards various topics in computer science, signal processing, complexity theory, information theory, machine learning, combinatorics, etc. We are looking for candidates with one of the following backgrounds (or a combination of the two) to join the team and work on one or more of the many sub-problems related to the project.
(1) Strong background (PhD or equivalent) in the statistical physics of disordered systems such as glasses, spin glasses, or interdisciplinary applications. Experience and interest in both analytical (such as the replica and the cavity method) and numerical techniques (message passing, Monte Carlo). Coding lovers with an interest in computer-related issues and/or machine learning are particularly welcome.
(2) Strong background (PhD or equivalent) in fields related to machine learning, information theory, signal processing, data processing, computer science, or statistics, with a strong interest in learning more about methods from statistical mechanics that can be used to treat (albeit sometimes non-rigorously, so far ...) some problems considered intractable in the aforementioned fields.
We offer a two-year postdoctoral contract within the French CNRS, with the standard CNRS salary and benefits (full healthcare coverage for the postdoc and his/her dependents, generous vacations, 16 weeks of fully paid maternity leave, free schooling from age 3, etc.). The group is based at the Institute of Theoretical Physics (IPhT) at CEA Saclay (about 20 km south of Paris, well connected by frequent commuter trains and buses). IPhT is one of the best and largest laboratories of theoretical physics in Europe. The group currently has the PI, 2 PhD students and one postdoc, and is about to grow. We work in close collaboration with Florent Krzakala (ENS Paris) and his group (we have a joint working group, a journal club, and the seminar series Golosino) and with a number of other colleagues in the Parisian area and around the world. The position will start in September 2017 (or slightly later if justified). Interested applicants are invited to send their questions, CV, and a statement of motivation and interest in the SMiLe project to the PI, Lenka Zdeborová. Candidates are expected to have read some of my recent publications to get an idea of the type of work that is expected. Applications will be accepted until November 30, 2016.
Contact: Lenka Zdeborová (lenka.zdeborova@gmail.com), informal inquiries are welcome.
HyperNetworks - implementation -
Did I ever tell you GitXiv is the most awesomest page on the interweb ?....yes, I think I did. Here is one of its latest entries, with links to the code/implementation and to the author's blog, which talks about what he did and much, much more. Go read the entry, I'll wait.
HyperNetworks by David Ha, Andrew Dai, Quoc V. Le
This work explores hypernetworks: an approach of using a small network, also known as a hypernetwork, to generate the weights for a larger network. Hypernetworks provide an abstraction that is similar to what is found in nature: the relationship between a genotype - the hypernetwork - and a phenotype - the main network. Though they are also reminiscent of HyperNEAT in evolution, our hypernetworks are trained end-to-end with backpropagation and thus are usually faster. The focus of this work is to make hypernetworks useful for deep convolutional networks and long recurrent networks, where hypernetworks can be viewed as a relaxed form of weight-sharing across layers. Our main result is that hypernetworks can generate non-shared weights for LSTMs and achieve state-of-the-art results on a variety of language modeling tasks with the Character-Level Penn Treebank and Hutter Prize Wikipedia datasets, challenging the weight-sharing paradigm for recurrent networks. Our results also show that hypernetworks applied to convolutional networks still achieve respectable results for image recognition tasks compared to state-of-the-art baseline models, while requiring fewer learnable parameters.
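The genotype/phenotype idea in the abstract can be sketched in a few lines of numpy (hypothetical sizes and a plain linear hypernetwork, not the paper's architecture): a small network maps a per-layer embedding to the weight matrix of a larger main-network layer, so two layers share the hypernetwork's parameters but receive different weights.

```python
import numpy as np

rng = np.random.default_rng(1)

d_embed, d_in, d_out = 4, 8, 8  # embedding and main-layer sizes (illustrative)

# Hypernetwork parameters: a single linear map from a layer embedding to the
# flattened main-layer weight matrix. In the paper this is trained end-to-end
# with backpropagation; here it is just randomly initialized.
W_hyper = rng.normal(0.0, 0.1, size=(d_in * d_out, d_embed))

def generate_weights(z):
    """Generate a (d_in, d_out) main-network weight matrix from embedding z."""
    return (W_hyper @ z).reshape(d_in, d_out)

# Two main-network layers get distinct weights from their own embeddings,
# even though all weights come from the same small hypernetwork --
# a relaxed form of weight-sharing across layers.
z1, z2 = rng.normal(size=d_embed), rng.normal(size=d_embed)
W1, W2 = generate_weights(z1), generate_weights(z2)

x = rng.normal(size=d_in)
h = np.tanh(x @ W1)   # main network, layer 1
y = np.tanh(h @ W2)   # main network, layer 2
```

Note the parameter count: the hypernetwork has d_in * d_out * d_embed parameters plus one small embedding per layer, which can be far fewer than giving every layer its own full weight matrix once the network is deep.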