Wednesday, September 21, 2016

Paris Machine Learning Meetup, Hors série #2: Scalable Machine Learning with H2O


Tonight, we will hold the Paris Machine Learning Meetup, Hors série #2, on Scalable Machine Learning with H2O. Jiqiong Qiu is co-organizing this meetup with Franck Bardol and Igor Carron. This "Hors série #2" will focus on a few presentations of H2O by Jo-fai Chow and Jakub Háva. The meetup will be hosted and sponsored by Murex.

If you want to try H2O, feel free to follow this installation guide. We provide not only the standalone installation guide but also a Dockerfile with H2O + TensorFlow + Spark.

All the presentations are at:

1. Introduction to Machine Learning with H2O (Joe - 30 mins)
In this talk, I will walk you through our company, our open-source machine learning platform (H2O), and use cases from some of our users. This will be useful for attendees who are not familiar with H2O.

2. Project “Deep Water” (H2O integration with TensorFlow and other deep learning libraries) (Joe - 30 mins)
The “Deep Water” project is about integrating our H2O platform with other open-source deep learning libraries such as TensorFlow from Google. We are also working on a GPU implementation for H2O. In this talk, I will discuss the motivation and potential benefits of this recent project.

3. Sparkling Water 2.0 (Jakub - 45 mins)
Sparkling Water integrates the H2O open source distributed machine learning platform with the capabilities of Apache Spark. It allows users to leverage H2O’s machine learning algorithms within Apache Spark applications via Scala, Python, R or H2O’s Flow GUI, which makes Sparkling Water a great enterprise solution. Sparkling Water 2.0 was built to coincide with the release of Apache Spark 2.0 and introduces several new features. These include the ability to use H2O frames as an Apache Spark SQL data source, transparent integration into Apache Spark machine learning pipelines, the ability to use Apache Spark algorithms via the Flow GUI, and easier deployment of Sparkling Water in a Python environment. In this talk we will introduce the basic architecture of Sparkling Water and provide an overview of the new features available in Sparkling Water 2.0. The talk will also include a live demo showing how to integrate H2O algorithms into Apache Spark pipelines, with no terminal needed!

About Joe : 
Jo-fai (or Joe) is a data scientist at H2O. Before joining H2O, he was in the business intelligence team at Virgin Media, where he developed data products to enable quick and smart business decisions. He also worked (part-time) for Domino Data Lab as a data science evangelist, promoting products via blogging and giving talks at meetups. Joe has a background in water engineering. Before his data science journey, he was an EngD researcher at the STREAM Industrial Doctorate Centre, working on machine learning techniques for drainage design optimization. Prior to that, he was an asset management consultant specializing in data mining and constrained optimization for the utilities sector in the UK and abroad. He also holds an MSc in Environmental Management and a BEng in Civil Engineering.

About Jakub :
Jakub (or “Kuba”) finished his bachelor’s degree in computer science at Charles University in Prague, and is currently finishing his master’s in software engineering there as well. For his bachelor’s thesis, Kuba wrote a small platform for distributed computing of tasks of any type. In his current master’s studies, he is developing a cluster monitoring tool for JVM-based languages, which should make debugging and reasoning about the performance of distributed systems easier using a concept called distributed stack traces. At H2O, Kuba mostly works on the Sparkling Water project.

