Nuit Blanche: Streaming algorithms for identification of pathogens and antibiotic resistance potential from real-time MinION (TM) sequencing

Wednesday, July 27, 2016

Streaming algorithms for identification of pathogens and antibiotic resistance potential from real-time MinION (TM) sequencing

About four years ago, I tried to predict the future for August 25, 2030. To do this, I first mentioned The Steamrollers, i.e. technologies that are exponential in nature. Nanopore sequencing was one of them. In the second installment (Predicting the Future: Randomness and Parsimony), I mentioned different algorithms that could help make sense of the data generated by these steamrollers. Streaming was one of them. It is really no surprise: if, as in hyperspectral imaging or nanopore sequencing, you are producing a lot of data, your interest switches from the modeling aspect of things to how it can be helpful now, and how fast. How does it change science? Well, you just need to read the following article:

The main contribution of this article is to demonstrate that despite the higher error rate, it is possible to return clinically actionable information, including species and strain identification, from as few as 500 reads. We achieved this by developing novel approaches that are less sensitive to base-calling errors and which use whatever subset of genome-wide information is observed up to a point in time, rather than a panel of pre-defined markers or genes. For example, the strain typing presence/absence approach relies only on being able to identify homology to genes and also allows for a level of incorrect gene annotation.

Streaming algorithms for identification of pathogens and antibiotic resistance potential from real-time MinION (TM) sequencing by Minh Duc Cao, Devika Ganesamoorthy, Alysha G. Elliott, Huihui Zhang, Matthew A. Cooper and Lachlan J.M. Coin

The recently introduced Oxford Nanopore MinION platform generates DNA sequence data in real-time. This has great potential to shorten the sample-to-results time and is likely to have benefits such as rapid diagnosis of bacterial infection and identification of drug resistance. However, there are few tools available for streaming analysis of real-time sequencing data. Here, we present a framework for streaming analysis of MinION real-time sequence data, together with probabilistic streaming algorithms for species typing, strain typing and antibiotic resistance profile identification. Using four culture isolate samples, as well as a mixed-species sample, we demonstrate that bacterial species and strain information can be obtained within 30 min of sequencing and using about 500 reads, initial drug-resistance profiles within two hours, and complete resistance profiles within 10 h. While strain identification with multi-locus sequence typing required more than 15x coverage to generate confident assignments, our novel gene-presence typing could detect the presence of a known strain with 0.5x coverage. We also show that our pipeline can process over 100 times more data than the current throughput of the MinION on a desktop computer.
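To get a feel for why gene presence/absence typing can work with so little data, here is a minimal Python sketch of the general idea. It is not the authors' implementation. The class name `StreamingStrainTyper`, the `false_hit_rate` parameter, the toy gene sets and the naive per-gene scoring rule are all assumptions made for illustration. Each time a read is found to be homologous to a gene, every candidate strain's score is updated on the spot, and a small error rate leaves room for wrong annotations or spurious hits.

```python
import math


class StreamingStrainTyper:
    """Hypothetical sketch: score candidate strains from a stream of gene hits."""

    def __init__(self, strain_genes, false_hit_rate=0.05):
        # strain_genes maps a strain name to the set of gene ids annotated in it.
        self.strain_genes = {s: frozenset(g) for s, g in strain_genes.items()}
        # Assumed probability that a hit is spurious or the annotation is wrong.
        self.false_hit_rate = false_hit_rate
        self.log_score = {s: 0.0 for s in self.strain_genes}
        self.seen = set()

    def observe(self, gene):
        """Update all strain scores with one gene hit (repeat hits are ignored)."""
        if gene in self.seen:
            return
        self.seen.add(gene)
        for strain, genes in self.strain_genes.items():
            if gene in genes:
                p = 1.0 - self.false_hit_rate
            else:
                p = self.false_hit_rate
            self.log_score[strain] += math.log(p)

    def posterior(self):
        """Normalised strain probabilities under a uniform prior."""
        top = max(self.log_score.values())
        weights = {s: math.exp(v - top) for s, v in self.log_score.items()}
        total = sum(weights.values())
        return {s: w / total for s, w in weights.items()}


# Toy reference data (made up): two strains sharing one gene.
strains = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g4", "g5"},
}
typer = StreamingStrainTyper(strains)

# Gene hits arrive one at a time, as reads come off the sequencer.
for hit in ["g1", "g2", "g3"]:
    typer.observe(hit)

probs = typer.posterior()
best = max(probs, key=probs.get)
print(best, round(probs[best], 4))
```

The shared gene `g1` does not separate the two strains, but every strain-specific hit shifts the balance sharply. A handful of observed genes, far short of full coverage, can therefore already favour one known strain. The result can be queried at any point in the stream, which is the streaming part of the idea.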