In the Sunday Morning Insight "What Happens When You Cross into P Territory?", I mentioned this article on using LSH for genome alignment from long-read technologies (PacBio RS II or Oxford Nanopore MinION):

Assembling Large Genomes with Single-Molecule Sequencing and Locality Sensitive Hashing by Konstantin Berlin, Sergey Koren, Chen-Shan Chin, James Drake, Jane M Landolin, Adam M Phillippy

While assembling the genome is important, with cheap and fast long reads the goalpost is slowly moving toward unsupervised learning over groups of genomes. That type of unsupervised learning can only be enabled by the right dimensionality reduction technique; today, it is MinHash:

```
Many exciting applications for Mash (ANI, metagenome clustering, nanopore triage, ...) Preprint coming soon https://t.co/xXYtsjEuJb
— Adam Phillippy (@aphillippy) August 25, 2015
```

Here is Mash: Fast genome distance estimation using the MinHash algorithm, from Adam Phillippy's group.

Wondering how MinHash works? Check this write-up by Matthew Casperson, MinHash for dummies.
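
For readers who prefer code, here is a minimal sketch of the idea, not Mash's actual implementation. It assumes a bottom-k variant of MinHash over k-mers: each sequence is reduced to its `s` smallest k-mer hashes, and the Jaccard index of the two k-mer sets is estimated from those small sketches alone. The function names, the choice of BLAKE2b as the hash, and the default parameters are all illustrative assumptions. The last function applies the distance formula as I recall it from the Mash preprint, D = -(1/k) ln(2j / (1 + j)), so treat that part as an assumption to check against the paper.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """Map a k-mer to a 64-bit integer (BLAKE2b is an illustrative choice)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=5, s=64):
    """Bottom-k sketch: the s smallest distinct k-mer hashes, sorted."""
    return sorted({hash64(km) for km in kmers(seq, k)})[:s]


def jaccard_estimate(sk_a, sk_b, s=64):
    """Estimate the Jaccard index from two bottom-k sketches.

    Take the s smallest hashes of the merged sketches and count how many
    of them appear in both sketches.
    """
    merged = sorted(set(sk_a) | set(sk_b))[:s]
    if not merged:
        return 0.0
    shared = set(sk_a) & set(sk_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j, k):
    """Distance from a Jaccard estimate j and k-mer size k (assumed formula)."""
    if j == 0:
        return 1.0  # no shared hashes: cap the distance at 1
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


if __name__ == "__main__":
    a = "ACGTACGTGGCTAGCTAGGATCCATGCA"
    b = "ACGTACGTGGCTAGCTAGGATCGATGCA"  # one substitution relative to a
    j = jaccard_estimate(sketch(a), sketch(b))
    print("Jaccard estimate:", j)
    print("Distance:", mash_distance(j, k=5))
```

The point of the sketch size `s` is that comparing two genomes costs a merge of two short sorted lists rather than a full alignment, which is what makes clustering large collections of genomes tractable.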

