This is a follow-up to the recent Sunday Morning Insight on Escaping Feynman's NP-Hard "Map of a Cat": Genomic Sequencing Edition
To recap, in compressive sensing, it's been known for a while that some solutions can be found thanks to l_1 (P or Polynomial time) relaxation of combinatorial problems (NP). In fact, the whole field of compressive sensing took off when people realized one could be on the P side most of the time.
In genome sequencing the latest long read technology have enabled the whole field to transport itself from an NP territory into one where polynomial-time algorithms (P) will do OK. The threshold to cross is about 2K. Here is what we can read from the PacBio technology
Today PacBio released sequence data & results using its new P6-C4 chemistry: >14k average read length in C. elegans: http://t.co/KETvKFZ1eH
— R. Taylor Raborn (@rtraborn) October 15, 2014
When you go in P territory, many things change, here is one:
@sergekoren reduced drosophila assembly from 600,000cpu hours down to ~1000cpu hours using MHAP. #pacbio #pacbioUGMand here is what people say about the Oxford Nanopore technology.
— Kevin Corcoran (@kc31958) October 16, 2014
Figure; From Model Selection with Many More Variables than Observations, presentation by Victoria Stodden
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.