The knowledge generated by the genome only makes sense if one can do a good job of figuring out what sorts of proteins are produced by different elements of the DNA. Hence, while GWAS studies are great and the alignment issues in sequencing are becoming easier to handle, the big unknown is now how to connect that information with the zoo of proteins produced by the body. That effort includes sensing the right thing in a very high-dimensional space, the subject of today's paper. Let us hope it guides us into producing better sensors. And yes, as the article says:
"a [Machine Learning] classification problem is equivalent to 1-bit CS "
Without further ado: Sparse Proteomics Analysis - A compressed sensing-based approach for feature selection and classification of high-dimensional proteomics mass spectrometry data, by Tim Conrad, Martin Genzel, Nada Cvetkovic, Niklas Wulkow, Alexander Leichtle, Jan Vybiral, Gitta Kutyniok, Christof Schütte
Motivation: High-throughput proteomics techniques, such as mass spectrometry (MS)-based approaches, produce very high-dimensional data-sets. In a clinical setting one is often interested how MS spectra differ between patients of different classes, for example spectra from healthy patients vs. spectra from patients having a particular disease. Machine learning algorithms are needed to (a) identify these discriminating features and (b) classify unknown spectra based on this feature set. Since the acquired data is usually noisy, the algorithms should be robust to noise and outliers, and the identified feature set should be as small as possible.
Results: We present a new algorithm, Sparse Proteomics Analysis (SPA), based on the theory of Compressed Sensing that allows to identify a minimal discriminating set of features from mass spectrometry data-sets. We show how our method performs on artificial and real-world data-sets.
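To make the "classification is 1-bit CS" remark a little more tangible, here is a minimal sketch in plain Python. It is not the authors' SPA algorithm: it swaps in a generic 1-bit compressed sensing style estimator (correlate the ±1 labels with each feature, then hard-threshold to the largest entries). The synthetic Gaussian "spectra", the function names, the label-flip rate and the feature indices are all assumptions made up for illustration.

```python
import math
import random


def simulate_spectra(n_samples, n_features, support, rng, flip_prob=0.05):
    """Draw Gaussian toy 'spectra' and 1-bit labels from a sparse direction.

    Hypothetical stand-in for real MS data: only features in `support`
    influence the class label.
    """
    weight = 1.0 / math.sqrt(len(support))
    X, y = [], []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        label = 1 if sum(weight * x[j] for j in support) >= 0 else -1
        if rng.random() < flip_prob:  # assumed label noise
            label = -label
        X.append(x)
        y.append(label)
    return X, y


def one_bit_feature_selection(X, y, n_keep):
    """Correlate labels with features, keep the n_keep largest, normalize."""
    n, d = len(X), len(X[0])
    corr = [sum(y[i] * X[i][j] for i in range(n)) / n for j in range(d)]
    top = sorted(range(d), key=lambda j: abs(corr[j]), reverse=True)[:n_keep]
    norm = math.sqrt(sum(corr[j] ** 2 for j in top))
    w_hat = [0.0] * d
    for j in top:
        w_hat[j] = corr[j] / norm
    return sorted(top), w_hat


def classify(x, w_hat):
    """Predict the class as the sign of the sparse linear score."""
    return 1 if sum(a * b for a, b in zip(w_hat, x)) >= 0 else -1


rng = random.Random(0)
true_support = [3, 17, 42]  # hypothetical discriminating features
X, y = simulate_spectra(1000, 100, true_support, rng)
selected, w_hat = one_bit_feature_selection(X, y, n_keep=3)
accuracy = sum(classify(x, w_hat) == t for x, t in zip(X, y)) / len(y)
print(selected, round(accuracy, 3))
```

The point of the toy is only that the class label plays the role of a 1-bit measurement of each spectrum, so recovering a sparse direction from those bits doubles as feature selection and as a classifier. Real spectra are not i.i.d. Gaussian, so this simple estimator should not be expected to carry over unchanged.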
The code accompanying the paper is here: http://medicalbioinformatics.de/index.php?option=com_content&view=article&id=19
1 comment:
The code is not available - that folder only contains a readme file.