Friday, January 11, 2013

It's not a bad reconstruction, just the end of an illusion...

If there is one thing compressive sensing teaches, it is that your assumptions about the unknown are central to how your signal will be reconstructed. For 200 years, we had to minimize energy; since 2004, we have been looking for the sparsest signal, and we may eventually be able to look for additional structure [3, 4, 5]. Here are two examples I gleaned over the past few weeks where the current assumptions are probably not the right ones. Can compressive sensing help?
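To make the point about assumptions concrete, here is a minimal sketch with a single linear measurement of a four-entry signal. The vector `a`, the signal `x_true` and the helper names are all made up for illustration. With only one measurement, infinitely many signals fit the data: the minimum-energy (L2) prior and the minimum-L1 (sparsity-promoting) prior each pick a different one. The numbers are chosen so that the sparse answer happens to coincide with the true signal, which a single measurement does not guarantee in general.

```python
# Toy illustration with hypothetical numbers: one measurement y = a . x of an
# unknown 4-entry signal x. Many x fit the data; the prior decides which one
# we get back.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def min_energy(a, y):
    """Smallest L2-norm x with a . x == y, i.e. x = a * y / (a . a)."""
    scale = y / dot(a, a)
    return [ai * scale for ai in a]

def min_l1(a, y):
    """Smallest L1-norm x with a . x == y.

    For a single constraint, all the weight goes on the entry where |a_i|
    is largest, so the answer has one nonzero entry.
    """
    j = max(range(len(a)), key=lambda i: abs(a[i]))
    x = [0.0] * len(a)
    x[j] = y / a[j]
    return x

a = [1.0, 2.0, 4.0, 1.0]        # hypothetical measurement vector
x_true = [0.0, 0.0, 1.5, 0.0]   # hypothetical sparse signal
y = dot(a, x_true)              # the only thing we observe

x_l2 = min_energy(a, y)  # dense: energy is spread over every entry
x_l1 = min_l1(a, y)      # sparse: a single nonzero entry

print("minimum energy:", [round(v, 3) for v in x_l2])
print("minimum L1    :", x_l1)
```

Both answers reproduce the measurement exactly; they only differ in what they assume about the unknown.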

In the Genetics of Parkinson's Disease, I asked why GWAS studies on Parkinson's did not pick up on the GBA gene. One of the authors of a review paper [1] that aimed to answer specifically that question sent it to me, and here is what they said back in 2008:

....The identification and recognition of this strong association between glucocerebrosidase and PD raises many questions. Why did epidemiologic studies of PD miss this association? Why did genetic linkage studies not pick out the glucocerebrosidase locus on chromosome 1q21 as a Parkinson susceptibility region? Why did the whole genome association studies not identify the locus? Epidemiologists probably missed the association because Gaucher disease is much rarer than PD and the clinical phenotype is usually so different from parkinsonism that it was never considered. Genetic linkage studies would have struggled to identify the locus because of the rare nature of glucocerebrosidase mutations in most datasets with the exception of the Ashkenazi population. Furthermore, none of the mutations seem to be fully penetrant and so they will show only weak evidence for segregation in families with PD. Finally, whole genome association studies apply an overly strict correction for multiple testing and rely on the assumption, incorrect in the case of glucocerebrosidase, that there is a single disease-associated allele at each locus. The existence of the multiple disease-associated allelic variants in the gene candidate could be a general phenomenon for other neurodegenerative disorders. Such heterogeneity has important implications for replication studies that would need to assess a battery of variations in the gene of interest using datasets with homogeneous genetic background. Hence, the glucocerebrosidase example is an illustration of how an important genetic risk factor for a complex disease can evade detection by systematic analysis: it only came onto the radar because of astute clinical observations.

If I understand correctly, an algorithm that looks for a certain type of variant would bin that variant in one category. But taken individually, no category found this way would match the disease phenotype: the matching would not be explanatory. In this case, while every variant is unique, the larger set of all the variants (the group of categories) could be matched globally to the disease phenotype. The reason the disease has several forms (different phenotypes) is probably that the variants act differently in the diverse biochemical networks [2]. And then there is the curious case of autism, or the sparse set of signaling pathways for cancer [6].
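Here is a rough sketch of that reading, with entirely invented carrier counts (they are not data from the review or from any study). It compares a one-sided Fisher exact test run on each variant separately with the same test run on all variants of the gene pooled together. Pooling is a simple collapsing or "burden"-style test that I am substituting for illustration; it is not claimed to be what the GWAS or the review authors used. The variant names and the helper `fisher_one_sided` are hypothetical.

```python
from math import comb

def fisher_one_sided(case_carriers, control_carriers, n_cases, n_controls):
    """One-sided Fisher exact p-value.

    Probability, under no association, that at least `case_carriers` of the
    carriers fall among the cases (hypergeometric upper tail).
    """
    carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    upper = min(carriers, n_cases)
    tail = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / comb(total, carriers)

# Invented cohort: 1000 cases, 1000 controls, five rare variants in one gene.
# Each entry is (carriers among cases, carriers among controls).
N_CASES, N_CONTROLS = 1000, 1000
variants = {
    "variant_A": (6, 1),
    "variant_B": (7, 2),
    "variant_C": (5, 0),
    "variant_D": (6, 1),
    "variant_E": (6, 1),
}

# One test per variant: each allele is judged on its own.
per_variant_p = {
    name: fisher_one_sided(ca, co, N_CASES, N_CONTROLS)
    for name, (ca, co) in variants.items()
}

# One test per gene: carriers of any variant are pooled first.
pooled_cases = sum(ca for ca, _ in variants.values())
pooled_controls = sum(co for _, co in variants.values())
pooled_p = fisher_one_sided(pooled_cases, pooled_controls, N_CASES, N_CONTROLS)

for name, p in per_variant_p.items():
    print(f"{name}: p = {p:.3g}")
print(f"pooled gene-level test: p = {pooled_p:.3g}")
```

In this made-up cohort no single variant stands out, and each would fare even worse after a multiple-testing correction, while the pooled gene-level signal is orders of magnitude stronger.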

The second misunderstood assumption is explained in The Effects of Connection Reconstruction Method on the Interregional Connectivity of Brain Networks via Diffusion Tractography by Longchuan Li, James K. Rilling, Todd M. Preuss, Matthew F. Glasser, and Xiaoping Hu. The abstract reads:

Estimating the interregional structural connections of the brain via diffusion tractography is a complex procedure and the parameters chosen can affect the outcome of the connectivity matrix. Here, we investigated the influence of different connection reconstruction methods on brain connectivity networks. Specifically, we applied three connection reconstruction methods to the same set of diffusion MRI data, initiating tracking from deep white matter (method #1, M1), from the gray matter/white matter interface (M2), and from the gray/white matter interface with thresholded tract volume rather than the connection probability as the connectivity index (M3). Small-world properties, hub identification, and hemispheric asymmetry in connectivity patterns were then calculated and compared across methods. Despite moderate to high correlations in the graph-theoretic measures across different methods, significant differences were observed in small-world indices, identified hubs, and hemispheric asymmetries, highlighting the importance of reconstruction method on network parameters. Consistent with the prior reports, the left precuneus was identified as a hub region in all three methods, suggesting it has a prominent role in brain networks

As Matt says, "And, there are certainly unknown unknowns beyond that."

[1] Gaucher and Parkinson diseases: Unexpectedly related, Ekaterina Rogaeva, John Hardy
