Friday, April 19, 2013

Bayesian methods for gene expression factor analysis - implementation -

Here is a new Matrix Factorization technique (which will shortly be added to the Advanced Matrix Factorization Jungle page), from the page:

Bayesian methods for gene expression factor analysis


In the past few years, genomics has received growing interest, particularly since the map of the human genome was completed in the early 2000s. Currently, medical teams are facing a new challenge: processing the signals issued by DNA chips. These signals, often voluminous, reveal the expression level of a gene in a given tissue at any time. For example, detecting or preventing a disease in a group of observed patients based on gene expression levels requires innovative methods for analyzing the information provided by these chips.
One important problem is the identification of the specific temporal gene expression profiles of host response to a pathogen. This problem calls for sensitive methods that can reveal subtle and elusive expression patterns. One particularly promising approach is the family of dimensionality reduction methods that decompose these signals into elementary patterns according to a linear mixing model. For example, PCA, ICA and manifold learning are such approaches. We propose to develop Bayesian decomposition algorithms to identify relevant biomarkers and estimate their levels of expression. The work builds on the SC team's experience in Bayesian statistical modeling and in stochastic simulation methods, which have already been demonstrated in other applications (including hyperspectral imaging). Moreover, when analyzing DNA data, a crucial issue is reducing the dimensionality of the signals to be processed. Consequently, particular attention is devoted to developing appropriate methods for sparse analysis and variable selection.
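To make the linear mixing idea concrete, here is a minimal sketch of a PCA-style decomposition using a plain SVD in NumPy. It is not the method proposed above, only an illustration of splitting a genes-by-samples matrix into a few patterns and their per-sample weights. The function name `pca_decompose` and the synthetic data are assumptions made for this example.

```python
import numpy as np


def pca_decompose(data, n_components):
    """Decompose a (n_genes, n_samples) matrix as mean + patterns @ weights.

    Hypothetical helper for illustration: PCA computed via SVD of the
    row-centered data, keeping the leading n_components directions.
    """
    mean = data.mean(axis=1, keepdims=True)      # per-gene mean across samples
    centered = data - mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    patterns = u[:, :n_components]               # orthonormal gene patterns
    weights = s[:n_components, None] * vt[:n_components]  # per-sample weights
    return mean, patterns, weights


# Synthetic data (an assumption): two hidden patterns mixed linearly, no noise.
rng = np.random.default_rng(0)
true_patterns = rng.normal(size=(100, 2))
true_weights = rng.normal(size=(2, 30))
data = true_patterns @ true_weights

mean, patterns, weights = pca_decompose(data, n_components=2)
reconstruction = mean + patterns @ weights
```

Because the synthetic matrix is built from exactly two patterns, two components are enough to reconstruct it. Note that PCA patterns can take negative values, which is one reason constrained models are attractive for expression data.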
This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high-dimensional assays like gene expression microarrays. The basis for uBLU is a Bayesian model for the data samples which are represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The particularity of the proposed method is that uBLU constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. Furthermore, it also provides estimates of the number of factors. A Gibbs sampling strategy is adopted here to generate random samples according to the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters.
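The constraints described in this abstract can be illustrated with a small simulation. The sketch below only generates data from a mixing model with non-negative factors and factor scores that sum to one for each sample, then checks those constraints. It is not the uBLU Gibbs sampler and does not estimate anything. The gamma and Dirichlet draws, the Gaussian noise, and the names `simulate_mixture` and `satisfies_constraints` are assumptions chosen for illustration, not the priors or code of the paper.

```python
import numpy as np


def simulate_mixture(n_genes=50, n_samples=20, n_factors=3,
                     noise_std=0.01, seed=0):
    """Draw data = factors @ scores + noise under the stated constraints.

    Hypothetical generator: gamma draws keep the factors non-negative and
    Dirichlet draws put each sample's scores on the probability simplex.
    """
    rng = np.random.default_rng(seed)
    # Non-negative gene signatures, one column per factor.
    factors = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_factors))
    # Each column is a probability distribution over the factors.
    scores = rng.dirichlet(np.ones(n_factors), size=n_samples).T
    noise = rng.normal(0.0, noise_std, size=(n_genes, n_samples))
    return factors @ scores + noise, factors, scores


def satisfies_constraints(factors, scores, tol=1e-8):
    """Check non-negativity of both matrices and sum-to-one score columns."""
    nonneg = (factors >= -tol).all() and (scores >= -tol).all()
    sums_to_one = np.allclose(scores.sum(axis=0), 1.0, atol=1e-6)
    return bool(nonneg and sums_to_one)


data, factors, scores = simulate_mixture()
constraints_ok = satisfies_constraints(factors, scores)
```

With scores constrained this way, each sample is a convex combination of the signatures, so a score can be read directly as the share of a sample explained by a given factor.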

Exposure to influenza viruses is necessary, but not sufficient, for healthy human hosts to develop symptomatic illness. The host response is an important determinant of disease progression. In order to delineate host molecular responses that differentiate symptomatic and asymptomatic Influenza A infection, we inoculated 17 healthy adults with live influenza (H3N2/Wisconsin) and examined changes in host peripheral blood gene expression at 16 timepoints over 132 hours. Here we present distinct transcriptional dynamics of host responses unique to asymptomatic and symptomatic infections. We show that symptomatic hosts invoke, simultaneously, multiple pattern recognition receptors-mediated antiviral and inflammatory response that may relate to virus-induced oxidative stress. In contrast, asymptomatic subjects tightly regulate these responses and exhibit elevated expression of genes that function in antioxidant responses and cell-mediated responses. We identify biomarkers whose expression patterns discriminate early from late phases of infection and stratify the risk of developing post-infection symptoms. Our results establish a temporal pattern of host molecular responses that differentiates symptomatic from asymptomatic infections and reveals an asymptomatic host-unique non-passive response signature, suggesting novel putative molecular targets both for prognostic assessment and ameliorative therapeutic intervention in seasonal and pandemic influenza. 
The implementation is located here.
