Goffard N, Deharo S, Walker SM, O’Donnell J, Keating K, McDyer F, Proutski V, Kennedy RD.
In: Quantitative Biology and Bioinformatics in Modern Medicine. Dublin (Ireland); 2011
Large-scale gene expression profiling provides a powerful resource to identify cancer subtypes relevant for prognosis and/or therapy responses. To investigate the underlying molecular classes of tumors, a computational workflow has been developed for the automated integration of clinical information with functional genomics data.
Based on semi-supervised methods, this approach includes three main modules: i) hierarchical clustering of gene expression data with bootstrap sampling, determination of the optimal number of clusters and visualization of the results, ii) statistical analysis of the associations between the observed subtypes and clinical parameters and iii) functional enrichment analysis to determine the biological relevance of these groups. The workflow has been validated by analyzing prostate cancer gene expression data in association with clinical parameters. This made it possible to automatically rediscover previously described tumor subtypes. In addition, the analysis provides evidence of a potential novel subtype.
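As an illustration of the third module, the sketch below computes a functional enrichment p-value for one cluster of genes. It assumes a one-sided hypergeometric test, which is a common choice for enrichment analysis; the abstract does not state which test the workflow uses, and the function name, parameter names and example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_genes_total, n_genes_in_category,
                       n_genes_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    Assumption: enrichment is tested with a one-sided hypergeometric
    test; the original workflow may use a different statistic.

    n_genes_total       -- genes measured on the platform (background)
    n_genes_in_category -- background genes annotated to the category
    n_genes_selected    -- genes characterizing the cluster of interest
    n_overlap           -- selected genes that are also in the category
    """
    max_overlap = min(n_genes_in_category, n_genes_selected)
    # Number of ways to draw the selected gene set from the background.
    denominator = comb(n_genes_total, n_genes_selected)
    # Sum the counts of all draws with at least n_overlap category genes.
    numerator = sum(
        comb(n_genes_in_category, i)
        * comb(n_genes_total - n_genes_in_category, n_genes_selected - i)
        for i in range(n_overlap, max_overlap + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = enrichment_p_value(
        n_genes_total=20000,
        n_genes_in_category=150,
        n_genes_selected=300,
        n_overlap=12,
    )
    print(f"enrichment p-value: {p:.3g}")
```

In practice one p-value of this kind would be computed per functional category and per cluster, and the resulting p-values would then need a multiple-testing correction.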
This integrated workflow, implemented in Perl and with R/Bioconductor statistical packages, can be systematically applied to a wide range of datasets. It thus provides a useful resource for the automated analysis of cancer subtypes based on gene expression profiles.
Tags: Cancer, Gene expression