P3-12: Integrative Association Studies of Complex Diseases in the Post GWAS Era

PhD Student: Summaira Yasmeen
Supervisors: Prof. Dr. Heike Bickeböller
Department: Genetic Epidemiology

Project Description:
In genetic epidemiology, genome-wide association studies (GWAS) have identified thousands of single nucleotide polymorphisms (SNPs) associated with complex diseases such as lung cancer or rheumatoid arthritis. Genome chips, sequence data, and other -omics data (transcriptome, methylome, metabolome) are on highly different technological and biological scales, both within and across studies. These other -omics data will generally be closer to the phenotype than genomic data.

Many current research developments are driven by the integration of high-dimensional genotyping or sequencing data with information derived from other data on the same samples or from sources external to the study samples to be analyzed. We will investigate genes, biological pathways, or even whole genomic regions. In particular, we will consider networks and interactions, also across other -omics scales. As preparatory work, SNPs have to be assigned to genes, and genes have to be assigned to the pathways under consideration. Then, all SNPs assigned to a gene, or ultimately to a network, will be analyzed together. This greatly reduces the high dimensionality of GWAS and enhances their power by using biological network information. The goal is to integrate gene- and pathway-level information with previous knowledge or simultaneously across scales via kernels, GAMLSS, or other adjustable methods.
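The preparatory steps described above (SNP-to-gene assignment, gene-to-pathway assignment, and joint analysis of a SNP set) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the gene coordinates, pathway memberships, SNP positions, the 500 bp flanking window, and the function names are hypothetical, and the linear kernel is only one simple choice among the kernels that could be used in such an analysis.

```python
from collections import defaultdict

# Hypothetical toy annotation: gene -> (chromosome, start, end) in base pairs.
GENES = {
    "GENE_A": ("1", 1000, 5000),
    "GENE_B": ("1", 8000, 12000),
    "GENE_C": ("2", 2000, 6000),
}

# Hypothetical pathway membership: pathway -> list of genes.
PATHWAYS = {
    "PATHWAY_X": ["GENE_A", "GENE_C"],
    "PATHWAY_Y": ["GENE_B"],
}

# Hypothetical SNPs: SNP id -> (chromosome, position).
SNPS = {
    "rs1": ("1", 1500),
    "rs2": ("1", 5400),  # lies in the flanking window of GENE_A
    "rs3": ("1", 9000),
    "rs4": ("2", 3000),
    "rs5": ("2", 9000),  # intergenic, stays unassigned
}


def assign_snps_to_genes(snps, genes, flank=500):
    """Assign each SNP to every gene whose region (plus flank) contains it."""
    gene_to_snps = defaultdict(list)
    for snp_id, (chrom, pos) in sorted(snps.items()):
        for gene, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start - flank <= pos <= end + flank:
                gene_to_snps[gene].append(snp_id)
    return dict(gene_to_snps)


def assign_snps_to_pathways(gene_to_snps, pathways):
    """Collect, per pathway, the union of SNPs assigned to its member genes."""
    pathway_to_snps = {}
    for pathway, members in pathways.items():
        snp_set = set()
        for gene in members:
            snp_set.update(gene_to_snps.get(gene, []))
        pathway_to_snps[pathway] = sorted(snp_set)
    return pathway_to_snps


def linear_kernel(genotypes, snp_ids):
    """Sample-by-sample similarity matrix over a SNP set.

    genotypes maps sample id -> {SNP id: allele dosage (0, 1, or 2)}.
    Rows and columns follow the sorted sample ids.
    """
    samples = sorted(genotypes)
    return [
        [sum(genotypes[a][s] * genotypes[b][s] for s in snp_ids) for b in samples]
        for a in samples
    ]
```

In this sketch, the five SNPs collapse into three gene-level sets and two pathway-level sets, so the number of units to be tested shrinks, and each set is summarized by a single similarity matrix between samples rather than by one test per SNP.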

On the transcriptome and methylome scales, expression or methylation quantitative trait data (eQT, meQT), which measure the abundance of transcripts (for genes) or methylation (at specific genomic sites), are usually correlated with disease. They are also repeatedly associated with SNPs, yielding so-called eQTL or meQTL for a given locus. Combining such data directly with the SNP data may improve the power to detect causally relevant loci influencing, e.g., the development of cancer.
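To make the notion of an eQTL concrete, the association between a SNP and a transcript can be sketched as an ordinary least-squares regression of expression on allele dosage. This is only an illustrative assumption: the function name and the toy data are hypothetical, covariates are omitted, and the methods developed in the project may differ substantially.

```python
def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on allele dosage (0, 1, or 2).

    A slope clearly different from zero suggests that the SNP acts as an
    eQTL for the transcript; no significance test is performed here.
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have the same length")
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    return sxy / sxx


# Hypothetical toy data: expression rises by one unit per copy of the allele.
toy_dosages = [0, 1, 2, 0, 1, 2]
toy_expression = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
toy_slope = eqtl_slope(toy_dosages, toy_expression)
```

The same form applies to an meQTL if the expression values are replaced by methylation levels at a specific site.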

The most likely application of the statistical methods developed in this project is lung cancer. In this field, strong collaboration already exists with the "Transdisciplinary Research in Cancer of the Lung" (TRICL) Consortium and the International Lung Cancer Consortium (ILCCO). Data from several GWASs worldwide are available upon request. Thus, a particular focus of this project will be on considering networks and interactions using information on other -omics scales, also in the context of the meta-analysis of these studies.