
Using Functional Data to Reveal Gene-Environment Interaction in Colorectal Cancer

Li Hsu

0 Collaborator(s)

Funding source

National Institutes of Health (NIH)
Colorectal cancer (CRC) is the third most common cancer and the second-leading cause of cancer death in the United States. Both genetic (G) and environmental (E) factors play important roles in CRC. It is thus important to study the interplay between G and E to better understand the etiology of this complex disease. The recent availability of genome-wide genotype data and advances in statistical methods have enabled agnostic genome-wide searches for gene-environment interaction (GxE), which have identified several novel interactions. Despite these successes, limited statistical power remains a primary concern in GxE analysis, as the sample size required to detect interactions is at least four times that required to detect main effects of similar magnitude. This limitation is particularly relevant following stringent correction for multiple tests in genome-wide GxE analysis. Further, despite the potential importance of rare variants in CRC, existing GxE studies focus on common variants. We thus propose to use functional data to inform GxE testing for both common and rare variants across the genome. We will apply novel statistical methods to aggregate interaction signals among a set of G's or a set of E's to increase power. We will leverage the existing resources in the Genetics and Epidemiology of Colorectal Cancer Consortium and the Colon Cancer Family Registry, in which genetic and well-harmonized environmental data are available for close to 40,000 CRC cases and controls. In addition, we are currently building a CRC-specific functional annotation database based on >80 annotation datasets across public databases such as UCSC Genome Browser, ENCODE, NIH Roadmap, GTEx, and TCGA. In Aim 1, we will examine whether common variants (MAF>1%) predicted in silico to have functional importance modify the effect of environmental risk factors on CRC risk.
By prioritizing functional candidates for GxE testing, we will greatly reduce the multiple testing burden that arises from testing millions of SNPs across the genome. In Aim 2, we will perform aggregate association testing to examine interactions between rare variants (MAF<1%) and environmental risk factors for CRC. Functional information will be used to give greater weight to biologically important variants when aggregating interaction signals. In Aim 3, we will test GxE for an aggregated set of environmental variables that capture different components of a single environmental risk factor (e.g., for smoking the components include current/ever/never use, dose, and duration). As these may influence CRC in distinct ways, we will aggregate the interaction signals across components, which will increase power to detect GxE. Overall, the proposed study provides a unique opportunity to detect novel GxE findings for CRC risk, particularly for functionally important or rare variants across the genome. We expect that our findings will help provide a better understanding of the interplay between genetic and environmental factors in CRC development. By identifying carcinogenic mechanisms and, in turn, potential targets for future therapies, these insights can help improve current prevention and treatment strategies for CRC.
