Speakers

Xihong Lin, Harvard University

Biography: Xihong Lin, PhD (Chinese: 林希虹) is Professor and former Chair of the Department of Biostatistics and Coordinating Director of the Program in Quantitative Genomics at the Harvard T. H. Chan School of Public Health, Professor in the Department of Statistics at the Faculty of Arts and Sciences of Harvard University, and Associate Member of the Broad Institute of MIT and Harvard. Dr. Lin’s research interests lie in the development and application of scalable statistical and machine learning methods for the analysis of massive high-throughput data from the genome, exposome and phenome, as well as complex epidemiological, biobank and health data.

Dr. Lin received the MERIT Award (R37) (2007-2015) and the Outstanding Investigator Award (OIA) (R35) (2015-2022) from the National Cancer Institute (NCI). She is the contact PI of the Harvard Analysis Center of the NHGRI Genome Sequencing Program.

Dr. Lin is an elected member of the National Academy of Medicine. She has received several prestigious awards, including the 2002 Mortimer Spiegelman Award from the American Public Health Association and the 2006 Presidents’ Award of the Committee of Presidents of Statistical Societies (COPSS). She is an elected fellow of the American Statistical Association, the Institute of Mathematical Statistics, and the International Statistical Institute. Dr. Lin is the former Chair of COPSS (2010-2012) and a former member of the Committee on Applied and Theoretical Statistics of the National Academy of Sciences. She is the founding chair of the US Biostatistics Department Chair Group and the founding co-chair of the Young Researcher Workshop of the Eastern North American Region (ENAR) of the International Biometric Society. She is the former Coordinating Editor of Biometrics and the founding co-editor of Statistics in Biosciences.

She has served on a large number of committees of many statistical societies and on numerous NIH and NSF review panels.

 

Title: Hypothesis Testing for a Large Number of Composite Nulls in Genome-wide Causal Mediation Analysis

Abstract: In genome-wide epigenetic studies, it is often of scientific interest to assess whether the effect of an exposure on a clinical outcome is mediated through DNA methylation. Statistical inference for causal mediation effects is challenging because one needs to test a large number of composite null hypotheses across the genome. In this paper, we first study the theoretical properties of two commonly used methods for testing causal mediation effects, Sobel's test and the joint significance test. We show that the joint significance test is the likelihood ratio test for the composite null hypothesis of no mediation effect. Both Sobel's test and the joint significance test follow non-standard distributions, and they are overly conservative for testing mediation effects and yield invalid inference in genome-wide epigenetic studies. We propose a novel Divide-Aggregate Composite-null Test (DACT) for the composite null hypothesis of no mediation effect in genome-wide analysis. We show that the DACT method provides valid statistical inference and boosts power for testing mediation effects across the genome. We propose a correction procedure that improves the DACT method using Efron's empirical null method when the exposure-mediator and/or the mediator-outcome association signals are not sparse. Our extensive simulation studies show that the DACT method properly controls type I error rates and outperforms Sobel's test and the joint significance test for genome-wide causal mediation analysis. We applied the DACT method to the Normative Aging Study to identify putative DNA methylation sites that mediate the effect of smoking on lung function. We also developed a computationally efficient R package, DACT, for public use.
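To make the conservativeness of the two classical tests concrete, here is a minimal Python sketch of their textbook forms. It is an illustration under stated assumptions, not the speaker's code and not the DACT R package: the function names are hypothetical, the exposure-mediator estimate (alpha) and the mediator-outcome estimate (beta) are assumed to be approximately normal and independent, and p-values use a normal approximation. The small simulation draws sites where both effects are exactly zero, which is one of the three cases that make up the composite null.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def sobel_test(alpha_hat, se_alpha, beta_hat, se_beta):
    """Sobel's test: z = alpha*beta / first-order delta-method standard error."""
    se_prod = math.sqrt(alpha_hat**2 * se_beta**2 + beta_hat**2 * se_alpha**2)
    z = alpha_hat * beta_hat / se_prod
    return z, two_sided_p(z)


def joint_significance_test(alpha_hat, se_alpha, beta_hat, se_beta):
    """Joint significance (MaxP) test: reject only if both effects are significant."""
    p_alpha = two_sided_p(alpha_hat / se_alpha)
    p_beta = two_sided_p(beta_hat / se_beta)
    return max(p_alpha, p_beta)


def simulate_rejection_rates(n_sites=20000, level=0.05, seed=0):
    """Empirical rejection rates when alpha = beta = 0 at every site.

    Assumption for illustration: unit standard errors, so each estimate
    is itself a standard normal z-statistic.
    """
    rng = random.Random(seed)
    sobel_rejections = 0
    joint_rejections = 0
    for _ in range(n_sites):
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        _, p_sobel = sobel_test(a, 1.0, b, 1.0)
        p_joint = joint_significance_test(a, 1.0, b, 1.0)
        sobel_rejections += p_sobel < level
        joint_rejections += p_joint < level
    return sobel_rejections / n_sites, joint_rejections / n_sites


if __name__ == "__main__":
    sobel_rate, joint_rate = simulate_rejection_rates()
    print(f"Sobel rejection rate at 0.05: {sobel_rate:.4f}")
    print(f"Joint significance rejection rate at 0.05: {joint_rate:.4f}")
```

When both effects are null and the two p-values are independent and uniform, the maximum of the two falls below 0.05 with probability 0.05² = 0.0025, so the joint significance test rejects far less often than the nominal level. Sobel's statistic is never larger in absolute value than the smaller of the two z-statistics, so it rejects no more often than the joint significance test in this setting.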