Yukun Liu (刘玉坤) is currently a professor at the School of Statistics, East China Normal University (ECNU). Before joining ECNU, he received his BSc and PhD degrees from Nankai University in 2003 and 2009, respectively. His research interests include empirical likelihood and semiparametric statistical theory, and their applications in missing data, causal inference, epidemiology, and ecology.
Title: Some progress on hypothesis testing and parameter estimation with non-ignorable missing data
Abstract: Missing data are frequently encountered in various disciplines and can be divided into three categories: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). Data with ignorable missingness (MCAR and MAR) are relatively easy to handle, as all model parameters are generally identifiable. However, MNAR data analyses are much more challenging because of the issue of model non-identifiability. Valid statistical approaches to missing data depend crucially on correct identification of the underlying missingness mechanism.
This talk consists of two parts. In the first part, I will present two score tests for testing whether the missingness mechanism is MAR or MNAR. The implementation of the score tests circumvents the identification issue, as they require only parameter estimation under the null MAR assumption. They are also shown to have certain optimality properties, well-controlled type I errors, and desirable power. In the second part, I will introduce a likelihood-based estimation procedure under a logistic nonmissingness model and a semiparametric regression model. The proposed procedure circumvents the use of inverse probability weighting (IPW) and overcomes the instability and multiple-root issues of commonly used IPW methods for MNAR data analyses. A real data example is analyzed to illustrate the advantages of the proposed method. We conclude that if only a parametric logistic nonmissingness model is assumed but the outcome regression model is left arbitrary, then one has to be cautious in using any of the existing statistical methods in problems involving MNAR data.