Review Articles

Semiparametric fractional imputation using empirical likelihood in survey sampling

Sixia Chen

University of Oklahoma, Oklahoma City, OK, USA

Jae Kwang Kim

Iowa State University, Ames, IA, USA

Pages 69-81 | Received 07 Mar 2017, Accepted 05 May 2017, Published online: 01 Jun 2021

The empirical likelihood method is a powerful tool for incorporating moment conditions in statistical inference. We propose a novel application of the empirical likelihood for handling item non-response in survey sampling. The proposed method takes the form of fractional imputation, but it does not require parametric model assumptions. Instead, only the first moment condition based on a regression model is assumed, and the empirical likelihood method is applied to the observed residuals to obtain the fractional weights. The resulting semiparametric fractional imputation provides √n-consistent estimates for various parameters. Variance estimation is implemented using a jackknife method. Two limited simulation studies are presented to compare several imputation estimators.
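
To make the idea of "empirical likelihood applied to the observed residuals" concrete, the following is a minimal sketch in Python, not the authors' estimator. It assumes the simplest textbook setting: weights w_j maximize the sum of log w_j subject to the weights summing to one and the weighted residuals summing to zero, with survey design weights ignored. The function name `el_weights`, the residual values, and the predicted mean `m_hat` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def el_weights(resid, tol=1e-12, max_iter=200):
    """Empirical likelihood weights under a mean-zero constraint.

    Maximizes sum(log w_j) subject to sum(w_j) = 1 and
    sum(w_j * resid_j) = 0. Illustrative only: no design weights.
    """
    e = np.asarray(resid, dtype=float)
    n = e.size
    if not (e.min() < 0.0 < e.max()):
        raise ValueError("zero must lie strictly inside the residual range")
    # The solution has the form w_j = 1 / (n * (1 + lam * e_j)), where lam
    # solves sum(e_j / (1 + lam * e_j)) = 0. That function is strictly
    # decreasing in lam on (lo, hi), so bisection is sufficient.
    lo = -1.0 / e.max()
    hi = -1.0 / e.min()
    lam = 0.0
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        g = np.sum(e / (1.0 + lam * e))
        if g > 0:
            lo = lam  # root lies to the right
        else:
            hi = lam  # root lies to the left (or at lam)
        if hi - lo < tol:
            break
    return 1.0 / (n * (1.0 + lam * e))


# Hypothetical residuals observed among respondents (sample mean is not zero).
resid = np.array([-1.2, -0.4, 0.3, 0.9, 1.9])
w = el_weights(resid)

# Hypothetical fitted regression mean for one nonrespondent.
m_hat = 2.0
imputed = m_hat + resid           # one imputed value per observed residual
frac_mean = np.sum(w * imputed)   # fractionally weighted mean of the imputations
```

Under these assumptions the fractional weights are positive, sum to one, and reproduce the assumed first moment: the weighted mean of the imputed values equals `m_hat` up to numerical tolerance. Only the mean-zero condition on the residuals is used, with no distributional model for them.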


