Review Articles

Semiparametric fractional imputation using empirical likelihood in survey sampling

Sixia Chen

University of Oklahoma, Oklahoma City, OK, USA

Jae Kwang Kim

Iowa State University, Ames, IA, USA

Pages 69-81 | Received 07 Mar. 2017, Accepted 05 May 2017, Published online: 01 Jun. 2021

ABSTRACT

The empirical likelihood method is a powerful tool for incorporating moment conditions in statistical inference. We propose a novel application of empirical likelihood to handling item non-response in survey sampling. The proposed method takes the form of fractional imputation but does not require parametric model assumptions. Instead, only the first moment condition based on a regression model is assumed, and the empirical likelihood method is applied to the observed residuals to obtain the fractional weights. The resulting semiparametric fractional imputation provides √n-consistent estimates for various parameters. Variance estimation is implemented using a jackknife method. Two limited simulation studies are presented to compare several imputation estimators.
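As a rough illustration of the idea summarized in the abstract, the following Python sketch computes empirical likelihood weights on observed residuals and uses them as fractional weights for imputed values. It is not the authors' implementation. The working regression (least squares through the origin), the single moment constraint (weighted residuals sum to zero), the equal-probability sample, and all function names (`el_weights`, `fit_slope`, `fractional_impute`, `weighted_mean`) are assumptions made for this example, and the jackknife variance step is not shown.

```python
def el_weights(residuals, tol=1e-12, max_iter=200):
    """Weights w_j maximizing sum(log w_j) subject to
    sum(w_j) = 1 and sum(w_j * e_j) = 0 (illustrative constraint)."""
    e = list(residuals)
    r = len(e)
    e_min, e_max = min(e), max(e)
    if not (e_min < 0.0 < e_max):
        raise ValueError("zero must lie strictly inside the residual range")

    def score(lam):
        # Derivative of the profile log-likelihood in the multiplier lam.
        return sum(ej / (1.0 + lam * ej) for ej in e)

    # score() is strictly decreasing on the interval where every
    # 1 + lam * e_j stays positive, so bisection finds its root.
    lo, hi = -1.0 / e_max, -1.0 / e_min
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return [1.0 / (r * (1.0 + lam * ej)) for ej in e]


def fit_slope(x, y):
    """Least-squares slope for the assumed working model E(y | x) = beta * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def fractional_impute(x, y):
    """y[i] is None for nonrespondents.
    Returns rows of (unit index, value, fractional weight)."""
    resp = [i for i, yi in enumerate(y) if yi is not None]
    beta = fit_slope([x[i] for i in resp], [y[i] for i in resp])
    resid = [y[i] - beta * x[i] for i in resp]
    w = el_weights(resid)
    rows = []
    for i, yi in enumerate(y):
        if yi is not None:
            rows.append((i, yi, 1.0))  # observed value keeps full weight
        else:
            # one imputed value per observed residual, weighted by EL
            rows.extend((i, beta * x[i] + ej, wj) for ej, wj in zip(resid, w))
    return rows


def weighted_mean(rows, n, g=lambda v: v):
    """Fractionally imputed estimate of the mean of g(y) over n units."""
    return sum(wt * g(val) for _, val, wt in rows) / n


# Toy data (made up for illustration): units 3 and 5 are nonrespondents.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, None, 4.6, None]
rows = fractional_impute(x, y)
print(weighted_mean(rows, len(x)))                                   # mean of y
print(weighted_mean(rows, len(x), g=lambda v: float(v < 3.0)))       # P(y < 3)
```

Because the same imputed data set with fractional weights serves for the mean, a proportion, or another smooth function of y, a single imputation step can support several parameters, which is the practical appeal of fractional imputation described in the abstract.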


