Review Articles

Neyman smooth-type goodness-of-fit tests in complex surveys

Yan Lu

Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM, USA

yanlu@unm.edu

Lang Zhou

Kite Pharma Inc., A Gilead Company, Santa Monica, CA, USA

Guoyi Zhang

Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM, USA

Ronald Christensen

Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM, USA

Received 01 Feb. 2025, Accepted 11 Jan. 2026, Published online: 21 Jan. 2026

In this study, we extend Neyman smooth-type goodness-of-fit tests to complex survey settings involving categorical data, by incorporating design-consistent estimators under the survey framework. This extension is implemented through data-driven, nonparametric order selection methods. We examine the asymptotic properties of the proposed estimators and demonstrate, through simulations, that our methods improve statistical power while maintaining strong control over Type I error, particularly in detecting subtle yet systematic differences across categories. We also illustrate the practical utility of our approach using data from the National Youth Tobacco Survey (NYTS).
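As a rough illustration of the ingredients named in the abstract, the following Python sketch computes survey-weighted category proportions, forms smooth-type components against a hypothesized distribution, and picks an order with a penalized criterion. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names, the polynomial basis, the AIC-like penalty of 2, and the `n_eff` argument (a hypothetical effective sample size standing in for a proper design-based covariance estimate) are all assumptions introduced here.

```python
import numpy as np

def weighted_proportions(labels, weights, K):
    """Weighted (Hajek-type) estimate of the K category proportions."""
    totals = np.bincount(labels, weights=weights, minlength=K)
    return totals / totals.sum()

def orthonormal_basis(p0):
    """Columns h_0, ..., h_{K-1}, orthonormal under sum_k p0[k] * f(k) * g(k).

    Built by a QR decomposition of polynomial columns scaled by sqrt(p0);
    the first column is constant (up to sign).
    """
    K = len(p0)
    x = np.linspace(0.0, 1.0, K)                  # category scores (assumed)
    V = np.vander(x, K, increasing=True)          # columns 1, x, x^2, ...
    Q, _ = np.linalg.qr(np.sqrt(p0)[:, None] * V)
    return Q / np.sqrt(p0)[:, None]

def smooth_components(p_hat, p0, n_eff):
    """Smooth-type components; n_eff is a hypothetical effective sample size."""
    H = orthonormal_basis(p0)
    return np.sqrt(n_eff) * (H[:, 1:].T @ (p_hat - p0))

def select_order(components, penalty=2.0):
    """Order q maximizing sum_{j<=q} V_j^2 - penalty * q (illustrative criterion)."""
    q = np.arange(1, len(components) + 1)
    return int(np.argmax(np.cumsum(components**2) - penalty * q)) + 1

# Toy data (made up for illustration): 4 categories, unequal weights.
p0 = np.array([0.2, 0.3, 0.1, 0.4])
labels = np.array([0, 0, 1, 1, 1, 2, 3, 3, 3, 3, 0, 1])
weights = np.array([1.0, 2.0, 1.5, 1.0, 1.0, 3.0, 1.0, 2.0, 1.0, 1.0, 0.5, 1.0])

p_hat = weighted_proportions(labels, weights, K=4)
comps = smooth_components(p_hat, p0, n_eff=len(labels))
q_hat = select_order(comps)
statistic = float(np.sum(comps[:q_hat] ** 2))     # order-q_hat smooth-type statistic
```

When all K-1 components are used, the sum of squared components equals the Pearson chi-squared statistic computed with `n_eff`, so truncating at a selected order concentrates the test on low-order (smooth) departures. A calibrated test under a complex design would additionally need a design-based variance estimate for the weighted proportions, which this sketch omits.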



To cite this article: Yan Lu, Lang Zhou, Guoyi Zhang & Ronald Christensen (21 Jan 2026): Neyman smooth-type goodness-of-fit tests in complex surveys, Statistical Theory and Related Fields, DOI: 10.1080/24754269.2026.2616882
To link to this article: https://doi.org/10.1080/24754269.2026.2616882