National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
KLATASDS – MOE, School of Statistics, East China Normal University, Shanghai, People's Republic of China
Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada
In the era of big data, divide-and-conquer, parallel, and distributed inference methods have become increasingly popular. How to use the calibration information from each machine effectively in parallel computation has become a challenging problem for statisticians and computer scientists. Many newly developed methods have roots in traditional statistical approaches that make use of calibration information. In this paper, we first review some classical statistical methods for using calibration information, including simple meta-analysis methods, parametric likelihood, empirical likelihood, and the generalized method of moments. We then investigate how these methods incorporate summarized or auxiliary information from previous studies, related studies, or populations. We find that methods based on summarized data usually incur little or almost no efficiency loss compared with the corresponding methods based on the full individual-level data. Finally, we review some recently developed big data analysis methods, including communication-efficient distributed approaches, renewal estimation, and incremental inference, as examples of the latest developments in methods using calibration information.
To cite this article: Jing Qin, Yukun Liu & Pengfei Li (2022): A selective review of statistical methods using calibration information from similar studies, Statistical Theory and Related Fields, DOI: 10.1080/24754269.2022.2096426
To link to this article: https://doi.org/10.1080/24754269.2022.2096426