Review Articles

A selective review of statistical methods using calibration information from similar studies

Jing Qin

National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA

Yukun Liu

KLATASDS – MOE, School of Statistics, East China Normal University, Shanghai, People's Republic of China

ykliu@sfs.ecnu.edu.cn

Pengfei Li

Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada

Received 23 Jun. 2022, Accepted 24 Jun. 2022, Published online: 27 Jul. 2022

In the era of big data, divide-and-conquer, parallel, and distributed inference methods have become increasingly popular. Making effective use of the calibration information from each machine in parallel computation has become a challenging task for statisticians and computer scientists. Many newly developed methods have roots in traditional statistical approaches that make use of calibration information. In this paper, we first review some classical statistical methods for using calibration information, including simple meta-analysis methods, parametric likelihood, empirical likelihood, and the generalized method of moments. We further investigate how these methods incorporate summarized or auxiliary information from previous studies, related studies, or populations. We find that methods based on summarized data usually incur little or no efficiency loss compared with the corresponding methods based on individual-level data. Finally, we review some recently developed big data analysis methods, including communication-efficient distributed approaches, renewal estimation, and incremental inference, as examples of the latest developments in methods using calibration information.