Review Articles

Power analysis for stratified cluster randomisation trials with cluster size being the stratifying factor

Jijia Wang,

Department of Statistical Science, Southern Methodist University, Dallas, TX, USA

Song Zhang,

Department of Clinical Sciences, UT Southwestern Medical Center, Dallas, TX, USA

Chul Ahn

Department of Clinical Sciences, UT Southwestern Medical Center, Dallas, TX, USA

Pages 121-127 | Received 01 Apr. 2017, Accepted 22 Jun. 2017, Published online: 18 Jul. 2017


The stratified cluster randomisation trial design is widely employed in biomedical research, and cluster size is frequently used as the stratifying factor. Conventional sample size calculation methods assume that cluster sizes are constant within each stratum, which is rarely true in practice. Ignoring the random variability in cluster size leads to underestimated sample sizes and underpowered clinical trials. In this study, we propose to incorporate the variability in cluster size, represented by the coefficient of variation, directly into the sample size calculation. This approach provides closed-form sample size formulas and is flexible enough to accommodate an arbitrary randomisation ratio and varying numbers of clusters across strata. A simulation study shows that the proposed approach achieves the desired power and type I error over a wide spectrum of design configurations, including different distributions of cluster sizes. An application example is presented.
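To illustrate why ignoring cluster size variability underestimates the sample size, the following is a minimal sketch for a simpler, unstratified two-arm trial with a continuous outcome. It is not the closed-form stratified formula proposed in this article, which the abstract does not state. It assumes the commonly used approximate design effect 1 + ((1 + CV^2) * m - 1) * rho, where m is the mean cluster size, CV the coefficient of variation of cluster size, and rho the intracluster correlation. The function names, the default critical values (two-sided 5% significance, 80% power), and the example inputs are all hypothetical.

```python
import math


def design_effect(mean_size, icc, cv=0.0):
    """Approximate variance inflation due to clustering.

    Assumed formula (not taken from this article):
    1 + ((1 + cv^2) * mean_size - 1) * icc.
    With cv = 0 it reduces to the constant-cluster-size
    design effect 1 + (mean_size - 1) * icc.
    """
    return 1.0 + ((1.0 + cv ** 2) * mean_size - 1.0) * icc


def clusters_per_arm(delta, sd, mean_size, icc, cv=0.0,
                     z_alpha=1.959964, z_beta=0.841621):
    """Clusters needed per arm for a two-arm comparison of means.

    delta: difference in means to detect; sd: outcome standard deviation.
    z_alpha and z_beta default to a two-sided 5% test with 80% power.
    """
    # Subjects per arm under individual randomisation.
    n_individual = 2.0 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    # Inflate by the design effect, then convert subjects to clusters.
    inflated = n_individual * design_effect(mean_size, icc, cv)
    return math.ceil(inflated / mean_size)


if __name__ == "__main__":
    # Hypothetical design: effect size 0.3 SD, mean cluster size 20, ICC 0.05.
    constant = clusters_per_arm(0.3, 1.0, 20, 0.05, cv=0.0)
    variable = clusters_per_arm(0.3, 1.0, 20, 0.05, cv=0.8)
    print("clusters per arm, constant sizes assumed:", constant)
    print("clusters per arm, CV = 0.8:", variable)
```

Under these assumptions, any positive CV raises the design effect, so a calculation that treats cluster sizes as constant asks for fewer clusters than the design with variable sizes needs.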

