Review Articles

On Kullback-Leibler divergence for medical diagnostics accuracy and cut-point selection criterion under tree or umbrella ordering

Hani M. Samawi

Department of Biostatistics, Epidemiology, and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, GA, USA

hsamawi@georgiasouthern.edu

Marwan Alsharman

Department of Biostatistics, Epidemiology, and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, GA, USA

Jing Kersey

Department of Biostatistics, Epidemiology, and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, GA, USA

Received 23 Jul. 2025, Accepted 23 Apr. 2026, Published online: 08 May 2026

Abstract

Diagnostic testing typically involves two types of classification. Binary tests separate individuals into diseased and non-diseased groups, while multi-class settings, such as tree or umbrella ordering, compare the biomarker levels of one class with those of the other classes. The Kullback-Leibler (KL) divergence, which measures the difference between two distributions, has been considered a valuable index for assessing the diagnostic performance of biomarkers. In this work, we derive and propose the total rule-in and rule-out Kullback-Leibler divergence, TTKL(c), obtained by dichotomizing a continuous biomarker, both as a measure of accuracy and as an optimization criterion for cut-off point selection under tree or umbrella ordering. We establish a connection between the proposed TTKL(c) measure and the extended Youden index, the most widely used criterion for cut-off point selection. We also present theoretical and numerical derivations for scenarios involving a single cut-off point under extended tree ordering, and we represent the KL divergence graphically through the information graph. Using simulation, we conduct a power study that compares the proposed methods with the extended Youden index under tree ordering, and we also compare the accuracy of optimal cut-off selection. This analysis provides insight into the effectiveness of TTKL(c) as a robust criterion for selecting cut-off points in multi-class diagnostic settings. A comprehensive analysis of lung cancer data illustrates the proposed applications.
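To make the idea of dichotomizing a continuous biomarker concrete, the following is a minimal sketch for the simpler binary (two-class) case only. It is not the TTKL(c) measure under tree or umbrella ordering, whose definition is not reproduced on this page. It assumes two normally distributed groups with made-up parameters, and all function names are hypothetical. For each candidate cut-off it computes the Youden index and the sum of the two directed KL divergences between the dichotomized test outcomes of the two groups.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, using only the standard library."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence D(Bernoulli(p) || Bernoulli(q)) in nats."""
    # Clip away from 0 and 1 so the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def measures_at_cutoff(c, mu0=0.0, sd0=1.0, mu1=1.5, sd1=1.0):
    """Youden index and total KL divergence for the test 'positive if X > c'.

    Assumption (illustrative only): non-diseased ~ N(mu0, sd0^2) and
    diseased ~ N(mu1, sd1^2), with larger values indicating disease.
    """
    tpr = 1.0 - normal_cdf(c, mu1, sd1)  # sensitivity
    fpr = 1.0 - normal_cdf(c, mu0, sd0)  # 1 - specificity
    youden = tpr - fpr                   # sensitivity + specificity - 1
    # Sum of both directed divergences between the two dichotomized groups.
    total_kl = bernoulli_kl(tpr, fpr) + bernoulli_kl(fpr, tpr)
    return youden, total_kl


# Grid search over candidate cut-offs from -2.0 to 3.5 in steps of 0.01.
grid = [-2.0 + 0.01 * i for i in range(551)]
best_youden_c = max(grid, key=lambda c: measures_at_cutoff(c)[0])
best_kl_c = max(grid, key=lambda c: measures_at_cutoff(c)[1])
print(f"cut-off maximizing Youden index: {best_youden_c:.2f}")
print(f"cut-off maximizing total KL:     {best_kl_c:.2f}")
```

In this symmetric, equal-variance setup the two criteria select the same cut-off, the midpoint between the two group means. With unequal variances or more than two classes they need not agree, which is the kind of comparison the abstract describes.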

References

  • Benish, W. (2002). The use of information graphs to evaluate and compare diagnostic tests. Methods of Information in Medicine, 41(2), 114–118. https://doi.org/10.1055/s-0038-1634294
  • Bhattacharjee, A., Richards, W. G., Staunton, J., Li, C., Monti, S., Vasa, P., Ladd, C., Beheshti, J., Bueno, R., Gillette, M., Loda, M., Weber, G., Mark, E. J., Lander, E. S., Wong, W., Johnson, B. E., Golub, T. R., Sugarbaker, D. J., & Meyerson, M. (2001). Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proceedings of the National Academy of Sciences, 98(24), 13790–13795. https://doi.org/10.1073/pnas.191502998
  • Bregman, L. M. (1967). The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7(3), 200–217. https://doi.org/10.1016/0041-5553(67)90040-7
  • Guinney, J., Dienstmann, R., Wang, X., de Reyniès, A., Schlicker, A., Soneson, C., Marisa, L., Roepman, P., Nyamundanda, G., Angelino, P., Bot, B. M., Morris, J. S., Simon, I. M., Gerster, S., Fessler, E., Melo, F. D. S. E., Missiaglia, E., Ramay, H., Barras, D., … Homicsko, K. (2015). The consensus molecular subtypes of colorectal cancer. Nature Medicine, 21(11), 1350–1356. https://doi.org/10.1038/nm.3967
  • Hughes, G. (2013). Information graphs for epidemiological applications of the Kullback-Leibler divergence. Methods of Information in Medicine, 53(1), IV–VI.
  • Hughes, G., & Bhattacharya, B. (2013). Symmetry properties of bi-normal and bi-gamma receiver operating characteristic curves are described by Kullback-Leibler divergences. Entropy, 15(4), 1342–1356. https://doi.org/10.3390/e15041342
  • Knottnerus, J. A., & Muris, J. W. (2003). Assessment of the accuracy of diagnostic tests: The cross-sectional study. Journal of Clinical Epidemiology, 56(11), 1118–1128. https://doi.org/10.1016/S0895-4356(03)00206-3
  • Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79–86. https://doi.org/10.1214/aoms/1177729694
  • Lee, W. C. (1999). Selecting diagnostic tests for ruling out or ruling in disease: The use of the Kullback-Leibler distance. International Journal of Epidemiology, 28(3), 521–525. https://doi.org/10.1093/ije/28.3.521
  • Pepe, M. S. (2003). The statistical evaluation of medical tests for classification and prediction. Oxford University Press.
  • Rudin, W. (1976). Principles of mathematical analysis (3rd ed.). McGraw-Hill Book.
  • Sackett, D. L., Haynes, R. B., Guyatt, G. H., & Tugwell, P. (1991). Clinical epidemiology: A basic science for clinical medicine. Little, Brown and Co.
  • Samawi, H., Alsharman, M., Keko, M., & Kersey, J. (2023). Post-test diagnostic accuracy measures under tree ordering of disease classes. Statistics in Medicine, 42(28), 5135–5159. https://doi.org/10.1002/sim.v42.28
  • Samawi, H., Yin, J., Zhang, X., Yu, L., Rochani, H., Vogel, R., & Mo, C. (2020). Kullback-Leibler divergence for medical diagnostics accuracy and cut-point selection criterion: How it is related to the Youden index. Journal of Applied Bioinformatics & Computational Biology, 9(2), 1–10. https://doi.org/10.37532/jabcb.2020.9(2).168
  • Shannon, C. E., & Weaver, W. (1949). The mathematical theory of communication. University of Illinois Press.
  • Soofi, E. S., Ebrahimi, N., & Habibullah, M. (1995). Information distinguishability with application to the analysis of failure data. Journal of the American Statistical Association, 90(430), 657–668. https://doi.org/10.1080/01621459.1995.10476560
  • van der Vaart, A. W. (1998). Asymptotic statistics. Cambridge University Press.
  • Wang, D., Attwood, K., & Tian, L. (2016). Receiver operating characteristic analysis under tree orderings of disease classes. Statistics in Medicine, 35(11), 1907–1926. https://doi.org/10.1002/sim.v35.11
  • Wang, D., Feng, Y., Attwood, K., & Tian, L. (2019). Optimal threshold selection methods under tree or umbrella ordering. Journal of Biopharmaceutical Statistics, 29(1), 98–114. https://doi.org/10.1080/10543406.2018.1489410

To cite this article: Hani M. Samawi, Marwan Alsharman & Jing Kersey (2026) On Kullback-Leibler divergence for medical diagnostics accuracy and cut-point selection criterion under tree or umbrella ordering, Statistical Theory and Related Fields, 10:2, 285-307, DOI: 10.1080/24754269.2026.2665851

To link to this article: https://doi.org/10.1080/24754269.2026.2665851