Review Articles

An AUC-based multi-kernel weighted support vector machine ensemble algorithm for breast cancer diagnosis

Mushuang Cheng

School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai, People's Republic of China

Lintong Liu

School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai, People's Republic of China

Haixiang Lin

Delft Institute of Applied Mathematics, Delft University of Technology, Delft, The Netherlands; Institute of Environmental Sciences (CML), Leiden University, Leiden, The Netherlands

Guoqiang Wang

School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai, People's Republic of China

guoq_wang@hotmail.com

Received 24 May 2025, Accepted 09 Dec 2025, Published online: 22 Dec 2025

Machine learning algorithms have demonstrated outstanding performance in disease diagnosis. Kernel function selection plays a crucial role in handling the nonlinear structure of the input data. To enhance breast cancer diagnosis, we propose a novel ensemble algorithm, AUC-Ada-𝐿1MKL-WSVM, which integrates Weighted Support Vector Machines (WSVM), AdaBoost, and Multi-Kernel Learning (MKL). The algorithm introduces two main innovations. First, it simultaneously updates the weights of the training samples and the combined kernel function during classification. Second, it uses an AUC-based approach to adjust the training sample weights, which controls the growth rate of the weights of misclassified samples in AdaBoost. Experimental results demonstrate the effectiveness of the method: it achieves an AUC of 97.21% and an accuracy of 97.64% on the WDBC dataset, and an AUC of 97.53% and an accuracy of 97.46% on the WBC dataset. Comparative analysis further confirms that the ensemble algorithm outperforms four benchmark models in classification accuracy.
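
The abstract names three building blocks (a combined kernel, AdaBoost sample reweighting, and AUC) without giving the update rules. As a rough illustration only, the sketch below shows the textbook versions of each: a convex combination of base kernels, one standard AdaBoost reweighting round, and a pairwise AUC computation. The function names (`combined_kernel`, `adaboost_reweight`, `auc_score`) are hypothetical, and the AUC-based weight adjustment proposed in the article is not reproduced here because this page does not specify it.

```python
import math

def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Labels are +1 / -1; tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == -1]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def adaboost_reweight(weights, labels, preds):
    """One standard AdaBoost round (not the article's AUC-based variant).

    Assumes `weights` sum to 1 and labels/predictions are +1 / -1.
    Returns the classifier coefficient alpha and the renormalized weights.
    """
    err = sum(w for w, y, p in zip(weights, labels, preds) if y != p)
    err = min(max(err, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    alpha = 0.5 * math.log((1.0 - err) / err)
    raw = [w * math.exp(-alpha * y * p)
           for w, y, p in zip(weights, labels, preds)]
    total = sum(raw)
    return alpha, [w / total for w in raw]

def combined_kernel(kernels, betas, x, z):
    """Weighted sum of base kernels; betas are assumed non-negative."""
    return sum(b * k(x, z) for b, k in zip(betas, kernels))

def linear(x, z):
    return sum(a * b for a, b in zip(x, z))

def rbf(x, z, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

if __name__ == "__main__":
    labels = [1, 1, -1, -1]
    print(auc_score(labels, [0.9, 0.4, 0.5, 0.1]))
    alpha, w = adaboost_reweight([0.25] * 4, labels, [1, -1, -1, -1])
    print(alpha, w)
    print(combined_kernel([linear, rbf], [0.5, 0.5], (1.0, 0.0), (1.0, 0.0)))
```

In standard AdaBoost the misclassified samples carry half of the total weight after each round, which is the growth that the abstract says the AUC-based adjustment is meant to control.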

To cite this article: Mushuang Cheng, Lintong Liu, Haixiang Lin & Guoqiang Wang (22 Dec 2025): An AUC-based multi-kernel weighted support vector machine ensemble algorithm for breast cancer diagnosis, Statistical Theory and Related Fields, DOI: 10.1080/24754269.2025.2603548

To link to this article: https://doi.org/10.1080/24754269.2025.2603548