Review Articles

Weighted average ensemble for Cholesky-based covariance matrix estimation

Xiaoning Kang

Institute of Supply Chain Analytics, Dongbei University of Finance and Economics, Dalian, People's Republic of China

Zhenguo Gao

School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, People's Republic of China

gaozheng@sjtu.edu.cn; gaozhenguo3@126.com

Xi Liang

School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, People's Republic of China

Xinwei Deng

Department of Statistics, Virginia Tech, Blacksburg, VA, USA

Received 27 May 2024, Accepted 22 Mar 2025, Published online: 09 Apr 2025

The modified Cholesky decomposition (MCD) is an efficient technique for estimating a covariance matrix. However, the MCD typically requires a pre-specified variable ordering in the estimation procedure. In this work, we propose a weighted average ensemble covariance estimate for high-dimensional data based on the MCD. It flexibly accommodates the high-dimensional case and ensures that the resulting estimate is positive definite. Our key idea is to assign different weights to different candidate estimates by minimizing an appropriate risk function with respect to the Frobenius norm. Unlike existing MCD-based ensemble estimation, the proposed method provides a sparse weighting scheme, so that one can identify which variable orderings employed in the MCD are useful for the ensemble estimate. The asymptotic convergence rate of the proposed ensemble estimate is established under regularity conditions. The merits of the proposed method are examined through simulation studies and a portfolio allocation example using real stock data.
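
The idea of combining MCD-based estimates from several variable orderings can be illustrated with a minimal sketch. It is not the authors' procedure. It makes several assumptions: the function names `mcd_covariance` and `ensemble_covariance` are hypothetical, a ridge penalty `lam` stands in for whatever regularizer the paper uses (the abstract does not specify one), and the weights are simply supplied by the caller rather than obtained by minimizing the Frobenius-norm risk with a sparse weighting scheme. The sketch only shows why the ordering matters once the regressions are regularized, and why a convex combination of the candidate estimates stays positive definite.

```python
import numpy as np


def mcd_covariance(X, lam=0.1):
    """MCD-based covariance estimate under the column order of X.

    Each variable is regressed on its predecessors; a ridge penalty
    (an assumption of this sketch) regularizes the regressions.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    T = np.eye(p)          # unit lower-triangular factor
    d = np.empty(p)        # residual (innovation) variances
    d[0] = Xc[:, 0].var()
    for j in range(1, p):
        A, y = Xc[:, :j], Xc[:, j]
        coef = np.linalg.solve(A.T @ A + n * lam * np.eye(j), A.T @ y)
        T[j, :j] = -coef
        d[j] = (y - A @ coef).var()
    # T Sigma T' = D  implies  Sigma = T^{-1} D T^{-T}
    T_inv = np.linalg.inv(T)
    return T_inv @ np.diag(d) @ T_inv.T


def ensemble_covariance(X, perms, weights):
    """Weighted average of MCD estimates over variable orderings.

    Weights are assumed nonnegative and summing to one (user-supplied here).
    """
    p = X.shape[1]
    est = np.zeros((p, p))
    for perm, w in zip(perms, weights):
        sigma_perm = mcd_covariance(X[:, perm])
        inv = np.argsort(perm)                  # map back to the original order
        est += w * sigma_perm[np.ix_(inv, inv)]
    return est


# Small demonstration on simulated data (illustrative values only).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
perms = [rng.permutation(10) for _ in range(5)]
weights = np.full(5, 0.2)                       # uniform weights as a placeholder
sigma_hat = ensemble_covariance(X, perms, weights)
print(np.linalg.eigvalsh(sigma_hat).min() > 0)  # checks positive definiteness
```

Each candidate estimate has the form T^{-1} D T^{-T} with positive diagonal D, so it is positive definite, and a weighted average with nonnegative weights summing to one preserves that property. Setting some weights exactly to zero, as a sparse weighting scheme would, simply drops the corresponding orderings from the average.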

To cite this article: Xiaoning Kang, Zhenguo Gao, Xi Liang & Xinwei Deng (09 Apr 2025): Weighted average ensemble for Cholesky-based covariance matrix estimation, Statistical Theory and Related Fields, DOI: 10.1080/24754269.2025.2484979
To link to this article: https://doi.org/10.1080/24754269.2025.2484979