Variable selection and subgroup analysis for high-dimensional censored data

ISSN 2475-4269

CN 31-2182/O1

Jiangli Wang

School of Artificial Intelligence and Big Data, Hefei University, Hefei, People's Republic of China

Weiping Zhang

Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei, People's Republic of China

zwp@ustc.edu.cn

Pages 211–231 | Received 24 Jul. 2023, Accepted 29 Feb. 2024, Published online: 13 Mar. 2024

Abstract

This paper proposes a penalized method for high-dimensional variable selection and subgroup identification in the Tobit model. Based on Olsen's (1978) convex reparameterization of the Tobit negative log-likelihood, we develop an efficient algorithm for minimizing the objective function by combining the alternating direction method of multipliers (ADMM) and generalised coordinate descent (GCD). We also establish the oracle properties of our proposed estimator under some mild regularity conditions. Furthermore, extensive simulations and an empirical data study are conducted to demonstrate the performance of the proposed approach.
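
For orientation, the reparameterization mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's objective: it assumes the standard Tobit model left-censored at zero, y = max(0, x'β + ε) with ε ~ N(0, σ²), and omits the penalties and subgroup structure, which the abstract does not spell out. The function and variable names (`tobit_nll_olsen`, `delta`, `gamma`) are hypothetical. With δ = β/σ and γ = 1/σ, the negative log-likelihood becomes jointly convex in (δ, γ).

```python
import math


def log_norm_cdf(z):
    # log of the standard normal CDF via erfc.
    # Not robust in the extreme lower tail, where erfc underflows to 0.
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def tobit_nll_olsen(delta, gamma, X, y):
    """Unpenalized Tobit negative log-likelihood (left-censoring at 0,
    an assumption of this sketch) in the parameters
    delta = beta / sigma and gamma = 1 / sigma."""
    if gamma <= 0:
        raise ValueError("gamma = 1 / sigma must be positive")
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = sum(x * d for x, d in zip(x_i, delta))  # x_i' delta
        if y_i > 0:
            # Uncensored: density gamma * phi(gamma * y_i - eta)
            total += (0.5 * math.log(2.0 * math.pi) - math.log(gamma)
                      + 0.5 * (gamma * y_i - eta) ** 2)
        else:
            # Censored: P(latent response <= 0) = Phi(-eta)
            total += -log_norm_cdf(-eta)
    return total


# Toy data, made up for illustration only
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
y = [1.2, 0.0, 0.7]
beta, sigma = [0.3, 0.4], 2.0
nll = tobit_nll_olsen([b / sigma for b in beta], 1.0 / sigma, X, y)
```

Each term is convex in (δ, γ): the uncensored part is a squared linear form minus log γ, and the censored part is the negative logarithm of a log-concave function of a linear form. This is what makes coordinate-wise and ADMM-type schemes attractive for the penalized problem.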

References

Alhamzawi, A. (2020). A new Bayesian elastic net for Tobit regression. Journal of Physics: Conference Series, 1664(1), 012047.
Alhamzawi, R. (2016). Bayesian elastic net Tobit quantile regression. Communications in Statistics-Simulation and Computation, 45(7), 2409–2427. https://doi.org/10.1080/03610918.2014.904341
Amemiya, T. (1973). Regression analysis when the dependent variable is truncated normal. Econometrica: Journal of the Econometric Society, 41(6), 997–1016. https://doi.org/10.2307/1914031
Amemiya, T. (1984). Tobit models: A survey. Journal of Econometrics, 24(1–2), 3–61. https://doi.org/10.1016/0304-4076(84)90074-5
Bondell, H. D., & Reich, B. J. (2008). Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR. Biometrics, 64(1), 115–123. https://doi.org/10.1111/biom.2008.64.issue-1
Boyd, S., Parikh, N., Chu, E., Peleato, B., & Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1), 1–122. https://doi.org/10.1561/2200000016
Bradic, J., Fan, J., & Jiang, J. (2011). Regularization for Cox's proportional hazards model with NP-dimensionality. The Annals of Statistics, 39(6), 3092. https://doi.org/10.1214/11-AOS911
Dagne, G. A. (2016). A growth mixture Tobit model: Application to AIDS studies. Journal of Applied Statistics, 43(7), 1174–1185. https://doi.org/10.1080/02664763.2015.1092114
Everitt, B. (2013). Finite mixture distributions. Springer Science & Business Media.
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
Fan, J., & Lv, J. (2010). A selective overview of variable selection in high dimensional feature space. Statistica Sinica, 20(1), 101–148.
Fan, Y., & Tang, C. (2013). Tuning parameter selection in high dimensional penalized likelihood. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75(3), 531–552. https://doi.org/10.1111/rssb.12001
Gandhi, R. T., Tashima, K. T., Smeaton, L. M., Vu, V., Ritz, J., Andrade, A., Eron, J. J., Hogg, E., & Fichtenbaum, C. J. (2020). Long-term outcomes in a large randomized trial of HIV-1 salvage therapy: 96-week results of AIDS Clinical Trials Group A5241 (OPTIONS). The Journal of Infectious Diseases, 221(9), 1407–1415. https://doi.org/10.1093/infdis/jiz281
Jacobson, T., & Zou, H. (2023). High-dimensional censored regression via the penalized Tobit likelihood. Journal of Business & Economic Statistics, 42(1), 286–297. https://doi.org/10.1080/07350015.2023.2182309
Johnson, B. A. (2009). On lasso for censored data. Electronic Journal of Statistics, 3, 485–506. https://doi.org/10.1214/08-EJS322
Ma, S., & Huang, J. (2017). A concave pairwise fusion approach to subgroup analysis. Journal of the American Statistical Association, 112(517), 410–423. https://doi.org/10.1080/01621459.2016.1148039
Ma, S., Huang, J., Zhang, Z., & Liu, M. (2019). Exploration of heterogeneous treatment effects via concave fusion. The International Journal of Biostatistics, 16(1), 20180026. https://doi.org/10.1515/ijb-2018-0026
Müller, P., & van de Geer, S. (2016). Censored linear model in high dimensions: Penalised linear regression on high-dimensional data with left-censored response variable. Test, 25(1), 75–92. https://doi.org/10.1007/s11749-015-0441-7
Olsen, R. J. (1978). Note on the uniqueness of the maximum likelihood estimator for the Tobit model. Econometrica: Journal of the Econometric Society, 46(5), 1211–1215. https://doi.org/10.2307/1911445
Powell, J. L. (1984). Least absolute deviations estimation for the censored regression model. Journal of Econometrics, 25(3), 303–325. https://doi.org/10.1016/0304-4076(84)90004-6
Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66(336), 846–850. https://doi.org/10.1080/01621459.1971.10482356
Shafer, R. W. (2006). Rationale and uses of a public HIV drug-resistance database. The Journal of Infectious Diseases, 194(s1), S51–S58. https://doi.org/10.1086/jid.2006.194.issue-s1
Shen, J., & He, X. (2015). Inference for subgroup analysis with a structured logistic-normal mixture model. Journal of the American Statistical Association, 110(509), 303–312. https://doi.org/10.1080/01621459.2014.894763
Soret, P., Avalos, M., Wittkop, L., Commenges, D., & Thiébaut, R. (2018). Lasso regularization for left-censored Gaussian outcome and high-dimensional predictors. BMC Medical Research Methodology, 18(1), 1–13. https://doi.org/10.1186/s12874-018-0609-4
Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica: Journal of the Econometric Society, 26(1), 24–36. https://doi.org/10.2307/1907382
Tseng, P. (2001). Convergence of a block coordinate descent method for nondifferentiable minimization. Journal of Optimization Theory and Applications, 109(3), 475–494. https://doi.org/10.1023/A:1017501703105
Wang, H., Li, B., & Leng, C. (2009). Shrinkage tuning parameter selection with a diverging number of parameters. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(3), 671–683. https://doi.org/10.1111/j.1467-9868.2008.00693.x
Wang, H., Li, R., & Tsai, C. L. (2007). Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika, 94(3), 553–568. https://doi.org/10.1093/biomet/asm053
Wang, X., Zhu, Z., & Zhang, H. H. (2019). Spatial automatic subgroup analysis for areal data with repeated measures. arXiv:1906.01853.
Yan, X., Yin, G., & Zhao, X. (2021). Subgroup analysis in censored linear regression. Statistica Sinica, 31(2), 1027–1054.
Yang, Y., & Zou, H. (2013). An efficient algorithm for computing the HHSVM and its generalizations. Journal of Computational and Graphical Statistics, 22(2), 396–415. https://doi.org/10.1080/10618600.2012.680324
Zhang, C. H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942.
Zhao, P., & Yu, B. (2006). On model selection consistency of Lasso. The Journal of Machine Learning Research, 7(90), 2541–2563.
Zhou, X., & Liu, G. (2016). LAD-lasso variable selection for doubly censored median regression models. Communications in Statistics-Theory and Methods, 45(12), 3658–3667. https://doi.org/10.1080/03610926.2014.904357
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735

To cite this article: Yu Zhang, Jiangli Wang & Weiping Zhang (13 Mar 2024): Variable selection and subgroup analysis for high-dimensional censored data, Statistical Theory and Related Fields, DOI: 10.1080/24754269.2024.2327113

To link to this article: https://doi.org/10.1080/24754269.2024.2327113
