Review Articles

A short note on fitting a single-index model with massive data

Rong Jiang

Department of Statistics, College of Science, Donghua University, Shanghai, People's Republic of China

jrtrying@dhu.edu.cn

Yexun Peng

Department of Statistics, College of Science, Donghua University, Shanghai, People's Republic of China

Pages 49-60 | Received 17 Jun. 2021, Accepted 02 Oct. 2022, Published online: 20 Oct. 2022

This paper studies inference for the index coefficient in single-index models with massive datasets. Analysing massive datasets is challenging owing to formidable computational costs or memory requirements. A natural method is the averaging divide-and-conquer approach, which splits the data into several blocks, computes an estimator on each block, and then aggregates the block estimators by averaging. However, this approach places a restriction on the number of blocks. To overcome this limitation, this paper proposes a computationally efficient method that requires only an initial estimator and then successively refines it through multiple rounds of aggregation. The proposed estimator achieves the optimal convergence rate without any restriction on the number of blocks. We present both theoretical analysis and experiments to explore the properties of the proposed method.
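The two ideas in the abstract, block-wise averaging and refinement of an initial estimator through rounds of aggregation, can be illustrated with a minimal sketch. It makes strong simplifying assumptions: an ordinary linear least-squares model stands in for the single-index model, and the refinement is a Newton-type step built from block-wise gradients and Hessians, which is not necessarily the update used in the paper. The function names `ols`, `averaging_dc` and `refine`, and the simulated data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def ols(X, y):
    """Least-squares fit on a single block (stand-in for the block estimator)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def averaging_dc(X, y, n_blocks):
    """Averaging divide-and-conquer: fit each block, then average the estimates."""
    X_blocks = np.array_split(X, n_blocks)
    y_blocks = np.array_split(y, n_blocks)
    estimates = [ols(Xb, yb) for Xb, yb in zip(X_blocks, y_blocks)]
    return np.mean(estimates, axis=0)


def refine(X, y, n_blocks, beta0, n_rounds=1):
    """Refine an initial estimate through rounds of aggregation.

    Assumption: in each round every block reports the gradient and Hessian of
    its squared-error loss at the current estimate; the sums are aggregated
    and one Newton step is taken. Only p-vectors and p-by-p matrices are
    combined across blocks, never the raw data.
    """
    beta = np.asarray(beta0, dtype=float).copy()
    X_blocks = np.array_split(X, n_blocks)
    y_blocks = np.array_split(y, n_blocks)
    for _ in range(n_rounds):
        grad = np.zeros_like(beta)
        hess = np.zeros((beta.size, beta.size))
        for Xb, yb in zip(X_blocks, y_blocks):
            grad += Xb.T @ (Xb @ beta - yb)  # block gradient
            hess += Xb.T @ Xb                # block Hessian
        beta = beta - np.linalg.solve(hess, grad)
    return beta


# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p = 2000, 3
beta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, p))
y = X @ beta_true + 0.1 * rng.normal(size=n)

beta_avg = averaging_dc(X, y, n_blocks=20)          # one-shot averaging
beta_ref = refine(X, y, n_blocks=20, beta0=beta_avg)  # aggregated refinement
beta_full = ols(X, y)                               # full-data benchmark
```

In this linear stand-in, a single refinement round already reproduces the full-data least-squares fit for any number of blocks, because the loss is quadratic. For a nonlinear model such as the single-index model, several rounds would typically be needed, which is the setting the abstract describes.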

To cite this article: Rong Jiang & Yexun Peng (2023) A short note on fitting a single-index model with massive data, Statistical Theory and Related Fields, 7:1, 49-60, DOI: 10.1080/24754269.2022.2135807

To link to this article: https://doi.org/10.1080/24754269.2022.2135807