Speakers

Huazhen Lin, Southwestern University of Finance and Economics

Biography: Huazhen Lin (Chinese: 林华珍) is a professor at the School of Statistics and the director of the Center of Statistical Research at Southwestern University of Finance and Economics. She received her Ph.D. from Sichuan University, China, in 1999. She is a Fellow of the Institute of Mathematical Statistics, was named a Distinguished Professor of Changjiang Scholars by the Chinese Ministry of Education, and was selected for the Ten Million Talents Project by the Ministry of Human Resources and Social Security. Her other honors include the Special Government Allowance of the State Council, the National Science Fund for Distinguished Young Scholars from the NSFC, and the New Century Excellent Talents Supporting Program of the Chinese Ministry of Education, among others.

 

Her research interests are very broad, including nonparametric methods, survival data analysis, functional data analysis, space-time data analysis, transformation models, latent variable analysis, ROC methods, and capture-recapture data analysis. Her work has been published in top statistics and econometrics journals, such as the Annals of Statistics, Journal of the American Statistical Association, Journal of Econometrics, Journal of the Royal Statistical Society: Series B (Statistical Methodology), Biometrika, and Biometrics, among others.

 

Professor Lin is or has been an associate editor of Biometrics, Journal of Business & Economic Statistics, Scandinavian Journal of Statistics, Canadian Journal of Statistics, Statistics and Its Interface, and Statistical Theory and Related Fields. She is also a member of the editorial boards of Acta Mathematica Sinica (English Series), Chinese Journal of Applied Probability and Statistics, Journal of Systems Science and Mathematical Sciences, and Journal of Applied Statistics and Management, which are key academic research journals in China.

 

Title: Generalized factor model for ultra-high dimensional correlated variables with mixed types

Abstract: As high-dimensional data measured with mixed-type variables gradually become prevalent, it is particularly appealing to represent those mixed-type high-dimensional data using a much smaller set of so-called factors. Because existing methods for factor analysis deal only with continuous variables, in this paper we develop a generalized factor model, together with a corresponding algorithm and theory, for ultra-high dimensional mixed-type variables where both the sample size $n$ and the variable dimension $p$ may diverge to infinity. Specifically, to solve the computational problem arising from the nonlinearity and mixed types, we develop a two-step algorithm in which each update can be carried out in parallel across variables and samples using an existing package. Theoretically, we establish the rate of convergence for the estimators of factors and loadings in the presence of nonlinear structure accompanied by mixed-type variables when both $n$ and $p$ diverge to infinity. Moreover, since the correct specification of the number of factors is crucial to both the theoretical and the empirical validity of factor models, we also develop a criterion based on a penalized loss to consistently estimate the number of factors under the framework of a generalized factor model. To demonstrate the advantages of the proposed method over existing ones, we conducted extensive simulation studies and also applied it to the analysis of the NFBC1966 dataset and a cardiac arrhythmia dataset, resulting in more predictive and interpretable estimators for loadings and factors than the existing factor model.
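
For orientation only, a generalized factor model of the kind described in the abstract can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the abstract: $x_{ij}$ is the value of variable $j$ for sample $i$, $f_i$ is a vector of $q$ latent factors with $q$ much smaller than $p$, $b_j$ is a loading vector, $\mu_j$ is an intercept, and $g_j$ is a link function chosen to suit the type of variable $j$ (for example, identity for a continuous variable, logit for a binary one, log for a count).

$$
g_j\big(\mathrm{E}[x_{ij} \mid f_i]\big) = \mu_j + b_j^{\top} f_i, \qquad i = 1, \dots, n, \quad j = 1, \dots, p.
$$

Under this assumed form, holding the loadings fixed turns the update of each $f_i$ into a separate regression-type problem for each sample, and holding the factors fixed turns the update of each $(\mu_j, b_j)$ into a separate problem for each variable. This is one way to read the abstract's remark that each update can be carried out in parallel across variables and samples.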

 

Key words: Generalized factor model; Nonlinear; Ultra-high dimension; Mixed-type data; Uniform consistency; Convergence rate