Runze Li, Ph.D.

Principal Investigator, The Methodology Center

Professor, Statistics

Professor, Public Health Sciences

 

The Methodology Center

The Pennsylvania State University

400 Calder Square II

State College, PA 16801

 

814-863-9481

Website

 

 

Education

Ph.D., University of North Carolina at Chapel Hill, 2000 (Statistics)

 

 

Research Interests

My research focuses primarily on variable selection and local modeling; I am also interested in functional data analysis and experimental design.

 

Variable selection is fundamental to statistical modeling. Many approaches in use are stepwise selection procedures, such as best subset variable selection and stepwise backward elimination, which can be computationally expensive and can ignore the stochastic errors introduced during the variable selection process. In my work, I propose new approaches for selecting significant variables in various statistical models. Based on penalized likelihood, these approaches delete insignificant covariates by estimating their coefficients as zero, and therefore select significant variables and estimate parameters simultaneously. We have shown that the proposed approaches have oracle properties; namely, they perform as well as if the correct submodel were known.
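As a minimal sketch of how a penalized estimator can set a coefficient exactly to zero, the Python function below applies the SCAD (smoothly clipped absolute deviation) thresholding rule to a single least-squares coefficient. It assumes an orthonormal design, so that each coefficient can be thresholded separately. The function names, the toy inputs, and the default a = 3.7 are illustrative assumptions; this is not the SAS software listed under Software and Documentation.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding of one least-squares coefficient z.

    Assumes an orthonormal design (illustrative setting), lam >= 0, a > 2.
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    abs_z = abs(z)
    if abs_z <= 2 * lam:
        # Small coefficients: soft-thresholding, which deletes |z| <= lam.
        return soft_threshold(z, lam)
    if abs_z <= a * lam:
        # Moderate coefficients: shrinkage is gradually relaxed.
        sign = 1.0 if z > 0 else -1.0
        return ((a - 1) * z - sign * a * lam) / (a - 2)
    # Large coefficients: left unshrunk.
    return float(z)


if __name__ == "__main__":
    # Toy least-squares estimates (hypothetical values).
    for z in (0.3, 1.5, 3.0, 5.0):
        print(z, scad_threshold(z, lam=1.0))
```

With lam = 1.0, the smallest input is set to zero (the covariate is dropped) while the largest is returned unchanged, which is the behavior that lets selection and estimation happen in one step.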

 

I am also interested in functional data analysis. Functional data are also called curve data; longitudinal data, repeated measurements, and growth curves are special cases of functional data. In my work, local likelihood methods are used for efficient estimation of parameters in various nonparametric regression models used in functional data analysis. Further, generalized likelihood ratio tests are proposed for testing the goodness of fit of these models.
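To illustrate the idea of local modeling, the sketch below fits a local linear regression at a single point by kernel-weighted least squares, which corresponds to a local likelihood fit under the assumption of Gaussian errors. It is a simplified stand-in, not the estimation procedures from the work described above; the function name, the Gaussian kernel, and the bandwidth value are assumptions made for this example.

```python
import math


def local_linear_fit(x, y, x0, bandwidth):
    """Estimate the regression function at x0 by local linear regression.

    Fits a straight line to (x, y) by weighted least squares, with Gaussian
    kernel weights centered at x0, and returns the fitted intercept.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x, y):
        d = xi - x0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    if det == 0:
        raise ValueError("degenerate design: need at least two distinct x values")
    # Intercept of the weighted least-squares line, i.e. the estimate at x0.
    return (s2 * t0 - s1 * t1) / det


if __name__ == "__main__":
    # Toy data on a grid (hypothetical): a smooth curve observed without noise.
    xs = [i / 10 for i in range(11)]
    ys = [math.sin(2 * xi) for xi in xs]
    print(local_linear_fit(xs, ys, x0=0.35, bandwidth=0.2))
```

A smaller bandwidth makes the fit follow the data more closely, and a larger one gives a smoother curve; when the data lie exactly on a straight line, the local linear fit reproduces that line.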

Honors and Awards

 

 

Selected Grants

Center for Prevention and Treatment Methodology

National Institute on Drug Abuse P50 (Renewal)

2010-2015; Role: Principal Investigator

 

New Directions in Statistical Robust Modeling and their Application

National Natural Science Foundation of China, 11028103

2011-2012; Role: Principal Investigator (Co-PI: Hengjian Cui)

 

New Statistical Models for Intensive Longitudinal Data

National Institutes of Health Roadmap R21

2007-2012; Role: Principal Investigator

 

 

Current Projects and Collaborators

I am working on several related projects that apply nonparametric and semiparametric statistical modeling techniques to the analysis of intensive longitudinal data, including ecological momentary assessment data. My collaborators on these projects are Lisa Dierker (Wesleyan University), Xianming Tan (McGill University), Mariya Shiyko (Northeastern University), Anne Buu (University of Michigan), Linda M. Collins, Stephanie Lanza, John Dziak, and Jingyun (Michael) Yang.

 

 

Selected Publications

Peer-Reviewed Articles

Wang, L., Wu, Y., & Li, R. (in press). Quantile regression for analyzing heterogeneity in ultra-high dimension. Journal of the American Statistical Association.

Tan, X., Shiyko, M., Li, R., Li, Y., & Dierker, L. (2012). A time-varying effect model for intensive longitudinal data. Psychological Methods. Advance online publication. doi: 10.1037/a0025814 PMCID: PMC32885.

Wang, Y., Huang, C., Fang, Y., Yang, Q., & Li, R. (2012). Flexible semiparametric analysis of longitudinal genetic studies by reduced rank smoothing. Journal of the Royal Statistical Society, Series C, 61, 1-24.

Shiyko, M. P., Lanza, S. T., Tan, X., Li, R., & Shiffman, S. (2012). Using the time-varying effect model (TVEM) to examine dynamic associations between negative affect and self confidence on smoking urges: Differences between successful quitters and relapsers. Prevention Science. Advance online publication. doi: 10.1007/s11121-011-0264-z PMCID:PMC3171604

Tan, X., Dierker, L., Rose, J., Li, R., & The Tobacco Etiology Research Network (TERN). (2011). How spacing of data collection may impact estimates of substance use trajectories. Substance Use and Misuse, 46(6), 758-768. PMCID: PMC3107528

Zhu, L., Li, L., Li, R., & Zhu, L.-X. (2011). Model-free feature screening for ultrahigh dimensional data. Journal of the American Statistical Association, 106, 1464-1475.

Buu, A., Johnson, N. J., Li, R., & Tan, X. (2011). New variable selection methods for zero-inflated count data with applications to the substance abuse field. Statistics in Medicine, 30, 2326-2340. PMCID: PMC3133860

Zhu, H., Kong, L., Li, R., Styner, M., Gerig, G., Lin, W., & Gilmore, J. H. (2011). FADTTS: Functional analysis of diffusion tensor tract statistics. NeuroImage, 56(3), 1412-1425. PMCID: PMC3085665

Kim, K., Senturk, D., & Li, R. (2011). Recent history functional linear models for sparse longitudinal data. Journal of Statistical Planning and Inference, 141(4), 1554-1566. PMCID: PMC3117473

Wang, Y., Xu, M., Wang, Z., Tao, M., Zhu, J., Li, R., Wang, L., Berceli, S. A., & Wu, R. (2011). How to cluster gene expression dynamics in response to environmental signals. Briefings in Bioinformatics, 13, 162-174. PMCID: PMC3294239

Kai, B., Li, R., & Zou, H. (2011). New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Annals of Statistics, 39, 305-332.

Li, J., Das, K., Fu, G., Li, R., & Wu, R. (2011). The Bayesian LASSO for genome-wide association studies. Bioinformatics, 27, 516-523. PMCID: PMC3105480

Fu, G., Wang, Z., Li, J., Das, K., Li, R., & Wu, L. (2011). Integrating ordinary differential equations into functional mapping of biological rhythms. Journal of Biological Dynamics, 5, 84-101.

Zhu, L., Huang, M., & Li, R. (2011). Semiparametric quantile regression with high dimensional covariates. Statistica Sinica. Advance online publication. doi: 10.5705/ss.2010.199

Dierker, L., Rose, J., Tan, X., & Li, R. (2010). Uncovering multiple pathways to substance use: A comparison of methods for identifying population subgroups. The Journal of Primary Prevention, 31(5-6), 333-348. PMCID: PMC3107529

Feng, Y., Li, R., Sudjianto, A., & Zhang, Y. (2010). Robust neural network with applications to analysis of credit portfolio data. Statistics and Its Interface, 3(4), 437-444.

Ma, Y., & Li, R. (2010). Variable selection in measurement error models. Bernoulli, 16(1), 274-300. PMCID: PMC2832228

Kai, B., Li, R., & Zou, H. (2010). Local CQR smoothing: An efficient and safe alternative to local polynomial regression. Journal of the Royal Statistical Society, Series B, 72, 49-69. PMCID: PMC2958780

Yin, J., Geng, Z., Li, R., & Wang, H. (2010). Nonparametric covariance model. Statistica Sinica, 20, 469-479. PMCID: PMC3002111

Liang, H., Liu, X., Li, R., & Tsai, C.-L. (2010). Estimation and testing for partially linear single-index models. Annals of Statistics, 38, 3811-3836.

Wang, L., Kai, B., & Li, R. (2009). Local rank inference for varying coefficient models. Journal of the American Statistical Association, 104(488), 1631-1645. PMCID: PMC2908045

Liang, H., & Li, R. (2009). Variable selection for partially linear models with measurement errors. Journal of the American Statistical Association, 104(485), 234-248. PMCID: PMC2697854

Wang, L., & Li, R. (2009). Weighted Wilcoxon-type smoothly clipped absolute deviation method. Biometrics, 65(2), 564-571. PMCID: PMC2700846

Collins, L. M., Dziak, J. J., & Li, R. (2009). Design of experiments with multiple independent variables: A resource management perspective on complete and reduced factorial designs. Psychological Methods, 14(3), 202-224. PMCID: PMC2796056

Li, R., & Nie, L. (2008). Efficient statistical inference procedures for partially nonlinear models and their applications. Biometrics, 64, 904-911. PMCID: PMC2679946

Zou, H., & Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models (with discussion). Annals of Statistics, 36, 1509-1566. PMCID: PMC2759727

Li, R., & Liang, H. (2008). Variable selection in semiparametric regression modeling. Annals of Statistics, 36, 261-286. PMCID: PMC2605629

Wang, H., Li, R., & Tsai, C.-L. (2007). Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika, 94, 553-568. PMCID: PMC2663963

Fan, J., Huang, T., & Li, R. (2007). Analysis of longitudinal data with semiparametric estimation of covariance function. Journal of the American Statistical Association, 102, 632-641. PMCID: PMC2730591

Fan, J., & Li, R. (2006). Statistical Challenges with High Dimensionality: Feature Selection in Knowledge Discovery. Proceedings of the International Congress of Mathematicians (M. Sanz-Sole, J. Soria, J.L. Varona, J. Verdera, eds.), Vol. III, European Mathematical Society, Zurich, 595-622.

Qu, A., & Li, R. (2006). Nonparametric modeling and inference function for longitudinal data. Biometrics, 62, 379-391. PMCID: PMC2680010

Zhang, A., Fang, K.-T., Li, R., & Sudjianto, A. (2005). Majorization framework for fractional factorial designs. Annals of Statistics, 33, 2837-2853.

Hunter, D., & Li, R. (2005). Variable selection using MM algorithms. Annals of Statistics, 33, 1617-1642. PMCID: PMC2674769

Cai, J., Fan, J., Li, R., & Zhou, H. (2005). Variable selection for multivariate failure time data. Biometrika, 92, 303-316. PMCID: PMC2674767

Li, R., & Sudjianto, A. (2005). Analysis of computer experiments using penalized likelihood in Gaussian kriging models. Technometrics, 47, 111-120.

Li, R., & Chow, M. (2005). Evaluation of reproducibility for paired functional data. Journal of Multivariate Analysis, 93, 81-101. PMCID: PMC2674768

Fan, J., & Li, R. (2004). New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. Journal of the American Statistical Association, 99, 710-723.

Fan, J., & Li, R. (2002). Variable selection for Cox's proportional hazards model and frailty model. Annals of Statistics, 30, 74-99.

Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348-1360.

Liang, J., Fang, K.-T., Hickernell, F., & Li, R. (2001). Testing multivariate uniformity and its applications. Mathematics of Computation, 70, 337-355.

Cai, Z., Fan, J., & Li, R. (2000). Efficient estimation and inferences for varying coefficient models. Journal of the American Statistical Association, 95, 888-902.

 

Presentations

Li, R. (2011, November). Feature screening via distance correlation learning. Presented at the Department of Operations Research and Financial Engineering, Princeton University.

Li, R. (2011, October). High and ultrahigh dimensional data analysis. Lecture presented at the Institute of Applied Mathematics, Chinese Academy of Sciences, jointly with Capital Normal University and Beijing Normal University.

Li, R. (2011, October). Variable selection and regularization methods. Lecture presented at Capital Normal University, jointly with Beijing Normal University and the Institute of Applied Mathematics, Chinese Academy of Sciences.

Li, R. (2011, August). Sparse quantile regression approach for analyzing heterogeneity in ultrahigh dimension. Paper presented at the Joint Statistical Meetings, Miami, FL.

Li, R. (2011, July). Sparse quantile regression approach for analyzing heterogeneity in ultrahigh dimension. Paper presented at the First Wu Xi International Statistics Forum, Wuxi, P.R. China.

Dziak, J. J., Huang, L., Lanza, S. T., Li, R., Collins, L. M., & Xu, S. (2011, June). Software advances from the Methodology Center at Penn State. Technology demonstration presented at the Society for Prevention Research Annual Meetings, Washington, DC.

Dziak, J. J., Coffman, D. L., Lanza, S. T., & Li, R. (2011, June). Sensitivity and specificity of information criteria for model selection in prevention and psychology datasets. Poster presented at the Society for Prevention Research Annual Meetings, Washington, DC.

Shiyko, M., Lanza, S. T., Tan, X., Shiffman, S., & Li, R. (2011, June).  Between-group differences in temporal dynamics of negative affect, self-confidence, and smoking urges in short-term successful quitters and relapsers: Applications of the model with varying effects (MOVE). In M. Shiyko (Chair), Applications of novel methods for analysis of intensive longitudinal data in studies on drug use. Symposium presented at the Society for Prevention Research Annual Meetings, Washington, DC.

Shiyko, M., Li, R., Lin, J., & Ostroff, J. (2011, June). Joint modeling of longitudinal trajectories and time-to-event analysis in the presence of drop-out. Poster presented at the Society for Prevention Research Annual Meetings, Washington, DC.

Tan, X., Shiyko, M., Li, R., Li, Y., & Dierker, L. (2011, June). Model with varying effects (MOVE) for describing time-varying relationship in covariates in intensive longitudinal data studies: Learning to ask new research questions. Poster presented at the Society for Prevention Research Annual Meetings, Washington, DC.

Li, R. Model free feature screening for high-dimensional data. Presented at Joint Statistical Meetings 2010, Vancouver, BC (2010, August). Department of Statistics, University of Wisconsin at Madison (2010, October).

Collins, L. M., Dziak, J. J., Huang, L., Lanza, S. T., Li, R., & Tan, X. (2010, June). New development of statistical software for prevention research in the Methodology Center. Technology demonstration presented at the annual meeting of the Society for Prevention Research, Denver, CO.

 

Software and Documentation

TVEM SAS Macro Suite (Version 2.0.0) [Software]. (2012). University Park: The Methodology Center, Penn State. Retrieved from http://methodology.psu.edu

Yang, J., Tan, X., Li, R., & Wagner, A. (2012). TVEM (time-varying effect model) SAS macro suite users' guide (Version 2.0.0). University Park: The Methodology Center, Penn State. Retrieved from http://methodology.psu.edu

Li, R., & Tan, X. (2011). TVEM (time-varying effect model) SAS macro users' guide (Version 1.2.0). University Park: The Methodology Center, Penn State. Retrieved from http://methodology.psu.edu

TVEM (time-varying effect model) SAS macro (Version 1.2.0) [Software]. (2011). University Park: The Methodology Center, Penn State. Retrieved from http://methodology.psu.edu

Dziak, J. J., Lemmon, D. R., Li, R., & Huang, L. (2010, May). PROC SCADGLIM User's Guide Version 1.1 beta. University Park: The Methodology Center, Penn State. Available at http://methodology.psu.edu.

Dziak, J. J., Lemmon, D. R., Li, R., & Huang, L. (2010, May). PROC SCADLS User's Guide Version 1.1 beta. University Park: The Methodology Center, Penn State. Available at http://methodology.psu.edu.

Li, R. (2010, January). SAS Macros for Estimation Functional Hierarchical Linear Models (FHLM) Using Local Linear Regression Estimation Procedure: %FHLMLLR Version 1.1. [Software]. University Park: The Methodology Center, Penn State. Available at http://methodology.psu.edu.

Li, R. & Tan, X. (2010, January). %FHLMLLR User's Guide Version 1.1. University Park: The Methodology Center, Penn State. Available at http://methodology.psu.edu.

Li, R. & Tan, X. (2010, January). MOVEPSPline and MOVEBSpline User’s Guide Version 1.1. University Park: The Methodology Center, Penn State. Available at http://methodology.psu.edu.

Tan, X. & Li, R. (2010, January). SAS Macro for Estimation of Model with Varying Effect (MOVE): %MOVEBSpline Version 1.1. [Software]. University Park: The Methodology Center, Penn State. Available at http://methodology.psu.edu.

Tan, X. & Li, R. (2010, January). SAS Macro for Estimation of Model with Varying Effect (MOVE): %MOVEPSpline Version 1.1. [Software]. University Park: The Methodology Center, Penn State. Available at http://methodology.psu.edu.
