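Academic Seminar
High-dimensional Regression Analysis with Compositional Data - Associate Professor Xiang Zhan (Peking University)
Title: High-dimensional Regression Analysis with Compositional Data
Speaker: Associate Professor Xiang Zhan (占翔), Peking University
Time: Tuesday, November 28, 2023, 10:00-11:00
Venue: Room 323, Teaching Building 2 (教二楼)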
Organizers: Beijing National Center for Applied Mathematics
Institute for Interdisciplinary Sciences
Contact: Huaying Fang (方华英)
Abstract: Compositional data are common in modern data science. Most existing statistical methods for compositional data analysis are based on a log-ratio transformation, which moves the analysis from the simplex to real space. Under this framework, we first investigate novel statistical methods for reliable and reproducible variable selection with compositional covariates. The second part of this talk is about composition-on-composition regression. When both responses and predictors are compositional, the inventory of statistical analysis tools is surprisingly limited. Motivated by data analysis problems with high-dimensional microbiome compositional data, we propose the Nonparametric Composition-On-Composition (NCOC) regression analysis, which does not require log-ratio transformations and hence can handle excessive zeros in microbiome data. We introduce a penalized estimating equation approach in NCOC to improve its estimation accuracy in high-dimensional settings, and then establish inference procedures to quantify uncertainty in model estimation and prediction. The proposed methods are evaluated using both numerical simulations and real data applications to demonstrate their validity and superiority.
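For readers unfamiliar with log-ratio transformations, the following is a minimal illustrative sketch of one common choice, the centered log-ratio (clr). It is not taken from the talk and is not the speaker's method; the function names `closure` and `clr` and the example counts are assumptions made for illustration. It shows how a composition is mapped from the simplex to real coordinates, and why a zero part is a problem for any log-based transform.

```python
import math

def closure(x):
    """Rescale nonnegative parts so they sum to 1 (a point on the simplex)."""
    total = sum(x)
    if total <= 0:
        raise ValueError("composition must have a positive total")
    return [v / total for v in x]

def clr(x):
    """Centered log-ratio: log of each part minus the mean log over all parts."""
    p = closure(x)
    if any(v == 0 for v in p):
        # log(0) is undefined, so zero parts need special handling
        raise ValueError("clr is undefined when a part is zero")
    logs = [math.log(v) for v in p]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical counts for three taxa; only their relative sizes matter.
counts = [10, 30, 60]
z = clr(counts)            # real-valued coordinates that sum to zero
z_scaled = clr([1, 3, 6])  # same composition at a different total

try:
    clr([0, 5, 5])         # a zero part, common in microbiome data
    zero_failed = False
except ValueError:
    zero_failed = True
```

In this sketch the transform depends only on relative abundances, so rescaling the counts leaves the result unchanged, while a single zero makes it undefined. That is the kind of difficulty the abstract refers to when it says a method without log-ratio transformations can handle excessive zeros.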
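Biography: Xiang Zhan (占翔) is an Associate Professor in the Department of Biostatistics, School of Public Health, and at the Beijing International Center for Mathematical Research, Peking University. In recent years his research has focused on statistical inference in interdisciplinary areas such as biostatistics and statistical genetics. He has served as principal investigator on two research grants from the U.S. National Science Foundation and the U.S. National Institutes of Health, and on one General Program grant from the National Natural Science Foundation of China. He has published nearly 50 research papers in leading international journals in biostatistics and related fields, including JASA, Biometrics, Bioinformatics, and Genetics, about half of them as first or corresponding author.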