Statistics Seminar

R.L. Anderson Lecture - Unlikely Likelihoods

Abstract: The celebrated, and much maligned, method of maximum likelihood is enjoying a modest revival for semiparametric models. For shape-constrained density and regression problems, and for mixture models and compound decision problems, nonparametric maximum likelihood offers an efficient and tuning-parameter-free computational strategy. Reweighted forms of quantile regression that borrow strength across nearby quantiles can also be formulated in likelihood terms. Several variants of these unusual likelihoods will be reviewed, stressing some open problems.

Itinerary for the lecture:

3:30pm - Announcement of the Anderson Award Winners

3:40pm - 'Remembering R.L. Anderson' by Dr. David Allen

3:50pm - Introduction of Dr. Koenker by Dr. Carlos Lamarche

4:00pm - R.L. Anderson Lecture - Unlikely Likelihoods

5:00pm - Event Wrap-Up

Date:

Friday, April 22, 2022 - 03:30 pm - Friday, April 22, 2022 - 05:00 pm

Location:

https://uky.zoom.us/j/83687624347?pwd=eWN1NGFUeUpJbnBoVnFQbXRiZHgvZz09

Event Series:

Statistics Seminar

Genome-wide association and ancestry analysis in admixed populations

Abstract: Admixed populations (such as African Americans and Latinos) are of special interest in human genetics because they allow us to test for both ancestry effects via admixture mapping and genotype effects via association mapping, especially for diseases such as asthma that show racial differences in prevalence and in allele frequencies across populations. Despite advances in asthma care, African Americans are four times more likely to be hospitalized and five times more likely to die from asthma than European Americans. However, most genetic studies have been conducted in European ancestry populations.

In this study, samples from 1600 African American individuals with asthma and 1000 healthy African American controls were genotyped using the Multi-Ethnic Genotyping Array (MEGA), and genotype imputation was carried out with the TOPMed reference panel using the Michigan Imputation Server. Global and local ancestry estimation was carried out using RFMix v2, a powerful discriminative modeling approach. Logistic regression models using the number of copies of local ancestry at each locus and the association between genotype and binary asthma status, each adjusted for covariates, were used for admixture mapping and association mapping, respectively. Joint admixture and association were tested using BMIX, an approach that combines admixture and association statistics at single markers.

Both admixture and association analyses identified novel genetic risk variants associated with asthma, including loci specific to African ancestry. Additional analyses, including variant prioritization and functional annotation, as well as the statistical and clinical challenges and opportunities of mixed ancestry populations in the context of big multi-omics data and racial disparities, will be discussed.

Short Bio: Tesfaye Mersha is an associate professor at the Cincinnati Children’s Hospital Medical Center and University of Cincinnati College of Medicine. His research combines quantitative, ancestry and statistical genomics to unravel genetic and non-genetic contributions to complex diseases and racial disparities in human populations, particularly asthma and asthma-related allergic disorders.

Mersha is a recognized expert in the field of genetic ancestry, race, ethnicity, admixture mapping and mining functional genomic databases related to complex diseases. Among his significant contributions, his team developed AncestrySNPminer, the first web-based bioinformatics tool to retrieve ancestry-informative markers from the genomic databases.

His long-term research goal is to understand and dissect how biologic predisposition and environmental exposures interact to shape racial disparities in complex disorders. Some of his recent work addresses the question: “Do allergy-related readmissions differ by degree of ancestry, and would this association be explained by socio-environmental risk factors rather than direct biologic effects of ancestry on asthma?” In addition, he explores the links between the COVID-19 pandemic, other health conditions, and chronic exposure to air pollution in the context of racial disparities and global variations.

Dr. Mersha has received multiple awards and honors, including a Faculty Research Achievement Award from Cincinnati Children’s Hospital Medical Center, a Keystone Symposia Early Career Investigator Award and an African Professionals Network Business and Professional Achievement Award. His research has been continuously funded by the National Institutes of Health (NIH).

Date:

Friday, March 4, 2022 - 04:00 pm

Location:

https://uky.zoom.us/j/88266643243?pwd=MTVoRzg4Y1BETTllWlNWOXZZVndSdz09

Event Series:

Statistics Seminar

Topological Clustering of Multilayer Networks

Abstract: Multilayer networks continue to gain significant attention in many areas of study, particularly due to their high utility in modeling interdependent systems such as critical infrastructures, the human brain connectome, and socio-environmental ecosystems. However, clustering of multilayer networks, especially using information on higher-order interactions of the system entities, remains in its infancy. We discuss a new topological approach for multilayer network clustering that groups nodes not by pairwise connectivity patterns or relationships between observations recorded at two individual nodes, but by how similar in shape their local neighborhoods are at various resolution scales. We quantify shapes of local node neighborhoods using persistence diagrams and then consider either single linkage or k-means forms of topological clustering, which allows us to systematically account for the important heterogeneous higher-order properties of node interactions within and between network layers and to integrate information from the node neighbors. In the case of topological k-means, we also show that casting it into an empirical risk minimization framework using reproducing kernel Hilbert spaces allows us to derive clustering stability guarantees similar to those for Euclidean k-means, a property that most existing topological clustering methods lack. We illustrate our topological clustering methods in application to climate-insurance and COVID-19 data.

Date:

Friday, April 8, 2022 - 04:00 pm

Location:

MDS 220

Event Series:

Statistics Seminar

Bayesian Registration of Real-Valued Functions

Abstract: In this talk, I will present a Bayesian framework for registration of real-valued functional data. I will introduce function registration and its statistical setup, as well as a series of transformations, developed under a differential geometric framework, that simplify the data and functional parameters. Approximate draws from the posterior distribution are obtained using a novel Markov chain Monte Carlo (MCMC) algorithm which is well suited for estimation of functions. Both simulated and real datasets will be presented to illustrate the proposed approach.

Date:

Friday, April 1, 2022 - 04:00 pm

Location:

MDS 220

Event Series:

Statistics Seminar

Robust Sample Weighting to Facilitate Individualized Treatment Rule Learning for a Target Population

Abstract: We consider a setting where a study or source population for individualized-treatment-rule (ITR) learning can differ from the target population of interest. We assume subject covariates are available from both populations, but treatment and outcome data are only available from the source population. Existing methods use "importance" and/or "overlap" weights to adjust for the covariate differences between the two populations. We develop a general weighting framework that allows a better bias-variance trade-off than existing weights. Our method seeks covariate balance over a non-parametric function class characterized by a reproducing kernel Hilbert space. Our weights encompass the importance weights and overlap weights as special cases. Numerical examples demonstrate that our weights can improve many ITR learning methods for the target population that rely on weighting.

Date:

Friday, March 25, 2022 - 04:00 pm

Location:

Zoom - https://uky.zoom.us/j/82322028704?pwd=RFFOcnFyb1dJdkN2UVRiYUZHWDhGQT09

Event Series:

Statistics Seminar

High-Dimensional General Linear Hypothesis Tests via Spectral Shrinkage

Abstract: In statistics, one of the fundamental inferential problems is to test a general linear hypothesis of regression coefficients under a linear model. The framework includes many well-studied problems such as two-sample tests for equality of population means, MANOVA, and others as special cases. The testing problem is well-studied in the classical multivariate analysis literature but remains underexplored in high-dimensional settings. Various classical invariant tests, despite their popularity in multivariate analysis, involve the inverse of the residual covariance matrix, which is inconsistent or even singular when the dimension is at least comparable to the degrees of freedom. Consequently, classical tests perform poorly and power-enhanced procedures are needed.

In this talk, I seek to regularize the spectrum of the residual covariance matrix by flexible shrinkage functions. A family of rotation-invariant tests is proposed. For illustration purposes, we focus on ridge-type regularization in this talk. The asymptotic normality of the test statistics under the null hypothesis is derived in the setting where dimensionality is comparable to the sample size. The asymptotic power of the proposed test is studied under a class of local alternatives.
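
As a minimal sketch of what ridge-type spectral regularization means, assume the residual covariance matrix is a $p \times p$ matrix $\hat{\Sigma}$ with eigenvalues $\ell_1, \dots, \ell_p$ and orthonormal eigenvectors $v_1, \dots, v_p$. The symbols $\hat{\Sigma}$, $\ell_j$, $v_j$, $f$, and $\lambda$ are notation assumed here for illustration and are not taken from the talk.

```latex
% Classical invariant tests rely on the inverse, which does not exist
% whenever some eigenvalue \ell_j equals zero:
\hat{\Sigma}^{-1} = \sum_{j=1}^{p} \ell_j^{-1} \, v_j v_j^{\top}

% A shrinkage function f acts on the spectrum only and keeps the eigenvectors:
f(\hat{\Sigma}) = \sum_{j=1}^{p} f(\ell_j) \, v_j v_j^{\top}

% The ridge-type choice f(x) = 1/(x + \lambda), with \lambda > 0, gives
(\hat{\Sigma} + \lambda I_p)^{-1} = \sum_{j=1}^{p} \frac{1}{\ell_j + \lambda} \, v_j v_j^{\top}
```

Because $\ell_j + \lambda \ge \lambda > 0$ for every $j$, the regularized inverse is well defined even when $\hat{\Sigma}$ is singular, and because only the eigenvalues are modified, a statistic built from it can remain rotation-invariant.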

Date:

Friday, February 11, 2022 - 03:30 pm

Location:

MDS 220 & https://uky.zoom.us/j/84536644014?pwd=VHhvc1V0U1JWczF6cW1LNTZqMDRxdz09

Event Series:

Statistics Seminar

Mixture representations for likelihood ratio ordered distributions

Abstract: In many statistical applications, subject matter knowledge or theoretical considerations suggest that two distributions should satisfy a stochastic order, with samples from one distribution tending to be larger than those from the other. In these situations, incorporating stochastic order constraints can lead to improved inferences. This talk will introduce mixture representations for distributions satisfying a likelihood ratio order. To illustrate the practical value of the mixture representations, I’ll address the problem of density estimation for likelihood ratio ordered distributions. In particular, I’ll propose a nonparametric Bayesian solution which takes advantage of the mixture representations. The prior distribution is constructed from Dirichlet process mixtures and has large support on the space of pairs of densities satisfying the monotone ratio constraint. With a simple modification to the prior distribution, we can also test the equality of two distributions against the alternative of likelihood ratio ordering. I’ll demonstrate the approach in two biomedical applications.
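
To make the monotone ratio constraint concrete: under the usual convention, a distribution with density f is smaller than one with density g in the likelihood ratio order when g(x)/f(x) is nondecreasing in x. The sketch below only checks that condition numerically on a grid for normal densities. It is not the mixture representation or the Bayesian estimator from the talk, and the helper names `normal_pdf` and `ratio_is_nondecreasing` are hypothetical.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, sd=1.0):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2.0 * pi))


def ratio_is_nondecreasing(f, g, grid, tol=1e-12):
    """Check that g(x) / f(x) is nondecreasing over the sorted grid points.

    This is a grid-based check only (an illustrative assumption), so it
    can miss violations that occur between grid points.
    """
    ratios = [g(x) / f(x) for x in grid]
    return all(b >= a - tol for a, b in zip(ratios, ratios[1:]))


grid = [i / 10.0 for i in range(-50, 51)]  # points from -5 to 5


def f(x):
    return normal_pdf(x, 0.0)


def g(x):
    return normal_pdf(x, 1.0)


def h(x):
    return normal_pdf(x, 0.0, sd=2.0)


shift_ordered = ratio_is_nondecreasing(f, g, grid)    # mean shift
reverse_ordered = ratio_is_nondecreasing(g, f, grid)  # roles swapped
scale_ordered = ratio_is_nondecreasing(f, h, grid)    # variance change
print(shift_ordered, reverse_ordered, scale_ordered)
```

A pure mean shift between equal-variance normals satisfies the order because the ratio is exp(x - 1/2), which increases in x. Changing only the variance does not, since the ratio then decreases for negative x and increases for positive x.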

Date:

Monday, February 7, 2022 - 03:30 pm

Location:

https://uky.zoom.us/j/81175174393?pwd=THcvNFF1ZVNvM3NvallhOERndDFOUT09

Event Series:

Statistics Seminar

Distributional data analysis via quantile functions and its application to modeling digital biomarkers of gait in Alzheimer’s Disease

Abstract: With the advent of continuous health monitoring with wearable devices, users now generate their unique streams of continuous data such as minute-level step counts or heartbeats. Summarizing these streams via scalar summaries often ignores the distributional nature of wearable data and almost unavoidably leads to the loss of critical information. We propose to capture the distributional nature of wearable data via user-specific quantile functions (QF) and use these QFs as predictors in scalar-on-quantile-function-regression (SOQFR). As an alternative approach, we also propose to represent QFs via user-specific L-moments, robust rank-based analogs of traditional moments, and use L-moments as predictors in SOQFR (SOQFR-L). These two approaches provide two mutually consistent interpretations: in terms of quantile levels by SOQFR and in terms of L-moments by SOQFR-L. We also demonstrate how to deal with multi-modal distributional data via Joint and Individual Variation Explained (JIVE) using L-moments. The proposed methods are illustrated in a study of association of digital gait biomarkers with cognitive function in Alzheimer’s disease (AD). Our analysis shows that the proposed methods demonstrate higher predictive performance and attain much stronger associations with clinical cognitive scales compared to simple distributional summaries.
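
For readers unfamiliar with L-moments, the following minimal sketch computes the first four sample L-moments of one user's data stream from the standard unbiased probability-weighted moments. It illustrates only the summary step, not the SOQFR or SOQFR-L regression models themselves. The function name `sample_l_moments` and the toy input are hypothetical.

```python
from math import comb


def sample_l_moments(values):
    """Return the first four sample L-moments (l1, l2, l3, l4).

    Uses the unbiased probability-weighted moments
        b_r = (1/n) * sum_i [C(i-1, r) / C(n-1, r)] * x_(i),
    where x_(1) <= ... <= x_(n) are the order statistics.
    """
    x = sorted(values)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations")

    def b(r):
        # With 0-based index i, comb(i, r) plays the role of C(i-1, r).
        return sum(comb(i, r) * x[i] for i in range(n)) / (n * comb(n - 1, r))

    b0, b1, b2, b3 = (b(r) for r in range(4))
    l1 = b0                                   # location (the mean)
    l2 = 2 * b1 - b0                          # scale
    l3 = 6 * b2 - 6 * b1 + b0                 # unscaled L-skewness
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0     # unscaled L-kurtosis
    return l1, l2, l3, l4


# Toy example: a small symmetric sample (not real wearable data).
l1, l2, l3, l4 = sample_l_moments([5, 1, 4, 2, 3])
print(l1, l2, l3, l4)
```

Each L-moment is a linear combination of order statistics, so it can be read as a weighted summary of the empirical quantile function. That is the sense in which L-moments give an alternative representation of a user-specific QF.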

Date:

Wednesday, January 26, 2022 - 02:00 pm

Location:

https://uky.zoom.us/j/89602947520?pwd=VWZqUGhmNmdNa1BSTnVQKzVaSCtTdz09

Event Series:

Statistics Seminar

Bayesian Inverse Reinforcement Learning for Collective Animal Movement

Abstract: The estimation of the spatio-temporal dynamics of animal behavior processes is complicated by nonlinear interactions among individuals and with the environment. Agent-based methods allow for defining simple rules that generate complex group behaviors, but are statistically challenging to estimate and assume behavioral rules are known a priori. Instead of making simplifying assumptions across all anticipated scenarios, inverse reinforcement learning (IRL) provides inference on the short-term (local) rules governing long-term behavior policies or choices by using properties of a Markov decision process. We use the computationally efficient linearly-solvable Markov decision process (LMDP) to learn the local rules governing collective movement. The estimation of the immediate and long-term behavioral decision costs is done in a Bayesian framework. Basis function smoothing is used to induce smoothness in the costs across the state space. We demonstrate the advantage of the LMDP for estimating dynamics for a classic collective movement agent-based model, the self-propelled particle model. Then, we present the first data application of IRL using the introduced methodology for collective movement of guppies in a tank and estimate trade-offs between social and navigational decisions. Lastly, a brief discussion on the connections to traditional resource selection functions in ecology demonstrates the future potential advantage of LMDPs for inference on behavioral decisions as a result of an accumulation of behavioral costs.

Date:

Thursday, January 20, 2022 - 03:30 pm

Location:

https://uky.zoom.us/j/89799450783?pwd=em80U0ZsWm5hbm9rMGxDT2Q5bVh4dz09

Event Series:

Statistics Seminar

Dr. Mai Zhou Retirement Celebration

Please join us as we celebrate the many contributions of Dr. Mai Zhou on the occasion of his retirement from the University of Kentucky. Dr. Zhou joined the Dr. Bing Zhang Department of Statistics in 1989 after completing his degree at Columbia University and spending time in visiting positions at MIT and UNC-CH. Dr. Zhou served as DGS and DUS during his time in the department, and he developed an international reputation for his work on empirical likelihood, survival analysis, and semiparametrics. He was an early user and advocate of R. For over 30 years, Dr. Zhou has been the consummate professional and colleague.

The celebration will be very informal. After a brief welcome by Dr. Rayens, Dr. Zhou will reflect on what he considers to be the contributions to research, teaching, and R that he is most proud of, or at least had the most fun with. At the conclusion of these remarks, Dr. Regina Liu, Distinguished Professor of Statistics at Rutgers University, will share some memories. Dr. Liu and Dr. Zhou were doctoral classmates at Columbia. Following Dr. Liu’s remarks, we will open the floor to Dr. Zhou’s former students, many of whom will be in attendance, then to former colleagues and friends.

Date:

Friday, December 3, 2021 - 03:00 pm

Location:

Zoom - Link Coming Soon!

Event Series:

Statistics Seminar
