
STATISTICS AND BIOSTATISTICS COLLOQUIUM SERIES

Higher-Order Asymptotics for Repeated Measures Analysis under General Conditions

 

Speaker:
Dr. Solomon Harrar
University of Montana, Associate Professor
 
 
Abstract:
Repeated measures designs are commonly used in medical, sociological and behavioral research. There are several methods for analyzing data arising from such designs, but most of them assume parametric models for the data. In many cases, the assumed parametric models are either obviously violated or difficult to verify.
Available nonparametric methods are based on first-order asymptotics. In this talk, second-order asymptotic results valid under general conditions will be presented. It will be shown that no distributional assumptions are needed beyond the existence of the first few moments, and that no assumptions on the covariance structure are required.
An unavoidable consequence of repeated measures designs is the occurrence of missing data. An extension of the second-order asymptotic results to the case of missing observations will also be discussed. Simulation results will be presented to show the finite-sample performance of the results. A real data example from a smoking cessation trial will be used to illustrate their application.
 
April 2, 2013
4:00-5:00 p.m.
MDS 220
 
Refreshments: 3:30 in MDS 312
Location:
MDS 220

A Coalescent Method to Search for Quantitative Trait Loci in Genome-Wide Association Studies

Abstract:

In human genetics, many quantitative traits, such as blood pressure, are thought to be influenced by particular genes, but are also affected by environmental factors, making the associated genes difficult to identify and locate from genetic data alone. For this reason, it is difficult to detect and localize single nucleotide polymorphisms (SNPs) associated with quantitative traits in genome-wide association study (GWAS) data using classical statistics. I will present a coalescent approach to search for SNPs associated with quantitative traits in GWAS data by taking into account the evolutionary history among SNPs, and evaluate its performance using simulated data. Results from applying the methodology to a real data set, in a search for SNPs associated with high-density lipoprotein cholesterol in mice, will also be presented. By combining methods from stochastic processes and phylogenetics, this work provides an innovative avenue for the development of new statistical methodology in statistical genetics.

February 15, 2013
4:00-5:00 p.m.
MDS 220
Refreshments: 3:30 in MDS 312
 
 
Location:
MDS 220

Maximum Entropy Summary Trees

Abstract:
We present a method for summarizing and visualizing large, tree-structured data. Many data sets can be represented by a rooted, node-weighted tree, where the weights represent some attribute of interest for each node; examples include a company organizational chart, clicks on webpages, flows to and from IP addresses, and hard disk file structures. If such a tree has thousands (or millions) of nodes, it is difficult to visualize on a single sheet of paper or computer screen. We define a way to aggregate the weights of a large, n-node tree into a smaller k-node “summary tree” (where k is something like 50 or 100), and we present a dynamic programming algorithm to compute the summary tree with maximum entropy among all summary trees of a given size, where the entropy of a node-weighted tree is defined as the entropy of the discrete probability distribution whose probabilities are the normalized node weights. We discuss and provide examples of how this algorithm produces useful visualizations, and may also be optimal for certain kinds of data analysis tasks. The talk will be heavy on visualization techniques, but I will also spend some time discussing statistical issues related to hierarchical data.

This is joint work with Howard Karloff.
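
The entropy definition in the abstract can be illustrated with a minimal sketch. The function name, the example weights, and the choice of base-2 logarithms (bits) are assumptions made here for illustration; the abstract does not specify a logarithm base, and this sketch does not implement the dynamic programming algorithm itself.

```python
import math

def tree_entropy(weights):
    """Entropy, in bits, of the distribution given by normalized node weights.

    `weights` is a flat list of non-negative node weights; the tree shape
    does not enter the entropy, only the weights do.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    entropy = 0.0
    for w in weights:
        if w > 0:  # zero-weight nodes contribute nothing
            p = w / total
            entropy -= p * math.log2(p)
    return entropy

# Hypothetical 4-node summaries: evenly spread weight vs. one dominant node.
even = tree_entropy([1, 1, 1, 1])
skewed = tree_entropy([97, 1, 1, 1])
```

Under this definition, a summary whose nodes carry similar weight scores higher than one dominated by a single node, which is the quantity a maximum entropy summary would favor.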
 
Refreshments: 3:30 in MDS 312
Location:
MDS 220

A Nonparametric Approach for Multiple Change Point Analysis of Multivariate Data

 

 
Speaker: Dr. David Matteson
Assistant Professor in the Department of Statistical Science at Cornell University
 
October 19, 2012
4:00-5:00 p.m.
MDS 220 
 
Refreshments: 3:30 in MDS 312

 

Abstract:
Change point analysis has applications in a wide variety of fields. The general problem concerns the inference of a change in distribution for a set of time-ordered observations. Sequential detection is an online version in which new data arrive continually and are analyzed adaptively. We are concerned with the related, but distinct, offline version, in which retrospective analysis of an entire sequence is performed. For a set of multivariate observations of arbitrary dimension, we consider nonparametric estimation of both the number of change points and the positions at which they occur. We do not make any assumptions regarding the nature of the change in distribution, or any distributional assumptions beyond the existence of the pth absolute moment, for some p in (0,2). Estimation is based on hierarchical clustering and we propose both divisive and agglomerative algorithms.
The divisive method is shown to provide consistent estimates of both the number and location of change points under standard regularity assumptions. We compare the proposed approach with competing methods in a simulation study. Methods from cluster analysis are applied to assess performance and to allow simple comparisons of location estimates, even when the estimated number differs. Applications in finance, genetics and spatio-temporal analysis are presented. We conclude with a discussion of future work.
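
As a rough illustration of the offline setting, the sketch below locates a single change point by scanning every split of a sequence and scoring it with an energy-distance-style divergence. This is an assumption for illustration only: the abstract does not name the statistic used, and the function names, the exponent `alpha` (taken in (0, 2)), and the `min_size` parameter are all hypothetical. It is not the speaker's algorithm, which also estimates the number of change points.

```python
import math

def mean_pairwise(a, b, alpha):
    """Average of ||x - y||^alpha over all pairs (x in a, y in b)."""
    total = sum(math.dist(x, y) ** alpha for x in a for y in b)
    return total / (len(a) * len(b))

def energy_divergence(a, b, alpha=1.0):
    """Sample energy-distance-style divergence between two groups of points."""
    return (2 * mean_pairwise(a, b, alpha)
            - mean_pairwise(a, a, alpha)
            - mean_pairwise(b, b, alpha))

def best_single_split(series, alpha=1.0, min_size=2):
    """Return (tau, score) for the split series[:tau] | series[tau:]
    with the largest size-weighted divergence."""
    n = len(series)
    best_tau, best_score = None, -math.inf
    for tau in range(min_size, n - min_size + 1):
        left, right = series[:tau], series[tau:]
        score = len(left) * len(right) / n * energy_divergence(left, right, alpha)
        if score > best_score:
            best_tau, best_score = tau, score
    return best_tau, best_score

# Hypothetical one-dimensional series (points are tuples) with a shift at index 10.
series = [(0.0,)] * 10 + [(5.0,)] * 10
tau, score = best_single_split(series)
```

Because points are compared only through Euclidean distances, the same code accepts observations of any dimension. A divisive procedure could apply such a split repeatedly to the resulting segments, with some test deciding when to stop; that part is omitted here.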
Location:
MDS 220