Big Data and Big Challenges. A Lesson in Formulating and Solving Problems Statistically
Additional Details Coming Soon!
Abstract: The class of bivariate integer-valued time series models is rapidly gaining popularity. However, its efficiency and adaptability are challenged by zero inflation in count time series (ZITS) and by the algorithmic techniques involved. In this presentation, the bivariate copula is presented with ZITS, and a computational algorithm is proposed via copula theory. Each series follows a Markov chain, with the serial dependence captured using copula-based transition probability functions with Poisson and zero-inflated Poisson margins. Copula theory is also used to capture bivariate ZITS, where the dependence between the two series is modeled using the bivariate Gaussian and t-copula functions. Likelihood-based inference is used to estimate the models’ parameters for simulated and real data, with the bivariate integrals of the Gaussian and t-copula functions evaluated using standard randomized Monte Carlo methods.
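As a rough illustration of the copula-based transition probabilities mentioned in this abstract, the sketch below computes P(X_t = x_new | X_{t-1} = x_old) for a stationary chain with zero-inflated Poisson margins. The abstract does not say which copula drives the serial dependence, so the Frank copula here is purely an assumption chosen for its closed form. The function names and parameter values are hypothetical as well.

```python
import math

def zip_pmf(k, lam, pi0):
    """Zero-inflated Poisson pmf: Poisson(lam) mixed with extra mass pi0 at zero."""
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi0 if k == 0 else 0.0) + (1.0 - pi0) * poisson

def zip_cdf(k, lam, pi0):
    """Zero-inflated Poisson cdf, with F(k) = 0 for k < 0."""
    if k < 0:
        return 0.0
    return min(1.0, sum(zip_pmf(j, lam, pi0) for j in range(k + 1)))

def frank_copula(u, v, theta):
    """Frank copula C(u, v); theta != 0 (assumed stand-in for the serial copula)."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta

def transition_prob(x_new, x_old, lam, pi0, theta):
    """P(X_t = x_new | X_{t-1} = x_old) from the copula rectangle formula."""
    F = lambda k: zip_cdf(k, lam, pi0)
    rect = (frank_copula(F(x_new), F(x_old), theta)
            - frank_copula(F(x_new - 1), F(x_old), theta)
            - frank_copula(F(x_new), F(x_old - 1), theta)
            + frank_copula(F(x_new - 1), F(x_old - 1), theta))
    # Divide the joint probability of the pair by the marginal of x_old.
    return rect / zip_pmf(x_old, lam, pi0)

# Illustrative values only: lam = 2, 30% extra zeros, positive dependence.
row = [transition_prob(k, 0, 2.0, 0.3, 3.0) for k in range(40)]
```

With a positive `theta`, the chance of a zero following a zero exceeds the marginal zero probability, which is the kind of serial dependence such a chain is meant to capture. A likelihood for one series would multiply these transition probabilities along the observed path.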
Abstract: We consider a semiparametric mixture of two density functions where one of them is known while the weight and the other function are unknown. We do not assume any additional structure on the unknown density function. We suggest a novel approach to estimating this model, based on applying a maximum smoothed likelihood to what would otherwise have been an ill-posed problem. We introduce an iterative MM (Majorization-Minimization) algorithm that estimates all of the model parameters. Unlike possible competing methods, this algorithm works well in both the univariate and multivariate cases. We establish that the algorithm possesses a descent property with respect to a log-likelihood objective functional and prove that the algorithm does, indeed, converge. Finally, we illustrate the performance of our algorithm in a simulation study and on a real dataset.
Abstract: This talk is about pharmacokinetic models, the algorithms used to analyze data from them, and a program I am writing to make the analysis easy. There will be a tutorial on pharmacokinetic models followed by a short description of my method for solving differential equations. We will take a look at the Tetracycline data and then analyze it using my program.
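For readers new to pharmacokinetic models, the sketch below shows one common textbook case: a one-compartment model with first-order absorption and elimination. The abstract does not describe the speaker's differential equation method, so the classical fourth-order Runge-Kutta scheme is used here as an assumed stand-in. The dose and rate constants are made-up illustrative values, not estimates from the Tetracycline data.

```python
import math

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
    k4 = f(t + h, [yi + h * k for yi, k in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def one_compartment(ka, ke):
    """Right-hand side: drug leaves the gut at rate ka and the body at rate ke."""
    def rhs(t, y):
        gut, body = y
        return [-ka * gut, ka * gut - ke * body]
    return rhs

def simulate(dose, ka, ke, t_end, steps):
    """Integrate from t = 0 with the whole dose in the gut; return [gut, body]."""
    h = t_end / steps
    t, y = 0.0, [dose, 0.0]
    rhs = one_compartment(ka, ke)
    for _ in range(steps):
        y = rk4_step(rhs, t, y, h)
        t += h
    return y

def body_amount_exact(dose, ka, ke, t):
    """Closed-form amount in the body (valid when ka != ke), for comparison."""
    return dose * ka / (ka - ke) * (math.exp(-ke * t) - math.exp(-ka * t))

# Hypothetical parameters: 250 mg dose, ka = 1.2 per hour, ke = 0.15 per hour.
gut, body = simulate(250.0, 1.2, 0.15, 8.0, 800)
```

Comparing `body` with `body_amount_exact(250.0, 1.2, 0.15, 8.0)` gives a quick check on the numerical solver before it is applied to models that have no closed-form solution.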
Abstract: This talk will provide an overview of three major areas in my research agenda: finite mixture models, tolerance regions, and zero-inflated models. Two projects from each area will be highlighted, along with a brief discussion of theoretical or methodological advancements produced during the research, and the data that were analyzed. I will provide some commentary about my two R packages, mixtools and tolerance. Research undertaken by former and current PhD advisees will also be highlighted.
Details coming soon!
Abstract: Disclosure avoidance techniques are used by agencies to prepare releases of statistics and microdata when internal data contain information considered sensitive to individual subjects. Differential privacy (DP) techniques have become popular in the literature and are finding increasing use in practical applications. One fundamental DP technique to protect sensitive data is to add noise from a selected distribution in such a way that mathematical privacy criteria are satisfied. An analyst making use of such data in a statistical model may wish to account for uncertainty introduced by the added noise. This work considers Bayesian regression models which regard the agency noise (or, equivalently, the unreleased sensitive data) as augmented data. Given other random variables in the model, conditional distributions of these augmented data form weighted densities, but a method of drawing from them may not be apparent. We revisit the direct sampling method proposed by Walker et al. (JCGS 2011) and explore several customizations to address issues encountered in the basic version of the algorithm. Draws from the desired conditional distributions may then be taken reliably, largely avoiding the need for rejections or manual tuning. The customized direct sampler is used to complete the specification of a Gibbs sampler to fit a Lognormal regression model where agency noise has been added to both the outcome and some of the covariates. Demonstrations compare inference using the sensitive internal data versus the privacy-protected release.
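To make the noise-adding step in this abstract concrete, here is a minimal sketch of the Laplace mechanism, one standard way to satisfy a DP criterion. The abstract only says the noise comes from "a selected distribution", so Laplace is an assumption, as are the function name, the sensitivity and the privacy budget `epsilon`. The sketch covers only the agency side and does not attempt the direct sampler or the Gibbs sampler.

```python
import random

def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release value plus Laplace noise with scale sensitivity / epsilon.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution centered at zero with that scale.
    """
    scale = sensitivity / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise

# Hypothetical setting: a count of 120 with sensitivity 1 and epsilon 0.5.
rng = random.Random(2024)
releases = [laplace_mechanism(120.0, 1.0, 0.5, rng) for _ in range(20000)]
mean = sum(releases) / len(releases)
var = sum((r - mean) ** 2 for r in releases) / (len(releases) - 1)
```

The noise has mean zero and variance 2 * scale^2, so here the releases are unbiased for 120 with variance close to 8. That extra variance is the uncertainty an analyst may want to account for when modeling the released data.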