
Direct Sampling in Bayesian Regression Models with Additive Disclosure Avoidance Noise

Date:
Location:
https://uky.zoom.us/j/81507661849
Speaker(s) / Presenter(s):
Dr. Andrew Raim, U.S. Census Bureau

 

Abstract: Disclosure avoidance techniques are used by agencies to prepare releases of statistics and microdata when internal data contain information considered sensitive to individual subjects. Differential privacy (DP) techniques have become popular in the literature and are finding increasing use in practical applications. One fundamental DP technique to protect sensitive data is to add noise from a selected distribution in such a way that mathematical privacy criteria are satisfied. An analyst making use of such data in a statistical model may wish to account for uncertainty introduced by the added noise. This work considers Bayesian regression models which regard the agency noise (or, equivalently, the unreleased sensitive data) as augmented data. Given other random variables in the model, conditional distributions of these augmented data form weighted densities, but a method of drawing from them may not be apparent. We revisit the direct sampling method proposed by Walker et al. (JCGS 2011) and explore several customizations to address issues encountered in the basic version of the algorithm. Draws from the desired conditional distributions may then be taken reliably, largely avoiding the need for rejections or manual tuning. The customized direct sampler is used to complete the specification of a Gibbs sampler to fit a Lognormal regression model where agency noise has been added to both the outcome and some of the covariates. Demonstrations compare inference using the sensitive internal data versus the privacy-protected release.
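
As a rough illustration of the weighted-density structure mentioned in the abstract, consider a single released value. The notation below (y, z, ε, g, f, θ, w) is assumed here for illustration and is not taken from the talk. It also assumes the noise is additive with a known density and is independent of the sensitive value given the model parameters.

```latex
% Assumed notation (illustrative only):
%   y = unreleased sensitive value, z = released value,
%   \epsilon = agency noise with known density g,
%   f(y | \theta) = model density of y given the other quantities \theta.
\[
  z = y + \epsilon, \qquad \epsilon \sim g, \qquad y \mid \theta \sim f(\,\cdot \mid \theta)
\]
% Conditional density of the augmented datum y given the release z:
\[
  p(y \mid z, \theta)
  \;=\; \frac{g(z - y)\, f(y \mid \theta)}{\int g(z - u)\, f(u \mid \theta)\, du}
  \;\propto\; w(y)\, f(y \mid \theta),
  \qquad w(y) = g(z - y).
\]
```

Under these assumptions the conditional is a base density f reweighted by w. With Laplace noise of scale λ, for example, w(y) is proportional to exp(−|z − y|/λ). The normalizing integral generally has no closed form, which is one reason a general-purpose method for drawing from such densities is useful.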

 
