PROBABILITY & STATISTICS DAY
Funded By: National Security Agency | Hosted By: Center for Interdisciplinary Research and Consulting
9th Annual April 17-18, 2015

A special feature of Probability and Statistics Day at UMBC 2015 is that the conference, including the workshop, is open to all statistics graduate students from UMBC and local universities free of charge; however, REGISTRATION IS REQUIRED! The deadline to register is Friday, April 3, 2015.

For more information, contact any member of the organizing committee:

Bimal Sinha
Conference Chair
443.538.3012

Kofi Adragni
  410.455.2406
Yvonne Huang
  410.455.2422
Yaakov Malinovsky
  410.455.2968
Thomas Mathew
  410.455.2418
Nagaraj Neerchal
  410.455.2437
DoHwan Park
  410.455.2408
Junyong Park
  410.455.2407
Anindya Roy
  410.455.2435
Elizabeth Stanwyck
  410.455.5731

Sponsor

Participant Information

April Albertine

Paper: Comparison of several types of confidence intervals based on multiply imputed synthetic data under a normal model

The release of synthetic data sets is one approach for protecting confidentiality when the release of the original survey microdata is impossible due to privacy concerns. The goal is to preserve important summary features (thereby allowing outside researchers the opportunity for analysis) while disguising individual distinguishing responses that could be used to identify the subjects themselves. A single synthetic data set can be made public (single imputation), or even multiple synthetic data sets generated from the same raw data (multiple imputation). We consider multiply imputed synthetic data where each set is imputed via posterior predictive sampling, assuming that the original data is normally distributed. We evaluate several different methods of combining the summary statistics of the synthetic data sets in order to do inference on the parameters of the underlying normal model. The different methods of combining the synthetic data sets are compared according to the expected length of confidence intervals for the two parameters of the normal model. We present an application using data on household income from the United States Current Population Survey.
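
As a rough illustration of the posterior predictive sampling step described in the abstract, the following minimal sketch generates several synthetic data sets from a normal model and computes per-set summary statistics. It is not the authors' procedure: the noninformative prior p(mu, sigma^2) proportional to 1/sigma^2, the simulated "original" data, the function names, and the simple averaging of the per-set summaries are all assumptions made here for illustration. The combining methods compared in the paper are not reproduced.

```python
import random
import statistics


def posterior_predictive_synthetic(data, m, rng):
    """Generate m synthetic data sets from a normal model.

    Assumes the noninformative prior p(mu, sigma^2) proportional to
    1/sigma^2 (an assumption of this sketch, not taken from the abstract).
    """
    n = len(data)
    xbar = statistics.fmean(data)
    s2 = statistics.variance(data)  # sample variance, divisor n - 1
    synthetic_sets = []
    for _ in range(m):
        # sigma^2 | data ~ (n - 1) * s^2 / chi-square(n - 1);
        # chi-square(k) is Gamma(shape=k/2, scale=2).
        chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)
        sigma2_star = (n - 1) * s2 / chi2
        # mu | sigma^2, data ~ N(xbar, sigma^2 / n)
        mu_star = rng.gauss(xbar, (sigma2_star / n) ** 0.5)
        # Synthetic records are drawn from N(mu*, sigma2*).
        sd_star = sigma2_star ** 0.5
        synthetic_sets.append([rng.gauss(mu_star, sd_star) for _ in range(n)])
    return synthetic_sets


def summarize(synthetic_sets):
    """Return the sample mean and sample variance of each synthetic set."""
    means = [statistics.fmean(s) for s in synthetic_sets]
    variances = [statistics.variance(s) for s in synthetic_sets]
    return means, variances


# Hypothetical "original" data: 200 simulated draws from N(50, 10^2).
rng = random.Random(2015)
original = [rng.gauss(50.0, 10.0) for _ in range(200)]

synthetic = posterior_predictive_synthetic(original, m=5, rng=rng)
means, variances = summarize(synthetic)

# Naive combination: average the per-set summaries (illustrative only).
combined_mean = statistics.fmean(means)
combined_variance = statistics.fmean(variances)
print(combined_mean, combined_variance)
```

Only the synthetic sets would be released in this setup. An outside analyst would then build confidence intervals for the mean and variance from the per-set summaries, which is the stage at which the combining methods in the abstract differ.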