Complete imputation of missing repeated categorical data: One-sample applications

Colin P. West, Jeffrey D. Dawson

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


Longitudinal studies with repeated measures are often subject to non-response. Methods currently employed to alleviate the difficulties caused by missing data are typically unsatisfactory, especially when the cause of the missingness is related to the outcomes. We present an approach for incomplete categorical data in the repeated measures setting that allows missing data to depend on other observed outcomes for a study subject. The proposed methodology also allows a broader examination of study findings through interpretation of results in the framework of the set of all possible test statistics that might have been observed had no data been missing. The proposed approach consists of the following general steps. First, we generate all possible sets of missing values and form a set of possible complete data sets. We then weight each data set according to clearly defined assumptions and apply an appropriate statistical test procedure to each data set, combining the results to give an overall indication of significance. We make use of the EM algorithm and a Bayesian prior in this approach. While not restricted to the one-sample case, the proposed methodology is illustrated for one-sample data and compared to the common complete-case and available-case analysis methods.
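The general steps described above (enumerate every possible completion, weight each completed data set, test each one, combine the results) can be sketched for a toy case. This is a minimal illustration and not the authors' implementation. It assumes a binary outcome measured twice, with only the second measurement ever missing. The weights come from a simple smoothed conditional frequency with a pseudo-count `prior`, used here as a stand-in for the EM algorithm and Bayesian prior mentioned in the abstract. The exact McNemar test and the two combination summaries are likewise assumptions chosen for the example, and all function names are hypothetical.

```python
from itertools import product
from math import comb


def mcnemar_exact_p(pairs):
    """Two-sided exact McNemar p-value for (first, second) binary pairs."""
    b = sum(1 for x, y in pairs if x == 0 and y == 1)  # changes 0 -> 1
    c = sum(1 for x, y in pairs if x == 1 and y == 0)  # changes 1 -> 0
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def enumerate_completions(data, prior=0.5):
    """Yield (weight, completed_data) for every way of filling in None.

    Assumption: a missing second outcome depends only on the observed
    first outcome, so each fill-in is weighted by a smoothed estimate of
    P(second = v | first = x) taken from the complete cases.
    """
    complete = [(x, y) for x, y in data if y is not None]
    p_one = {}
    for x in (0, 1):
        n_x = sum(1 for a, _ in complete if a == x)
        n_x1 = sum(1 for a, b in complete if a == x and b == 1)
        p_one[x] = (n_x1 + prior) / (n_x + 2 * prior)

    missing_idx = [i for i, (_, y) in enumerate(data) if y is None]
    for fill in product((0, 1), repeat=len(missing_idx)):
        completed = list(data)
        weight = 1.0
        for i, v in zip(missing_idx, fill):
            x = data[i][0]
            completed[i] = (x, v)
            weight *= p_one[x] if v == 1 else 1.0 - p_one[x]
        yield weight, completed


def weighted_summary(data, alpha=0.05, prior=0.5):
    """Test every completed data set and combine the weighted results."""
    results = [(w, mcnemar_exact_p(d))
               for w, d in enumerate_completions(data, prior)]
    total = sum(w for w, _ in results)
    p_values = [p for _, p in results]
    return {
        "n_completions": len(results),
        "total_weight": total,
        "min_p": min(p_values),
        "max_p": max(p_values),
        "weighted_mean_p": sum(w * p for w, p in results) / total,
        "weighted_fraction_significant":
            sum(w for w, p in results if p < alpha) / total,
    }


if __name__ == "__main__":
    # Hypothetical data: None marks a missing second measurement.
    example = [(0, 1), (0, 1), (0, 0), (1, 1), (1, 0), (0, 1),
               (0, None), (1, None), (0, None)]
    print(weighted_summary(example))
```

The range from `min_p` to `max_p` corresponds to the set of all test results that might have been observed had no data been missing, while the weighted summaries give one possible overall indication of significance. The number of completions grows exponentially with the number of missing values, so full enumeration as written is only practical for small data sets.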

Original language: English (US)
Pages (from-to): 203-217
Number of pages: 15
Journal: Statistics in Medicine
Issue number: 2
State: Published - Jan 30 2002


  • EM algorithm
  • Incomplete categorical data
  • Missing data in longitudinal studies
  • Pattern of missingness

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

