Avoiding Blunders When Analyzing Correlated Data, Clustered Data, or Repeated Measures

Yu Hui H. Chang, Matthew R. Buras, John M. Davis, Cynthia S. Crowson

Research output: Contribution to journal › Review article › peer-review

Abstract

Rheumatology research often involves correlated and clustered data. A common error when analyzing these data is to treat the observations as independent, which can lead to incorrect statistical inference. The data used are a subset of the 2017 study by Raheel et al, consisting of 633 patients with rheumatoid arthritis (RA) between 1988 and 2007. RA flare and the number of swollen joints served as our binary and continuous outcomes, respectively. Generalized linear models (GLMs) were fitted for each outcome, adjusting for rheumatoid factor (RF) positivity and sex. Additionally, a generalized linear mixed model with a random intercept and a generalized estimating equation were used to model RA flare and the number of swollen joints, respectively, to take the additional correlation into account. The GLMs' β coefficients and their 95% confidence intervals (CIs) were then compared to their mixed-effects equivalents. The β coefficients were very similar between methodologies. However, their standard errors increased when correlation was accounted for. As a result, if the additional correlations are not considered, the standard error can be underestimated. This results in an overestimated effect size, narrower CIs, increased type I error, and a smaller P value, thus potentially producing misleading results. It is important to model the additional correlation that occurs in correlated data.
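The underestimation of the standard error described above can be illustrated with a small simulation. The sketch below is not the authors' analysis and does not use the Raheel et al data: it generates hypothetical repeated measures (a patient-level random intercept plus visit-level noise), estimates the difference in a continuous outcome between two patient-level groups, and compares a naive standard error that treats every visit as independent with a cluster-robust (sandwich) standard error that sums residuals within each patient. All function names, parameter values, and the data-generating model are assumptions chosen for illustration, and the sandwich estimator stands in for the mixed model and generalized estimating equation named in the abstract.

```python
import math
import random


def simulate(n_patients=200, visits=5, sd_patient=2.0, sd_visit=1.0,
             effect=0.5, seed=0):
    """Simulate hypothetical repeated measures (not the study data).

    Each patient has a random intercept shared by all of their visits,
    which induces within-patient correlation.
    Returns a list of (patient_id, rf_positive, outcome) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for pid in range(n_patients):
        rf_positive = pid % 2                  # patient-level covariate
        intercept = rng.gauss(0.0, sd_patient)  # shared across visits
        for _ in range(visits):
            y = 3.0 + effect * rf_positive + intercept + rng.gauss(0.0, sd_visit)
            rows.append((pid, rf_positive, y))
    return rows


def group_mean_variances(rows, group):
    """Return (mean, naive variance, cluster-robust variance) for one group."""
    obs = [(pid, y) for pid, g, y in rows if g == group]
    n = len(obs)
    mean = sum(y for _, y in obs) / n

    # Naive: treat all visits as independent observations.
    s2 = sum((y - mean) ** 2 for _, y in obs) / (n - 1)
    naive_var = s2 / n

    # Cluster-robust: sum residuals within each patient before squaring,
    # so correlated visits do not count as independent information.
    resid_sums = {}
    for pid, y in obs:
        resid_sums[pid] = resid_sums.get(pid, 0.0) + (y - mean)
    robust_var = sum(s * s for s in resid_sums.values()) / n ** 2
    return mean, naive_var, robust_var


def compare(rows):
    """Return (beta, naive SE, cluster-robust SE) for the group difference."""
    m1, nv1, rv1 = group_mean_variances(rows, 1)
    m0, nv0, rv0 = group_mean_variances(rows, 0)
    beta = m1 - m0  # the point estimate is the same under both approaches
    return beta, math.sqrt(nv1 + nv0), math.sqrt(rv1 + rv0)


if __name__ == "__main__":
    beta, naive_se, robust_se = compare(simulate())
    print(f"beta = {beta:.3f}")
    print(f"naive SE = {naive_se:.3f}, cluster-robust SE = {robust_se:.3f}")
```

In this sketch the point estimate is identical under both approaches and only the standard error changes, mirroring the pattern reported in the abstract. For equal cluster sizes m and within-cluster correlation ρ, the variance of a cluster-level contrast is inflated by roughly 1 + (m − 1)ρ, so with the assumed values here (m = 5, ρ = 0.8) the naive standard error should be about half the cluster-robust one.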

Original language: English (US)
Pages (from-to): 1269-1272
Number of pages: 4
Journal: Journal of Rheumatology
Volume: 50
Issue number: 10
State: Published - Oct 1 2023

Keywords

  • clustered data
  • correlated data
  • mixed model
  • repeated measures
  • statistical software

ASJC Scopus subject areas

  • Rheumatology
  • Immunology and Allergy
  • Immunology
