Segmentation errors and intertest reliability in automated and manually traced hippocampal volumes

Benjamin H. Brinkmann, Hari Guragain, Daniel Kenney-Jung, Jay Mandrekar, Robert E. Watson, Kirk M. Welker, Jeffrey W. Britton, Robert J. Witte

Research output: Contribution to journal, Article, peer-review

1 Scopus citation


Objective: To rigorously compare automated atlas-based and manual tracing hippocampal segmentation for accuracy, repeatability, and clinical acceptability given a relevant range of imaging abnormalities in clinical epilepsy. Methods: Forty-nine patients with hippocampal asymmetry were identified from our institutional radiology database, including two patients with significant anatomic deformations. Manual hippocampal tracing was performed by experienced technologists on 3T MPRAGE images, measuring hippocampal volume up to the tectal plate, excluding the hippocampal tail. The same images were processed using NeuroQuant and FreeSurfer software. Ten subjects underwent repeated manual hippocampal tracings by two additional technologists blinded to previous results to evaluate consistency. Ten patients with two clinical MRI studies had volume measurements repeated using NeuroQuant and FreeSurfer. Results: FreeSurfer raw volumes were significantly lower than NeuroQuant (P < 0.001, right and left), and hippocampal asymmetry estimates were lower for both automatic methods than manual tracing (P < 0.0001). Differences remained significant after scaling volumes to age, gender, and scanner matched normative percentiles. Volume reproducibility was fair (0.4–0.59) for manual tracing, and excellent (>0.75) for both automated methods. Asymmetry index reproducibility was excellent (>0.75) for manual tracing and FreeSurfer segmentation and fair (0.4–0.59) for NeuroQuant segmentation. Both automatic segmentation methods failed on the two cases with anatomic deformations. Segmentation errors were visually identified in 25 NeuroQuant and 27 FreeSurfer segmentations, and nine (18%) NeuroQuant and six (12%) FreeSurfer errors were judged clinically significant. Interpretation: Automated hippocampal volumes are more reproducible than hand-traced hippocampal volumes. However, these methods fail in some cases, and significant segmentation errors can occur.
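
The abstract reports hippocampal asymmetry estimates and places reproducibility coefficients into named bands, but it does not give the formulas. The Python sketch below illustrates one way such quantities might be computed. The asymmetry definition (absolute left-right difference divided by the mean volume), the function names, the "poor" and "good" bands, and the handling of a coefficient of exactly 0.75 are assumptions for illustration, not the authors' stated method. Only the "fair" (0.4–0.59) and "excellent" (>0.75) bands come from the abstract.

```python
def asymmetry_index(left_volume: float, right_volume: float) -> float:
    """Absolute left-right difference divided by the mean of the two volumes.

    Assumed definition: the abstract does not state which formula was used.
    """
    mean_volume = (left_volume + right_volume) / 2.0
    if mean_volume <= 0:
        raise ValueError("hippocampal volumes must be positive")
    return abs(left_volume - right_volume) / mean_volume


def reliability_band(coefficient: float) -> str:
    """Map a reproducibility coefficient to a descriptive band.

    'fair' and 'excellent' follow the ranges quoted in the abstract;
    'poor' and 'good' fill the remaining ranges and are assumptions.
    """
    if coefficient >= 0.75:  # boundary treatment at exactly 0.75 is assumed
        return "excellent"
    if coefficient >= 0.60:
        return "good"
    if coefficient >= 0.40:
        return "fair"
    return "poor"


if __name__ == "__main__":
    # Hypothetical volumes in mL, chosen only to exercise the functions.
    print(asymmetry_index(3.0, 4.0))
    print(reliability_band(0.5))
```

Under this assumed definition the index is symmetric in its arguments, so it measures the size of the asymmetry and not its side.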

Original language: English (US)
Pages (from-to): 1807-1814
Number of pages: 8
Journal: Annals of Clinical and Translational Neurology
Issue number: 9
State: Published - Sep 1 2019

ASJC Scopus subject areas

  • General Neuroscience
  • Clinical Neurology

