Comparing deep learning-based automatic segmentation of breast masses to expert interobserver variability in ultrasound imaging

Jeremy M. Webb, Shaheeda A. Adusei, Yinong Wang, Naziya Samreen, Kalie Adler, Duane D. Meixner, Robert T. Fazzio, Mostafa Fatemi, Azra Alizad

Research output: Contribution to journal › Article › peer-review


Deep learning is a powerful tool that became practical in 2008, harnessing the power of graphics processing units (GPUs), and has developed rapidly in image, video, and natural language processing. There are ongoing developments in the application of deep learning to medical data for a variety of tasks across multiple imaging modalities. The reliability and repeatability of deep learning techniques are of utmost importance if deep learning is to be considered a tool for assisting experts, including physicians, radiologists, and sonographers. Owing to the high cost of labeling data, deep learning models are often evaluated against a single expert, and it is unknown whether their errors fall within a clinically acceptable range. Ultrasound is a commonly used imaging modality for breast cancer screening and for visually estimating risk using the Breast Imaging Reporting and Data System (BI-RADS) score. This process is highly dependent on the skills and experience of the sonographers and radiologists, which leads to interobserver variability in interpretation. For these reasons, we propose an interobserver reliability study comparing the performance of a current top-performing deep learning segmentation model against three experts who manually segmented suspicious breast lesions in clinical ultrasound (US) images. We pretrained the model using a US thyroid segmentation dataset with 455 patients and 50,993 images, and trained the model using a US breast segmentation dataset with 733 patients and 29,884 images. We found a mean Fleiss kappa value of 0.78 for the agreement among the three experts in breast mass segmentation, compared to a mean Fleiss kappa value of 0.79 for the agreement among the experts and the optimized deep learning model.
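The agreement statistic reported above, Fleiss' kappa, can be illustrated with a minimal sketch. The abstract does not say how the kappa values were computed per image or averaged, so the setup here is an assumption: each rater's binary mask is flattened to a list of pixel labels (0 = background, 1 = lesion), kappa is computed over the pixels of one image, and a mean would then be taken across images. The function name `fleiss_kappa` and the toy masks are hypothetical and are not taken from the study.

```python
from typing import Sequence


def fleiss_kappa(ratings: Sequence[Sequence[int]], n_categories: int = 2) -> float:
    """Fleiss' kappa for several raters labelling the same items.

    ratings[r][i] is the integer label (0 .. n_categories-1) that rater r
    assigned to item i. For segmentation, an item is assumed to be a pixel
    and each rater's mask is assumed to be flattened beforehand.
    """
    n_raters = len(ratings)
    if n_raters < 2:
        raise ValueError("Fleiss' kappa needs at least two raters")
    n_items = len(ratings[0])
    if n_items == 0 or any(len(r) != n_items for r in ratings):
        raise ValueError("all raters must label the same non-empty set of items")

    # counts[i][j] = number of raters who assigned item i to category j
    counts = [[0] * n_categories for _ in range(n_items)]
    for rater in ratings:
        for i, label in enumerate(rater):
            counts[i][label] += 1

    # Observed agreement: mean over items of the fraction of agreeing rater pairs
    pair_norm = n_raters * (n_raters - 1)
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / pair_norm for row in counts
    ) / n_items

    # Chance agreement from the overall proportion of each category
    total = n_items * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_e = sum(p * p for p in p_cat)

    if p_e == 1.0:
        # Every rating fell in one category; kappa is undefined (0/0).
        # Returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


# Toy example: three raters, one 4-pixel "image" (hypothetical data)
rater_a = [1, 1, 0, 0]
rater_b = [1, 1, 0, 0]
rater_c = [1, 0, 0, 0]
kappa = fleiss_kappa([rater_a, rater_b, rater_c])
print(round(kappa, 3))
```

In this toy case the raters disagree on one pixel only, and kappa works out to 46/70, roughly 0.657. To mirror the comparison in the abstract, one would compute the statistic once with the three expert masks and once with the model's mask added as a fourth rater, again under the assumptions stated above.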

Original language: English (US)
Article number: 104966
Journal: Computers in Biology and Medicine
State: Published - Dec 2021


  • Automatic segmentation
  • Breast cancer
  • Deep learning
  • Interobserver variability
  • Ultrasound

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

