TY - JOUR
T1 - Artificial intelligence for the assessment of bowel preparation
AU - Lee, Ji Young
AU - Calderwood, Audrey H.
AU - Karnes, William
AU - Requa, James
AU - Jacobson, Brian C.
AU - Wallace, Michael B.
N1 - Funding Information:
DISCLOSURE: The following authors disclosed financial relationships: A. H. Calderwood: Advisor for Dark Canyon Labs. W. Karnes: Cofounder and chief medical officer of Docbot. J. Requa: Employee of Docbot. B. C. Jacobson: Consultant for Motus GI. M. B. Wallace: Consultant for Verily; research support from Cosmo and Medtronic. All other authors disclosed no financial relationships.
Publisher Copyright:
© 2022 American Society for Gastrointestinal Endoscopy
PY - 2022/3
Y1 - 2022/3
N2 - Background and Aims: A reliable assessment of bowel preparation is important to ensure high-quality colonoscopy. Current bowel preparation scoring systems are limited by interobserver variability. This study aimed to demonstrate objective assessment of bowel preparation adequacy using an artificial intelligence (AI)/convolutional neural network (CNN) algorithm developed from colonoscopy videos. Methods: Two CNNs were developed using a training set of 73,304 images from 200 colonoscopies. First, a binary CNN was developed and trained to distinguish video frames that were appropriate versus inappropriate for scoring with the Boston Bowel Preparation Scale (BBPS). A second multiclass CNN was developed and trained on 26,950 appropriate frames that were expertly annotated with BBPS segment scores (0-3). We validated the algorithm using 252 10-second video clips that were assigned BBPS segment scores by 2 experts. The algorithm produced a mean BBPS score (AI-BBPS) by averaging the scores assigned to individual frames. We maximized the algorithm's performance by choosing a dichotomized AI-BBPS score that most closely matched the dichotomized BBPS scores (ie, adequate vs inadequate). We tested AI-BBPS against human ratings using 30 independent 10-second video clips (test set 1) and 10 full withdrawal colonoscopy videos (test set 2). Results: In the validation set, the algorithm demonstrated an area under the curve of .918 and accuracy of 85.3% for detection of inadequate bowel cleanliness. In test set 1, sensitivity for inadequate bowel preparation was 100% and agreement between raters and AI was 76.7% to 83.3%. In test set 2, sensitivity for inadequate bowel preparation for each segment was 100% and agreement between raters and AI was 68.9% to 89.7%. Agreement among raters alone and agreement between raters and AI were similar (κ = .694 and .649, respectively). Conclusions: The algorithm's assessment of bowel cleanliness, as measured with the BBPS, showed good performance and agreement with experts, including on full withdrawal colonoscopies.
UR - http://www.scopus.com/inward/record.url?scp=85123599331&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123599331&partnerID=8YFLogxK
U2 - 10.1016/j.gie.2021.11.041
DO - 10.1016/j.gie.2021.11.041
M3 - Article
C2 - 34896100
AN - SCOPUS:85123599331
SN - 0016-5107
VL - 95
SP - 512-518.e1
JO - Gastrointestinal Endoscopy
JF - Gastrointestinal Endoscopy
IS - 3
ER -