Effects of Information Masking in the Task-Specific Finetuning of a Transformers-Based Clinical Question-Answering Framework

Sungrim Moon; Huan He; Jungwei W. Fan

doi:10.1109/ICHI54592.2022.00017

Digital Health Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Transformers-based language models have achieved impressive performance in biomedical question-answering (QA). Our previous work led us to surmise that such models could exploit frequent literal question-answer pairs to arrive at correct answers, casting doubt on their true intelligence and transferability. We therefore conducted experiments that masked the anchor concept in the question and context documents during the fine-tuning stage of BERT for a reading comprehension QA task on clinical notes. The perturbation randomly replaced 0%, 10%, 20%, 30%, or 100% of the concept occurrences with a dummy string. Masking 100% of the occurrences reduced overall accuracy by about 0.10 relative to 0% masking. However, accuracy improved by about 0.01 to 0.02 at 20% masking, and the benefit transferred when the model was tested on a different corpus. Masking also preferentially improved accuracy for question-answer pairs ranked in the top 20%-40% by frequency in the training set. The results suggest that transformers-based QA systems may benefit from moderate masking during fine-tuning, likely because it forces the model to learn abstract context patterns rather than rely on specific surface terms or relations. That the benefit skewed toward a specific non-top frequency tier could reflect a more general phenomenon in machine learning, in which such enhancement techniques are most effective for cases that sit near the make-or-fail border.
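
The perturbation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dummy string "XXXX", the function names, the case-insensitive matching, and the choice to mask each occurrence independently with probability equal to the masking rate are all assumptions made for illustration, and the example question and note text are invented.

```python
import random
import re

# Hypothetical dummy string; the abstract does not say which string was used.
DUMMY = "XXXX"


def mask_anchor(text, anchor, rate, rng, dummy=DUMMY):
    """Replace each occurrence of `anchor` in `text` with `dummy`.

    Assumption: every occurrence is masked independently with
    probability `rate` (0.0 = no masking, 1.0 = mask everything).
    """
    pattern = re.compile(re.escape(anchor), flags=re.IGNORECASE)

    def maybe_mask(match):
        # rng.random() is in [0, 1), so rate=0.0 never masks and rate=1.0 always does.
        return dummy if rng.random() < rate else match.group(0)

    return pattern.sub(maybe_mask, text)


def mask_example(question, context, anchor, rate, seed=0):
    """Apply the same masking rate to the question and its context document."""
    rng = random.Random(seed)
    return (
        mask_anchor(question, anchor, rate, rng),
        mask_anchor(context, anchor, rate, rng),
    )


if __name__ == "__main__":
    question = "What is the dose of aspirin?"
    context = "Patient takes aspirin 81 mg daily. Aspirin was continued."
    for rate in (0.0, 0.1, 0.2, 0.3, 1.0):
        print(rate, mask_example(question, context, "aspirin", rate))
```

In an extractive QA setup, replacing a concept with a string of a different length shifts the character offsets of any answer span that follows it, so a real pipeline would also need to recompute those offsets after masking; the sketch leaves that step out.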

Original language	English (US)
Title of host publication	Proceedings - 2022 IEEE 10th International Conference on Healthcare Informatics, ICHI 2022
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	36-41
Number of pages	6
ISBN (Electronic)	9781665468459
DOIs	https://doi.org/10.1109/ICHI54592.2022.00017
State	Published - 2022
Event	10th IEEE International Conference on Healthcare Informatics, ICHI 2022 - Rochester, United States
Duration	Jun 11 2022 → Jun 14 2022

Publication series

Name	Proceedings - 2022 IEEE 10th International Conference on Healthcare Informatics, ICHI 2022

Conference

Conference	10th IEEE International Conference on Healthcare Informatics, ICHI 2022
Country/Territory	United States
City	Rochester
Period	6/11/22 → 6/14/22

Keywords

Deep Learning
Electronic Health Records
Natural Language Processing
Question Answering
Supervised Machine Learning

ASJC Scopus subject areas

Artificial Intelligence
Computer Science Applications
Information Systems and Management
Safety, Risk, Reliability and Quality
Health Informatics

Cite this

Moon, S., He, H., & Fan, J. W. (2022). Effects of Information Masking in the Task-Specific Finetuning of a Transformers-Based Clinical Question-Answering Framework. In Proceedings - 2022 IEEE 10th International Conference on Healthcare Informatics, ICHI 2022 (pp. 36-41). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICHI54592.2022.00017
