Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health

Chandreen Liyanage, Muskan Garg, Vijay Mago, Sunghwan Sohn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Amid an ongoing health crisis, there is a growing need to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. Because the distribution of WD in social media data is intrinsically imbalanced, we experiment with generative NLP models for data augmentation to further improve the prescreening task of classifying WD. To this end, we propose a simple yet effective data augmentation approach using prompt-based generative NLP models, and we evaluate ROUGE scores and syntactic/semantic similarity between existing interpretations and augmented data. Our approach with the ChatGPT model surpasses all other methods and improves over baselines such as Easy Data Augmentation and backtranslation. Introducing data augmentation to generate more training samples and a balanced dataset improves the F-score and the Matthews Correlation Coefficient by up to 13.11% and 15.95%, respectively.
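The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary (for example, one-vs-rest) confusion matrix. The function name and the counts are hypothetical and are not taken from the paper, which may compute these metrics differently (for instance, averaged across WD classes).

```python
import math


def f_score_and_mcc(tp, fp, fn, tn):
    """Return (F1, MCC) for a binary confusion matrix.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives, and true negatives. Undefined ratios fall back to 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc


# Hypothetical counts, for illustration only (not results from the paper).
f1, mcc = f_score_and_mcc(tp=40, fp=10, fn=10, tn=40)
print(round(f1, 3), round(mcc, 3))
```

Unlike the F-score, the Matthews Correlation Coefficient also accounts for true negatives, which is one reason the two are often reported together for imbalanced classification tasks.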

Original language: English (US)
Title of host publication: BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop
Editors: Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Publisher: Association for Computational Linguistics (ACL)
Pages: 306-312
Number of pages: 7
ISBN (Electronic): 9781959429852
State: Published - 2023
Event: 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP 2023 - Toronto, Canada
Duration: Jul 13 2023 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP 2023
Country/Territory: Canada
City: Toronto
Period: 7/13/23 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
