Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health

Chandreen Liyanage; Muskan Garg; Vijay Mago; Sunghwan Sohn

Quantitative Health Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Amid an ongoing health crisis, there is a growing need to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. As the distribution of WD in social media data is intrinsically imbalanced, we experiment with generative NLP models for data augmentation to enable further improvement in the prescreening task of classifying WD. To this end, we propose a simple yet effective data augmentation approach based on prompt-based generative NLP models, and evaluate the ROUGE scores and syntactic/semantic similarity between existing interpretations and augmented data. Our approach with the ChatGPT model surpasses all the other methods and improves over baselines such as Easy Data Augmentation and Backtranslation. Introducing data augmentation to generate more training samples and a balanced dataset improves the F-score and the Matthews Correlation Coefficient by up to 13.11% and 15.95%, respectively.
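The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes a binary 0/1 labelling for simplicity (the WD task in the paper is multi-class, and the paper's own evaluation code is not shown here); the function name `f1_and_mcc` and the toy labels are hypothetical and exist only for this illustration.

```python
import math


def f1_and_mcc(y_true, y_pred):
    """Return (F1, MCC) for binary labels encoded as 0/1.

    Illustrative only: a binary simplification, not the paper's
    multi-class evaluation setup.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all.
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    # By convention it is set to 0 when any marginal count is zero.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return f1, mcc


# Hypothetical imbalanced toy sample: 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
f1, mcc = f1_and_mcc(y_true, y_pred)
print(f"F1={f1:.4f}  MCC={mcc:.4f}")
```

Unlike the F-score, MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside the F-score on imbalanced data such as the WD distribution described above.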

Original language	English (US)
Title of host publication	BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop
Editors	Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Publisher	Association for Computational Linguistics (ACL)
Pages	306-312
Number of pages	7
ISBN (Electronic)	9781959429852
State	Published - 2023
Event	22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP 2023 - Toronto, Canada Duration: Jul 13 2023 → …

Publication series

Name	Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print)	0736-587X

Conference

Conference	22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP 2023
Country/Territory	Canada
City	Toronto
Period	7/13/23 → …

ASJC Scopus subject areas

Computer Science Applications
Linguistics and Language
Language and Linguistics

Cite this

Liyanage, C., Garg, M., Mago, V., & Sohn, S. (2023). Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health. In D. Demner-Fushman, S. Ananiadou, & K. Cohen (Eds.), BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop (pp. 306-312). (Proceedings of the Annual Meeting of the Association for Computational Linguistics). Association for Computational Linguistics (ACL).

Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health. / Liyanage, Chandreen; Garg, Muskan; Mago, Vijay et al.
BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop. ed. / Dina Demner-Fushman; Sophia Ananiadou; Kevin Cohen. Association for Computational Linguistics (ACL), 2023. p. 306-312 (Proceedings of the Annual Meeting of the Association for Computational Linguistics).

Liyanage, C, Garg, M, Mago, V & Sohn, S 2023, Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health. in D Demner-Fushman, S Ananiadou & K Cohen (eds), BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop. Proceedings of the Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics (ACL), pp. 306-312, 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP 2023, Toronto, Canada, 7/13/23.

Liyanage C, Garg M, Mago V, Sohn S. Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health. In Demner-Fushman D, Ananiadou S, Cohen K, editors, BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop. Association for Computational Linguistics (ACL). 2023. p. 306-312. (Proceedings of the Annual Meeting of the Association for Computational Linguistics).

Liyanage, Chandreen ; Garg, Muskan ; Mago, Vijay et al. / Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health. BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop. editor / Dina Demner-Fushman ; Sophia Ananiadou ; Kevin Cohen. Association for Computational Linguistics (ACL), 2023. pp. 306-312 (Proceedings of the Annual Meeting of the Association for Computational Linguistics).

@inproceedings{96ea53991dec49aab6c61144aec3203a,
title = "Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health",
abstract = "Amid an ongoing health crisis, there is a growing need to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. As the distribution of WD in social media data is intrinsically imbalanced, we experiment with generative NLP models for data augmentation to enable further improvement in the prescreening task of classifying WD. To this end, we propose a simple yet effective data augmentation approach based on prompt-based generative NLP models, and evaluate the ROUGE scores and syntactic/semantic similarity between existing interpretations and augmented data. Our approach with the ChatGPT model surpasses all the other methods and improves over baselines such as Easy Data Augmentation and Backtranslation. Introducing data augmentation to generate more training samples and a balanced dataset improves the F-score and the Matthews Correlation Coefficient by up to 13.11\% and 15.95\%, respectively.",
author = "Chandreen Liyanage and Muskan Garg and Vijay Mago and Sunghwan Sohn",
note = "Publisher Copyright: {\textcopyright} 2023 Association for Computational Linguistics.; 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP 2023 ; Conference date: 13-07-2023",
year = "2023",
language = "English (US)",
series = "Proceedings of the Annual Meeting of the Association for Computational Linguistics",
publisher = "Association for Computational Linguistics (ACL)",
pages = "306--312",
editor = "Dina Demner-Fushman and Sophia Ananiadou and Kevin Cohen",
booktitle = "BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop",
}

TY - GEN

T1 - Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health

AU - Liyanage, Chandreen

AU - Garg, Muskan

AU - Mago, Vijay

AU - Sohn, Sunghwan

PY - 2023

Y1 - 2023

N2 - Amid an ongoing health crisis, there is a growing need to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. As the distribution of WD in social media data is intrinsically imbalanced, we experiment with generative NLP models for data augmentation to enable further improvement in the prescreening task of classifying WD. To this end, we propose a simple yet effective data augmentation approach based on prompt-based generative NLP models, and evaluate the ROUGE scores and syntactic/semantic similarity between existing interpretations and augmented data. Our approach with the ChatGPT model surpasses all the other methods and improves over baselines such as Easy Data Augmentation and Backtranslation. Introducing data augmentation to generate more training samples and a balanced dataset improves the F-score and the Matthews Correlation Coefficient by up to 13.11% and 15.95%, respectively.

AB - Amid an ongoing health crisis, there is a growing need to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. As the distribution of WD in social media data is intrinsically imbalanced, we experiment with generative NLP models for data augmentation to enable further improvement in the prescreening task of classifying WD. To this end, we propose a simple yet effective data augmentation approach based on prompt-based generative NLP models, and evaluate the ROUGE scores and syntactic/semantic similarity between existing interpretations and augmented data. Our approach with the ChatGPT model surpasses all the other methods and improves over baselines such as Easy Data Augmentation and Backtranslation. Introducing data augmentation to generate more training samples and a balanced dataset improves the F-score and the Matthews Correlation Coefficient by up to 13.11% and 15.95%, respectively.

UR - http://www.scopus.com/inward/record.url?scp=85174509749&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85174509749&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85174509749

T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics

SP - 306

EP - 312

BT - BioNLP 2023 - BioNLP and BioNLP-ST, Proceedings of the Workshop

A2 - Demner-Fushman, Dina

A2 - Ananiadou, Sophia

A2 - Cohen, Kevin

PB - Association for Computational Linguistics (ACL)

T2 - 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, BioNLP 2023

Y2 - 13 July 2023

ER -
