MULTIWD: Multi-label wellness dimensions in social media posts

Muskan Garg; Xingyi Liu; M. S.V.P.J. Sathvik; Shaina Raza; Sunghwan Sohn

doi:10.1016/j.jbi.2024.104586

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Halbert L. Dunn's concept of wellness is a multi-dimensional aspect encompassing social and mental well-being. Neglecting these dimensions over time can have a negative impact on an individual's mental health. The manual efforts employed in in-person therapy sessions reveal that underlying factors of mental disturbance, if triggered, may lead to severe mental health disorders. Objective: In our research, we introduce a fine-grained approach focused on identifying indicators of wellness dimensions and marking their presence in self-narrated human writing on the Reddit social media platform. Design and Method: We present the MULTIWD dataset, a curated collection comprising 3281 instances, as a specifically designed and annotated dataset that facilitates the identification of multiple wellness dimensions in Reddit posts. In our study, we introduce the task of identifying wellness dimensions and utilize state-of-the-art classifiers to solve this multi-label classification task. Results: Our findings highlight the best and comparative performance of fine-tuned large language models with the fine-tuned BERT model. As such, we set BERT as a baseline model to tag wellness dimensions in a user-penned text with an F1 score of 76.69. Conclusion: Our findings underscore the need for trustworthy and domain-specific knowledge infusion to develop more comprehensive and contextually aware AI models for tagging and extracting wellness dimensions.

Original language	English (US)
Article number	104586
Journal	Journal of Biomedical Informatics
Volume	150
DOIs	https://doi.org/10.1016/j.jbi.2024.104586
State	Published - Feb 2024

Keywords

Dataset
Mental health
Multi-label classification
Wellness dimensions

ASJC Scopus subject areas

Health Informatics
Computer Science Applications

Cite this

@article{406fe09e6b8645f6a564dbb54a4903c8,

title = "MULTIWD: Multi-label wellness dimensions in social media posts",

abstract = "Background: Halbert L. Dunn's concept of wellness is a multi-dimensional aspect encompassing social and mental well-being. Neglecting these dimensions over time can have a negative impact on an individual's mental health. The manual efforts employed in in-person therapy sessions reveal that underlying factors of mental disturbance if triggered, may lead to severe mental health disorders. Objective: In our research, we introduce a fine-grained approach focused on identifying indicators of wellness dimensions and mark their presence in self-narrated human-writings on Reddit social media platform. Design and Method: We present the MULTIWD dataset, a curated collection comprising 3281 instances, as a specifically designed and annotated dataset that facilitates the identification of multiple wellness dimensions in Reddit posts. In our study, we introduce the task of identifying wellness dimensions and utilize state-of-the-art classifiers to solve this multi-label classification task. Results: Our findings highlights the best and comparative performance of fine-tuned large language models with fine-tuned BERT model. As such, we set BERT as a baseline model to tag wellness dimensions in a user-penned text with F1 score of 76.69. Conclusion: Our findings underscore the need of trustworthy and domain-specific knowledge infusion to develop more comprehensive and contextually-aware AI models for tagging and extracting wellness dimensions.",

keywords = "Dataset, Mental health, Multi-label classification, Wellness dimensions",

author = "Muskan Garg and Xingyi Liu and Sathvik, {M. S.V.P.J.} and Shaina Raza and Sunghwan Sohn",

note = "Publisher Copyright: {\textcopyright} 2024",

year = "2024",

month = feb,

doi = "10.1016/j.jbi.2024.104586",

language = "English (US)",

volume = "150",

journal = "Journal of Biomedical Informatics",

issn = "1532-0464",

publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - MULTIWD

T2 - Multi-label wellness dimensions in social media posts

AU - Garg, Muskan

AU - Liu, Xingyi

AU - Sathvik, M. S.V.P.J.

AU - Raza, Shaina

AU - Sohn, Sunghwan

PY - 2024/2

Y1 - 2024/2

N2 - Background: Halbert L. Dunn's concept of wellness is a multi-dimensional aspect encompassing social and mental well-being. Neglecting these dimensions over time can have a negative impact on an individual's mental health. The manual efforts employed in in-person therapy sessions reveal that underlying factors of mental disturbance if triggered, may lead to severe mental health disorders. Objective: In our research, we introduce a fine-grained approach focused on identifying indicators of wellness dimensions and mark their presence in self-narrated human-writings on Reddit social media platform. Design and Method: We present the MULTIWD dataset, a curated collection comprising 3281 instances, as a specifically designed and annotated dataset that facilitates the identification of multiple wellness dimensions in Reddit posts. In our study, we introduce the task of identifying wellness dimensions and utilize state-of-the-art classifiers to solve this multi-label classification task. Results: Our findings highlights the best and comparative performance of fine-tuned large language models with fine-tuned BERT model. As such, we set BERT as a baseline model to tag wellness dimensions in a user-penned text with F1 score of 76.69. Conclusion: Our findings underscore the need of trustworthy and domain-specific knowledge infusion to develop more comprehensive and contextually-aware AI models for tagging and extracting wellness dimensions.

KW - Dataset

KW - Mental health

KW - Multi-label classification

KW - Wellness dimensions

UR - http://www.scopus.com/inward/record.url?scp=85183330600&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85183330600&partnerID=8YFLogxK

U2 - 10.1016/j.jbi.2024.104586

DO - 10.1016/j.jbi.2024.104586

M3 - Article

C2 - 38191011

AN - SCOPUS:85183330600

SN - 1532-0464

VL - 150

JO - Journal of Biomedical Informatics

JF - Journal of Biomedical Informatics

M1 - 104586

ER -
