Abstract
Binge drinking is a severe health problem faced by many US colleges and universities. College students often post drinking-related text and images on social media, portraying their alcohol use as socially desirable. In this project, we investigated the feasibility of mining heterogeneous data (e.g. text, images, and videos) on Facebook to identify drinking-related content. We manually annotated 4266 posts made between 21 October 2011 and 3 November 2014 in the “I’m Shmacked” group on Facebook, of which 511 were drinking-related. Our machine learning models show that, by combining heterogeneous data types, we were able to identify drinking-related posts with an F1-score of 0.81. Prediction models built on text data were more reliable than those built on image and video data for predicting drinking-related content. As the first step of our efforts in this direction, this feasibility study showed promise toward unleashing the potential of mining social media to identify students who binge drink.
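To make the reported metric concrete, the following is a minimal sketch of how an F1-score is computed for the positive (drinking-related) class, and of how imbalanced the annotated set is. Only the counts 4266 and 511 come from the abstract; the function name `f1_score`, the toy label lists, and the all-negative baseline are hypothetical illustrations, not the study's code or data.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (1 = drinking-related, 0 = not)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts stated in the abstract.
total_posts = 4266
drinking_posts = 511
positive_share = drinking_posts / total_posts  # roughly 12% of posts

# Hypothetical baseline: label every post as not drinking-related.
all_negative_accuracy = 1 - positive_share
all_negative_f1 = f1_score([1] * drinking_posts + [0] * (total_posts - drinking_posts),
                           [0] * total_posts)

# Hypothetical toy labels, for illustration only.
toy_true = [1, 0, 0, 1, 0, 0, 0, 1]
toy_pred = [1, 0, 1, 1, 0, 0, 0, 0]
toy_f1 = f1_score(toy_true, toy_pred)

print(f"positive share: {positive_share:.3f}")
print(f"all-negative baseline: accuracy={all_negative_accuracy:.3f}, F1={all_negative_f1:.1f}")
print(f"toy F1: {toy_f1:.3f}")
```

Because only about 12% of the annotated posts are drinking-related, a classifier that never flags a post would still score high on accuracy while its F1 on the positive class is zero, which is one plausible reason to report F1 for a data set like this.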
Original language | English (US) |
---|---|
Pages (from-to) | 1756-1767 |
Number of pages | 12 |
Journal | Health Informatics Journal |
Volume | 25 |
Issue number | 4 |
State | Published - Dec 1 2019 |
Keywords
- binge drinking
- image classification
- machine learning
- social media
- text mining
- video classification
ASJC Scopus subject areas
- Health Informatics