A Semantically Enhanced Approach to Identify Depression-Indicative Symptoms Using Twitter Data

Title: A Semantically Enhanced Approach to Identify Depression-Indicative Symptoms Using Twitter Data
Publication Type: Thesis
Year of Publication: 2018
Authors: Ankita Saxena
Academic Department: Department of Engineering & Computer Science
Date Published: 05/2018
University: Wright State University
Thesis Type: M.S. Thesis

According to the World Health Organization, more than 300 million people worldwide suffer from Major Depressive Disorder (MDD). The PHQ-9 questionnaire is used clinically to screen for and diagnose MDD and to identify its severity. With the unprecedented growth and enthusiastic acceptance of social media such as Twitter, a large number of people have come to share their feelings and emotions on it openly. Each tweet can indicate a user’s opinion, thought, or feeling, and a single tweet can indicate multiple PHQ-9 symptoms. Identifying the PHQ-9 symptoms indicated by a tweet can provide crucial information for a user’s depression diagnosis. The current state-of-the-art approach, which uses supervised machine learning to classify tweets by PHQ-9 symptom, relies on explicit references to a particular symptom; that is, its feature representation is based on exact string matching. This reliance on explicit references falls short when classifying tweets that indicate one or more PHQ-9 symptoms only implicitly. This thesis proposes a semantically enhanced approach that considers both explicit and implicit depression-indicative symptoms. We better capture the semantics of a word in a tweet, as it relates to depression, by using the context provided by its surrounding words, obtained from a Word2Vec model trained on a corpus of ~3 million tweets. Using a two-stage (binary class followed by multi-label) classification model, we demonstrate that our approach outperforms the baseline model for depression-indicative symptoms by around 20% on F-measure. We further evaluate our semantically enhanced approach by using it to fill in the PHQ-9 questionnaire and identify the severity of depression according to standard guidelines, on a dataset of 932,108 self-reported users.
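The final step described above, turning a filled-in PHQ-9 questionnaire into a severity level, can be illustrated with a minimal sketch. It assumes the nine item scores (each 0 to 3) are already available and applies the standard PHQ-9 severity bands on the 0 to 27 total. How the thesis derives item scores from tweets is not stated in this abstract, and the names `SEVERITY_BANDS` and `phq9_severity` are hypothetical, introduced only for illustration.

```python
from typing import Sequence, Tuple

# Standard PHQ-9 severity bands as (inclusive upper bound of total score, label).
# The total is the sum of nine item scores, each in 0..3, so it ranges over 0..27.
SEVERITY_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]


def phq9_severity(item_scores: Sequence[int]) -> Tuple[int, str]:
    """Return (total score, severity label) for nine PHQ-9 item scores.

    Assumption: item scores are given; deriving them from tweets is
    outside the scope of this sketch.
    """
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 has exactly nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item score must be 0, 1, 2, or 3")

    total = sum(item_scores)
    for upper_bound, label in SEVERITY_BANDS:
        if total <= upper_bound:
            return total, label
    # Not reachable: validated totals never exceed 27.
    raise AssertionError("total score out of range")
```

For example, a user whose nine items are scored `[2, 2, 2, 2, 2, 0, 0, 0, 0]` has a total of 10, which falls in the moderate band.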

Full Text

SAXENA, ANKITA. M.S., Department of Computer Science and Engineering, Wright State University, 2018. A Semantically Enhanced Approach to Identify Depression-Indicative Symptoms Using Twitter Data.