According to the World Health Organization, more than 300 million people worldwide suffer from Major Depressive Disorder (MDD). The PHQ-9 questionnaire is used clinically to screen for and diagnose MDD and to assess its severity. With the unprecedented growth and enthusiastic acceptance of social media such as Twitter, a large number of people have come to share their feelings and emotions openly on these platforms. Each tweet can indicate a user's opinion, thought, or feeling, and a single tweet can also indicate multiple PHQ-9 symptoms. Identifying the PHQ-9 symptoms indicated by a tweet can therefore provide crucial information for a user's depression diagnosis. The current state-of-the-art approach uses supervised machine learning to classify a tweet with respect to PHQ-9 symptoms, but it relies on explicit reference to a particular symptom, i.e., it uses a feature representation based on exact string matching. This explicit-reference approach falls short when classifying tweets that indicate a symptom only implicitly, which occurs for several of the PHQ-9 symptoms. This thesis proposes a semantically enhanced approach that considers explicit as well as implicit depression-indicative symptoms. We better capture the semantics of a word in a tweet as it relates to depression by using the context provided by its surrounding words, obtained from a Word2Vec model trained on a corpus of approximately 3 million tweets. Using a two-stage classification model (binary-class followed by multi-label), we demonstrate that our approach outperforms the baseline model for depression-indicative symptoms by around 20% in F-measure. We further evaluated our semantically enhanced approach by using it to fill in the PHQ-9 questionnaire and to identify the severity of depression according to standard guidelines, on a dataset of 932,108 self-reported users.
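The final severity step can be illustrated with a minimal sketch of standard PHQ-9 scoring: nine items, each scored 0 to 3, summed to a total between 0 and 27 and mapped to the usual severity bands. The function name `phq9_severity` and the input format are assumptions made for illustration; how per-tweet symptom predictions are aggregated into per-item scores is not described here, so the sketch takes the nine item scores as given.

```python
from typing import Sequence, Tuple

# Standard PHQ-9 severity bands as (inclusive upper bound, label).
SEVERITY_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]


def phq9_severity(item_scores: Sequence[int]) -> Tuple[int, str]:
    """Return (total score, severity label) for nine PHQ-9 item scores.

    Hypothetical helper: each item score is assumed to already be an
    integer in 0..3, however it was derived.
    """
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 has exactly nine items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item score must be an integer from 0 to 3")

    total = sum(item_scores)
    for upper_bound, label in SEVERITY_BANDS:
        if total <= upper_bound:
            return total, label
    # Unreachable: the maximum possible total is 27.
    raise AssertionError("total score out of range")
```

For example, a user whose nine item scores are all 1 has a total of 9, which falls in the mild band.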