Joint Dynamic Online Social Network Analytics Using Network, Content and User Characteristics

Title: Joint Dynamic Online Social Network Analytics Using Network, Content and User Characteristics
Publication Type: Thesis
Year of Publication: 2015
Authors: Yiye Ruan
Academic Department: Computer Science and Engineering
Number of Pages: 206
Date Published: 05/2015
University: Ohio State University
Keywords: community detection, Data Mining, Graph mining, online social networks, Sentiment Analysis, structural role detection

Online social networks (OSNs) allow Internet users all over the globe to share information, exchange thoughts, and work collaboratively. OSNs not only provide a channel for broadcasting real-world events as they unfold, but also give users a convenient way to exchange experiences and opinions. Understanding the relation among network topology, users, content, and their dynamics can have significant impact from both a theoretical and a practical standpoint, for instance in understanding online user behavior and predicting future online activity.

In this dissertation, we study the interplay of three important factors that encode most of the OSN dynamics: network structure, user-generated content, and user characteristics. We first present our broader contribution to computer science: the development of two novel graph algorithms for community detection and structural role detection, which scale to networks containing millions of nodes and edges. Both community and role assignments of nodes generate novel clusterings of OSN users and provide valuable insights into OSN activities, but they are often implicit or even unknown to OSN analysts. We bridge this chasm by designing algorithms that can automatically infer community and role information in large-scale OSN data. Our algorithms are (1) robust in the presence of noise in real-world data, and (2) efficient in processing large network datasets. A key element of both contributions is a practical approach to network sparsification, which enables efficient processing. Evaluated on various social networks containing hundreds of millions of edges, our algorithms outperform state-of-the-art approaches in their ability to recover ground truth communities and roles of OSN users. By augmenting the network structure with content information and performing joint inference, our algorithms are able to combat the impact of noise. At the same time, careful design and optimization render our algorithms highly efficient compared with existing approaches, and even yield non-trivial speedups on some networks.

We then investigate three analytical tasks on OSN activities from the perspective of a user: (1) predicting user engagement in online discussion, (2) understanding the divergence of user-generated content, and (3) identifying patterns in the shift of user sentiment over time. Underpinning this effort are scalable mechanisms to infer important topological characteristics of such networks, including community affiliation and structural roles, as discussed above. Experiments with large-scale datasets constructed from real OSNs show that our approaches, which incorporate information on network, content, and users, demonstrate significant improvements over existing work that focuses on only a single aspect. More importantly, the findings from our studies on large-scale OSN data often reflect similar phenomena observed in social networks in the traditional face-to-face setting, making it promising to apply these quantitative approaches to the analysis of a broader spectrum of social networks.