Abstract:
Analyzing Twitter data has become increasingly important for understanding the dynamics of online discourse. This research introduces a novel approach for tracking the spatial and temporal evolution of topics in Twitter data. Using the spatial and temporal labels that Twitter attaches to tweets, we propose the Clustered Biterm Topic Model, which combines the Biterm Topic Model with K-Medoids clustering to uncover how topics develop over space and time. To improve the model's accuracy and applicability, we introduce a covariate-dependent matrix that incorporates covariate information and geographic proximity into the dissimilarity matrix used by K-Medoids clustering. By accounting for both the semantic relationships between topics and the contextual information carried by covariates and geographic proximity, the model captures how topics emerge and evolve across regions and timeframes on Twitter. The Clustered Biterm Topic Model gives researchers, policymakers, and businesses a tool for gaining deeper insight into online conversations, which are inherently shaped by space and time.
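
As a rough illustration of how covariate information and geographic proximity might enter the dissimilarity matrix used by K-Medoids clustering, the Python sketch below blends three pairwise distances into one matrix and clusters on it. Everything specific here is an assumption made for illustration and is not taken from the model itself: the Jensen-Shannon distance between topic-word distributions, the Euclidean distance between covariate vectors, the haversine distance between coordinates, the max-scaling, the weights `w`, the simple alternating K-Medoids variant with farthest-point initialization, and all function names and toy data.

```python
import numpy as np

def js_distance(p, q):
    """Jensen-Shannon distance (base 2, in [0, 1]) between two distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return np.sqrt(max(0.5 * kl(p, m) + 0.5 * kl(q, m), 0.0))

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (a[0], a[1], b[0], b[1]))
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(h))

def combined_dissimilarity(topics, covariates, coords, w=(0.5, 0.25, 0.25)):
    """Weighted sum of max-scaled topic, covariate, and geographic distances.

    The weights and the additive form are assumptions for this sketch.
    """
    covariates = np.asarray(covariates, dtype=float)
    n = len(topics)
    d_topic, d_cov, d_geo = (np.zeros((n, n)) for _ in range(3))
    for i in range(n):
        for j in range(i + 1, n):
            d_topic[i, j] = d_topic[j, i] = js_distance(topics[i], topics[j])
            d_cov[i, j] = d_cov[j, i] = np.linalg.norm(covariates[i] - covariates[j])
            d_geo[i, j] = d_geo[j, i] = haversine_km(coords[i], coords[j])

    def scale(d):
        return d / d.max() if d.max() > 0 else d

    return w[0] * scale(d_topic) + w[1] * scale(d_cov) + w[2] * scale(d_geo)

def k_medoids(D, k, n_iter=100):
    """Alternating K-Medoids on a precomputed dissimilarity matrix D."""
    # Farthest-point initialization: start from the most central point,
    # then repeatedly add the point farthest from the current medoids.
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        medoids.append(int(np.argmax(D[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member with the smallest total distance
                # to the other members of its cluster.
                costs = D[np.ix_(members, members)].sum(axis=0)
                new_medoids[c] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

# Toy data (hypothetical): four topic-word distributions, one covariate each,
# and (lat, lon) coordinates near New York and Los Angeles.
topics = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
covariates = [[0.0], [0.1], [1.0], [0.9]]
coords = [(40.7, -74.0), (40.8, -74.1), (34.0, -118.2), (34.1, -118.3)]

D = combined_dissimilarity(topics, covariates, coords)
medoids, labels = k_medoids(D, k=2)
```

On this toy input the first two items and the last two items fall into separate clusters, since they differ on all three components at once. Setting a weight in `w` to zero removes that component, which is one simple way to see how much the covariate or geographic term changes the grouping.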