Main article

Michael Anderson
Department of Computer Science, Stanford University, Stanford, CA, USA
Sarah Mitchell*
School of Engineering, University of Michigan, Ann Arbor, MI, USA
sarah.mitchell@umich.edu
David Thompson
Department of Computer Science, Stanford University, Stanford, CA, USA

DOI: https://doi.org/10.63646/

Abstract

Crisis events concentrate the conditions under which rumors thrive: high uncertainty, intense emotion, and an accelerated demand for information that official channels cannot immediately satisfy. Although a number of valuable public corpora capture fragments of this phenomenon, they were built for different tasks, follow incompatible schemas, use divergent label vocabularies, and rarely preserve the full propagation structure that diffusion analytics requires. This article presents RumorCrisisDB, a relational database design and construction framework that integrates heterogeneous crisis-rumor resources into a single event-centric, cascade-preserving, and annotation-harmonized data model. We first analyze the gap left by existing resources and articulate four use cases that an integrated database must serve: diffusion measurement, detection benchmarking, intervention evaluation, and longitudinal crisis comparison. We then specify three components: a six-entity schema; a six-stage construction pipeline that covers re-collection, normalization, linkage, and label harmonization; and the quality-control procedures attached to each pipeline stage. The analytics layer is validated through controlled stochastic experiments: Galton–Watson cascade simulations reproduce the heavy-tailed size distributions reported for empirical rumor cascades, and Maki–Thompson-style spreading experiments quantify how the timing of debunking responses changes peak rumor prevalence, with early intervention reducing the simulated peak by more than half relative to a late response. The article closes with the reproducibility and open-access protocol, built on identifier-based redistribution, FAIR principles, and datasheet documentation, together with an explicit account of the design’s limitations.
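
As a rough illustration of the Galton–Watson cascade simulations mentioned in the abstract, the following minimal sketch draws cascade sizes from a branching process. The Poisson offspring distribution, the mean of 0.9 reshares per post, the size cap, and all function names are illustrative assumptions, not the configuration used in the article.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def cascade_size(rng, mean_offspring, max_size=100_000):
    """Total number of posts in one Galton-Watson cascade, counting the root post."""
    size = 1
    active = 1  # posts in the current generation that may still be reshared
    while active > 0 and size < max_size:
        children = sum(sample_poisson(rng, mean_offspring) for _ in range(active))
        size += children
        active = children
    return size


def simulate(n_cascades=10_000, mean_offspring=0.9, seed=0):
    """Simulate independent cascades; a mean below 1 is assumed so every cascade dies out."""
    rng = random.Random(seed)
    return [cascade_size(rng, mean_offspring) for _ in range(n_cascades)]


sizes = simulate()
ordered = sorted(sizes)
print("median size:", ordered[len(ordered) // 2])
print("mean size:", sum(sizes) / len(sizes))
print("largest size:", ordered[-1])
```

Under these assumptions most cascades stop after one or two posts while a few grow far larger, which is the qualitative heavy-tailed pattern the abstract refers to. The expected cascade size of a subcritical process with mean offspring m is 1 / (1 - m), so the sample mean should land near 10 for m = 0.9.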

Article details

How to Cite

Anderson, M., Mitchell, S., & Thompson, D. (2025). RumorCrisisDB: A Social-Media Crisis Rumor Database for Misinformation Diffusion Analytics. DATAMIND, 3(4), 5-28. https://doi.org/10.63646/