Specious Sites: Tracking the Spread and Sway of Spurious News Stories at Scale


Recommended citation: Hans WA Hanley, Deepak Kumar, and Zakir Durumeric. "Specious Sites: Tracking the Spread and Sway of Spurious News Stories at Scale." (2023).

Misinformation, propaganda, and outright lies proliferate on the web, and some of these narratives have dangerous real-world consequences for public health, elections, and individual safety. Yet despite the impact that misinformation has on online ecosystems, the research community largely lacks automated, programmatic approaches for tracking narratives across different platforms. In this work, using daily scrapes of 1,633 unreliable news websites, the large language model MPNet, and DP-Means clustering, we build a system that automatically isolates and analyzes the narratives spreading within online ecosystems. Identifying 56,112 distinct narratives across these 1,633 websites, we describe the most prevalent narratives spread in 2022 and identify the most pivotal websites that originate and amplify narratives. Finally, we show how our system can be used to detect new narratives emerging from unreliable news websites and to evaluate the efficacy of Politifact, Reuters, and AP News in fact-checking these narratives. Because a given story can spread for over three months before being fact-checked, our work shows the need for, and provides, a system to reliably, effectively, and continuously track narratives online.
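
To illustrate the clustering step named in the abstract, the following is a minimal sketch of the standard DP-Means procedure, not the authors' implementation. It assumes squared Euclidean distance, and toy 2-D vectors stand in for MPNet text embeddings. The function name `dp_means` and the threshold parameter `lam` are hypothetical, and the abstract does not state the threshold or distance measure actually used. The key property is that the number of clusters is not fixed in advance: a point farther than `lam` from every existing centroid starts a new cluster.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Vector]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dp_means(points: List[Vector], lam: float,
             max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Cluster `points`; `lam` is the (assumed) new-cluster distance threshold."""
    centroids = [mean(points)]          # start with one global cluster
    assignments = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [squared_distance(p, c) for c in centroids]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] > lam:
                # Too far from every centroid: open a new cluster at this point.
                centroids.append(list(p))
                k = len(centroids) - 1
            if assignments[i] != k:
                assignments[i] = k
                changed = True
        # Recompute centroids, dropping clusters that lost all their members.
        members = {}
        for i, k in enumerate(assignments):
            members.setdefault(k, []).append(points[i])
        kept = sorted(members)
        relabel = {old: new for new, old in enumerate(kept)}
        centroids = [mean(members[old]) for old in kept]
        assignments = [relabel[k] for k in assignments]
        if not changed:
            break
    return assignments, centroids


# Toy stand-ins for article embeddings: two tight pairs and one outlier.
toy = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]]
labels, centers = dp_means(toy, lam=1.0)
```

In a narrative-tracking setting, each resulting cluster would correspond to one candidate narrative, and a larger `lam` would merge more articles into fewer, broader clusters.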