Paper Title
PoliWAM: An Exploration of a Large Scale Corpus of Political Discussions on WhatsApp Messenger
Authors
Abstract
WhatsApp Messenger is one of the most popular channels for spreading information, with a current reach of more than 2 billion people in over 180 countries. Its widespread usage has made it one of the most popular media for information propagation among the masses during any socially engaging event. In the recent past, several countries have witnessed its effectiveness and influence in political and social campaigns. We observe a sharp surge in information and propaganda flow during election campaigning. In this paper, we explore a high-quality, large-scale, user-generated dataset curated from WhatsApp, comprising 281 groups, 31,078 unique users, and 223,404 messages shared before, during, and after the Indian General Elections 2019, and encompassing all major Indian political parties and leaders. In addition to the raw, noisy user-generated data, we present a fine-grained annotated dataset of 3,848 messages that will be useful for understanding the various dimensions of WhatsApp political campaigning. We present several complementary insights into the investigative and sensational news stories from the same period. Exploratory data analysis and experiments showcase several exciting results and future research opportunities. To facilitate reproducible research, we make the anonymized datasets available in the public domain.