Paper Title

FinnSentiment -- A Finnish Social Media Corpus for Sentiment Polarity Annotation

Authors

Krister Lindén, Tommi Jauhiainen, Sam Hardwick

Abstract

Sentiment analysis and opinion mining are important tasks with obvious application areas in social media, e.g. when indicating hate speech and fake news. In our survey of previous work, we note that there is no large-scale social media data set with sentiment polarity annotations for Finnish. This publication aims to remedy this shortcoming by introducing a 27,000-sentence data set annotated independently with sentiment polarity by three native annotators. We had the same three annotators for the whole data set, which provides a unique opportunity for further studies of annotator behaviour over time. We analyse their inter-annotator agreement and provide two baselines to validate the usefulness of the data set.
