Paper Title

Discriminating Between Similar Nordic Languages

Authors

René Haas, Leon Derczynski

Abstract

Automatic language identification is a challenging problem, and discriminating between closely related languages is especially difficult. This paper presents a machine learning approach to automatic language identification for the Nordic languages, which are often miscategorised by existing state-of-the-art tools. Concretely, we focus on discriminating between six Nordic languages: Danish, Swedish, Norwegian (Nynorsk), Norwegian (Bokmål), Faroese, and Icelandic.
