Paper Title

BERT-Assisted Semantic Annotation Correction for Emotion-Related Questions

Paper Authors

Kazemzadeh, Abe

Paper Abstract

Annotated data have traditionally been used to provide the input for training a supervised machine learning (ML) model. However, current pre-trained ML models for natural language processing (NLP) contain embedded linguistic information that can be used to inform the annotation process. We use the BERT neural language model to feed information back into an annotation task that involves semantic labelling of dialog behavior in a question-asking game called Emotion Twenty Questions (EMO20Q). First, we describe the background of BERT, the EMO20Q data, and assisted annotation tasks. Then we describe the methods for fine-tuning BERT to check the annotated labels. To do this, we use the paraphrase task as a way to check that all utterances with the same annotation label are classified as paraphrases of each other. We show this method to be an effective way to assess and revise annotations of textual user data with complex, utterance-level semantic labels.
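To make the paraphrase-based consistency check concrete, below is a minimal sketch using the Hugging Face transformers library. It is not the authors' implementation: the utterances, label names, and the assumption that the sequence-pair classifier has already been fine-tuned on paraphrase pairs built from the EMO20Q annotations are all hypothetical stand-ins.

```python
# Sketch: flag same-label utterance pairs that a BERT paraphrase classifier
# rejects, as candidate annotation errors for human review.
# Assumes `model` has already been fine-tuned for paraphrase detection
# (a freshly loaded head, as below, would give meaningless scores).
import itertools

import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Hypothetical annotated utterances: (utterance, semantic dialog-act label).
annotated = [
    ("is it a positive emotion?", "valence-question"),
    ("does it feel good?", "valence-question"),
    ("is it anger?", "emotion-guess"),
]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Two classes: 0 = not paraphrases, 1 = paraphrases.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
model.eval()

def paraphrase_prob(u1: str, u2: str) -> float:
    """Probability that two utterances are paraphrases under the model."""
    inputs = tokenizer(u1, u2, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

# Same-label pairs that the model does NOT judge to be paraphrases are
# the ones routed back to the annotator as possible labelling mistakes.
for (u1, l1), (u2, l2) in itertools.combinations(annotated, 2):
    if l1 == l2 and paraphrase_prob(u1, u2) < 0.5:
        print(f"Possible mislabel ({l1}): {u1!r} vs {u2!r}")
```

In the same spirit, pairs of utterances with different labels can supply negative examples when fine-tuning the paraphrase classifier in the first place.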
