Paper Title
An Interpretable and Uncertainty Aware Multi-Task Framework for Multi-Aspect Sentiment Analysis
Paper Authors
Paper Abstract
In recent years, several online platforms have seen a rapid increase in the number of review systems that ask users to provide aspect-level feedback. Document-level Multi-aspect Sentiment Classification (DMSC), where the goal is to predict the ratings/sentiment of a review at the level of individual aspects, has become a challenging and pressing problem. To tackle this challenge, we propose a deliberate self-attention-based deep neural network model, namely FEDAR, for the DMSC problem, which can achieve competitive performance while also being able to interpret its predictions. FEDAR is equipped with a highway word embedding layer to transfer knowledge from pre-trained word embeddings, an RNN encoder layer with output features enriched by pooling and factorization techniques, and a deliberate self-attention layer. In addition, we propose an Attention-driven Keywords Ranking (AKR) method, which can automatically discover aspect keywords and aspect-level opinion keywords from the review corpus based on the attention weights. These keywords are significant for rating predictions by FEDAR. Since crowdsourcing annotation can be an alternative way to recover missing ratings of reviews, we propose a LEcture-AuDience (LEAD) strategy to estimate model uncertainty in the context of multi-task learning, so that valuable human resources can focus on the most uncertain predictions. Our extensive experiments on five different open-domain DMSC datasets demonstrate the superiority of the proposed FEDAR and LEAD models. We further introduce two new DMSC datasets in the healthcare domain and benchmark different baseline models and our models on them. Visualizations of the attention weights and of the aspect and opinion keywords demonstrate the interpretability of our model and the effectiveness of our AKR method.
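The abstract says that AKR discovers keywords from the review corpus based on attention weights, but it does not give the scoring rule. The following is only a minimal sketch of that general idea, not the authors' method: it assumes each review comes with one attention weight per token for a single aspect, and it ranks words by their mean weight across the corpus. The function name `rank_keywords`, the `min_count` filter, and the toy tokens and weights are all hypothetical.

```python
from collections import defaultdict


def rank_keywords(reviews, attention, min_count=1, top_k=5):
    """Rank words by mean attention weight over a corpus (illustrative only).

    reviews:   list of token lists, one per review
    attention: list of weight lists aligned with `reviews`; assumed to be
               the per-token attention weights for ONE aspect
    min_count: ignore words seen fewer than this many times (an assumption,
               added so rare words do not dominate the ranking)
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for tokens, weights in zip(reviews, attention):
        if len(tokens) != len(weights):
            raise ValueError("tokens and weights must have the same length")
        for tok, w in zip(tokens, weights):
            total[tok] += w
            count[tok] += 1
    # Mean attention per word, keeping only sufficiently frequent words.
    scores = {t: total[t] / count[t] for t in total if count[t] >= min_count}
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy corpus with made-up weights for a hypothetical "cleanliness" aspect.
reviews = [["the", "room", "was", "clean"], ["clean", "room", "rude", "staff"]]
attention = [[0.05, 0.40, 0.05, 0.50], [0.45, 0.35, 0.10, 0.10]]
ranked = rank_keywords(reviews, attention, top_k=2)
print([word for word, _ in ranked])
```

On this toy input the two top-ranked words are "clean" and "room". Averaging is just one plausible aggregation; the method described in the paper may weight or normalize the attention scores differently.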