Paper Title

TransModality: An End2End Fusion Method with Transformer for Multimodal Sentiment Analysis

Paper Authors

Zilong Wang, Zhaohong Wan, Xiaojun Wan

Paper Abstract

Multimodal sentiment analysis is an important research area that predicts a speaker's sentiment tendency from features extracted from the textual, visual, and acoustic modalities. The central challenge is the fusion of multimodal information. A variety of fusion methods have been proposed, but few of them adopt end-to-end translation models to mine the subtle correlations between modalities. Inspired by the recent success of the Transformer in machine translation, we propose a new fusion method, TransModality, to address the task of multimodal sentiment analysis. We assume that translation between modalities contributes to a better joint representation of a speaker's utterance. With the Transformer, the learned features embody information from both the source modality and the target modality. We validate our model on multiple multimodal datasets: CMU-MOSI, MELD, and IEMOCAP. The experiments show that our proposed method achieves state-of-the-art performance.
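
Below is a minimal, illustrative PyTorch sketch of the core idea described in the abstract: using a Transformer to "translate" one modality's feature sequence toward another modality's space, so that the fused representation carries information from both the source and the target modality. This is not the authors' released implementation; the module name CrossModalTranslator, all feature dimensions, layer sizes, and the pooling/classification head are assumptions made purely for illustration.

```python
# Minimal, illustrative sketch (not the authors' implementation): fusing two
# modalities by translating a source-modality sequence into a target-modality
# space with a Transformer. All dimensions and names below are assumptions.
import torch
import torch.nn as nn


class CrossModalTranslator(nn.Module):
    """Translates a source sequence (e.g. acoustic features) toward a target
    sequence (e.g. textual features); the decoder output then embodies
    information from both modalities and is used for sentiment prediction."""

    def __init__(self, src_dim=74, tgt_dim=300, d_model=128, num_classes=2):
        super().__init__()
        self.src_proj = nn.Linear(src_dim, d_model)  # project source modality
        self.tgt_proj = nn.Linear(tgt_dim, d_model)  # project target modality
        self.transformer = nn.Transformer(
            d_model=d_model,
            nhead=4,
            num_encoder_layers=2,
            num_decoder_layers=2,
            batch_first=True,
        )
        self.classifier = nn.Linear(d_model, num_classes)  # sentiment head

    def forward(self, src_seq, tgt_seq):
        # src_seq: (batch, src_len, src_dim); tgt_seq: (batch, tgt_len, tgt_dim)
        fused = self.transformer(self.src_proj(src_seq), self.tgt_proj(tgt_seq))
        # Mean-pool the fused utterance representation and predict sentiment.
        return self.classifier(fused.mean(dim=1))


# Toy usage with random tensors standing in for real utterance features.
model = CrossModalTranslator()
acoustic = torch.randn(8, 20, 74)   # hypothetical acoustic feature sequences
text = torch.randn(8, 30, 300)      # hypothetical word-embedding sequences
logits = model(acoustic, text)      # shape: (8, num_classes)
print(logits.shape)
```

In this sketch the same module could be instantiated for different modality pairs (e.g. acoustic-to-text and visual-to-text) and the resulting fused representations combined before classification; the exact pairing and combination strategy used in the paper should be taken from the original publication rather than from this example.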
