CDialog: A Multi-turn Covid-19 Conversation Dataset for Entity-Aware Dialog Generation

Paper Title

CDialog: A Multi-turn Covid-19 Conversation Dataset for Entity-Aware Dialog Generation

Authors

Varshney, Deeksha; Zafar, Aizan; Behra, Niranshu Kumar; Ekbal, Asif

Abstract

The development of conversational agents that interact with patients and deliver clinical advice has attracted the interest of many researchers, particularly in light of the COVID-19 pandemic. The training of end-to-end neural dialog systems, however, is hampered by a lack of multi-turn medical dialog corpora. We make the first attempt to release a high-quality multi-turn medical dialog dataset relating to COVID-19, named CDialog, with over 1K conversations collected from online medical counselling websites. We annotate each utterance of a conversation with seven categories of medical entities as additional labels: diseases, symptoms, medical tests, medical history, remedies, medications, and other aspects. Finally, we propose a novel neural medical dialog system based on the CDialog dataset to advance future research on automated medical dialog systems. We use pre-trained language models for dialog generation, incorporating the annotated medical entities, to generate a virtual doctor's response that addresses the patient's query. Experimental results show that the proposed dialog models perform better when supplemented with entity information, which can therefore improve response quality.
