Paper Title
Methods for Computing Legal Document Similarity: A Comparative Study
Authors
Abstract
Computing the similarity between two legal documents is an important and challenging task in Legal Information Retrieval. Finding similar legal documents has many downstream applications, including prior-case retrieval and recommendation of legal articles. Prior work has proposed two broad ways of measuring similarity between legal documents: analyzing the precedent citation network, and applying similarity measures to the textual content. However, these existing methods have not been comprehensively compared on a common platform. In this paper, we perform the first systematic analysis of the existing methods. In addition, we explore two promising new similarity computation methods, one text-based and the other based on network embeddings, which have not been considered until now.
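To make the two broad families named in the abstract concrete, the following is a minimal sketch, not the measures evaluated in the paper. It assumes Jaccard overlap of cited precedents as a stand-in for a citation-network measure and cosine similarity over raw term frequencies as a stand-in for a text-based measure. The case names, citation sets, and texts are hypothetical toy data.

```python
import math
from collections import Counter


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_tf(text_a: str, text_b: str) -> float:
    """Cosine similarity of raw term-frequency vectors (whitespace tokens)."""
    tf_a = Counter(text_a.lower().split())
    tf_b = Counter(text_b.lower().split())
    # Only terms present in both documents contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy data (assumption): the precedents each case cites.
citations = {
    "case_A": {"P1", "P2", "P3"},
    "case_B": {"P2", "P3", "P4"},
}
# Hypothetical toy data (assumption): a short text excerpt per case.
texts = {
    "case_A": "the appellant challenged the order of the high court",
    "case_B": "the appellant challenged the judgment of the trial court",
}

# Network-style similarity: two cases are similar if they cite the same precedents.
network_sim = jaccard(citations["case_A"], citations["case_B"])
# Text-style similarity: two cases are similar if they use the same words.
text_sim = cosine_tf(texts["case_A"], texts["case_B"])

print(f"network similarity: {network_sim:.3f}")
print(f"text similarity:    {text_sim:.3f}")
```

The two scores rely on different evidence (shared citations versus shared vocabulary), so they can disagree on the same pair of documents. This is one reason a comparison on a common platform is informative.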