Paper title
A Fast Fully Octave Convolutional Neural Network for Document Image Segmentation
Paper authors
Paper abstract
Know Your Customer (KYC) and Anti-Money Laundering (AML) are worldwide practices for online customer identification based on personal identification documents, similarity and liveness checking, and proof of address. To answer the basic regulatory question "are you who you say you are?", the customer needs to upload valid identification documents (ID). This task poses several computational challenges, since these documents are diverse and may present different and complex backgrounds, some occlusion, partial rotation, poor quality, or damage. Advanced text and document segmentation algorithms were used to process the ID images. In this context, we investigated a method based on U-Net to detect the document edges and text regions in ID images. Despite its promising results on image segmentation, the U-Net based approach is computationally expensive for a real application, since the image segmentation is a customer device task. We propose a model optimization based on Octave Convolutions to qualify the method for situations where storage, processing, and time resources are limited, such as in mobile and robotic applications. We conducted the evaluation experiments on two new datasets, CDPhotoDataset and DTDDataset, which are composed of real ID images of Brazilian documents. Our results showed that the proposed models are efficient for document segmentation tasks and portable.
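To illustrate why Octave Convolutions can reduce the processing cost mentioned in the abstract, the following minimal sketch estimates the multiply-add cost of one octave convolution relative to a plain convolution of the same size. It rests on assumptions that are not stated in the abstract: a single ratio `alpha` of low-frequency channels shared by input and output, low-frequency feature maps stored at half the height and width, and pooling and upsampling costs ignored. The function name `octave_conv_cost_ratio` is hypothetical, and the sketch follows the general octave convolution formulation rather than the authors' specific configuration.

```python
def octave_conv_cost_ratio(alpha: float) -> float:
    """Estimate the multiply-add cost of an octave convolution relative to
    a plain convolution with the same kernel size and channel counts.

    alpha is the assumed fraction of channels kept at low frequency
    (half height, half width), shared by the input and the output.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")

    high = 1.0 - alpha  # fraction of full-resolution channels
    low = alpha         # fraction of half-resolution channels

    # A path that runs at half resolution touches 1/4 as many positions.
    high_to_high = high * high         # full resolution
    high_to_low = high * low / 4.0     # convolved after 2x pooling
    low_to_high = low * high / 4.0     # convolved before 2x upsampling
    low_to_low = low * low / 4.0       # half resolution throughout

    return high_to_high + high_to_low + low_to_high + low_to_low


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5, 0.75):
        ratio = octave_conv_cost_ratio(alpha)
        print(f"alpha={alpha:.2f} -> relative cost {ratio:.4f}")
```

Under these assumptions, setting `alpha` to 0 recovers the cost of a plain convolution, and larger values of `alpha` shift more of the computation to the cheaper half-resolution paths.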