A Lexicon and Depth-wise Separable Convolution Based Handwritten Text Recognition System

Paper Title

A Lexicon and Depth-wise Separable Convolution Based Handwritten Text Recognition System

Authors

Lalita Kumari, Sukhdeep Singh, VVS Rathore, Anuj Sharma

Abstract

Cursive handwritten text recognition is a challenging research problem in the domain of pattern recognition. Current state-of-the-art approaches include models based on convolutional recurrent neural networks and multi-dimensional long short-term memory recurrent neural networks. These methods are computationally expensive, and the models are complex at the design level. In recent studies, models combining convolutional neural networks with gated convolutional neural networks have required fewer parameters than models based on convolutional recurrent neural networks. To further reduce the total number of trainable parameters, in this work we use depthwise convolution in place of standard convolution, in combination with a gated convolutional neural network and a bidirectional gated recurrent unit. Additionally, we include a lexicon-based word beam search decoder at the testing step, which also helps improve the overall accuracy of the model. We obtain a 3.84% character error rate and a 9.40% word error rate on the IAM dataset, and a 4.88% character error rate and a 14.56% word error rate on the George Washington dataset.
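The parameter saving from swapping a standard convolution for a depthwise separable one can be illustrated with a short calculation. The sketch below is not taken from the paper: the layer sizes (64 input channels, 128 output channels, 3x3 kernel) and the function names are assumptions chosen only for illustration, and it assumes each convolution carries a bias term.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights plus biases of a standard k x k convolution."""
    return k * k * c_in * c_out + c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights plus biases of a depthwise k x k convolution
    followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in + c_in      # one k x k filter per input channel
    pointwise = c_in * c_out + c_out     # 1 x 1 convolution mixing channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, not the configuration used in the paper.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard conv:            {std} parameters")
    print(f"depthwise separable conv: {sep} parameters")
    print(f"reduction factor:         {std / sep:.2f}x")
```

For these illustrative sizes the standard layer has 73,856 parameters and the depthwise separable layer has 8,960, roughly an eightfold reduction. The actual saving in the proposed model depends on its layer configuration, which the abstract does not specify.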
