AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks

Paper title

AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks

Authors

Dmitrijs Kass, Ekta Vats

Abstract

This work proposes an attention-based sequence-to-sequence model for handwritten word recognition and explores transfer learning for data-efficient training of HTR systems. To overcome training data scarcity, this work leverages models pre-trained on scene text images as a starting point towards tailoring the handwriting recognition models. ResNet feature extraction and bidirectional LSTM-based sequence modeling stages together form an encoder. The prediction stage consists of a decoder and a content-based attention mechanism. The effectiveness of the proposed end-to-end HTR system has been empirically evaluated on a novel multi-writer dataset Imgur5K and the IAM dataset. The experimental results evaluate the performance of the HTR framework, further supported by an in-depth analysis of the error cases. Source code and pre-trained models are available at https://github.com/dmitrijsk/AttentionHTR.
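
The content-based attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a plain dot-product score as a stand-in for whatever scoring function the model actually uses, and the names `content_attention` and `dot_score` are hypothetical. Encoder states stand for the per-column features produced by the ResNet and BiLSTM stages, and the decoder state is the query for one output character.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_score(h, s):
    """Assumed scoring function: dot product of encoder and decoder states."""
    return sum(a * b for a, b in zip(h, s))


def content_attention(encoder_states, decoder_state, score=dot_score):
    """One decoding step: weight each encoder state by its content match.

    Returns (weights, context), where context is the weighted sum of the
    encoder states that the decoder would consume to predict a character.
    """
    scores = [score(h, decoder_state) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy usage: two encoder positions, the decoder state matches the first.
weights, context = content_attention([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
```

In the toy call, the first encoder position receives the larger weight because its content is closer to the decoder state, which is the behaviour that lets a decoder focus on one region of the word image per predicted character.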
