Scene Text Recognition with Single-Point Decoding Network

Paper Title

Scene Text Recognition with Single-Point Decoding Network

Authors

Chen, Lei; Qin, Haibo; Zhang, Shi-Xue; Yang, Chun; Yin, Xucheng

Abstract

In recent years, attention-based scene text recognition methods have become very popular and have attracted the interest of many researchers. Attention-based methods can adaptively focus attention on a small area or even a single point during decoding, in which case the attention matrix is nearly a one-hot distribution. Furthermore, during inference every attention matrix is used to weight and sum the whole feature map, causing a large amount of redundant computation. In this paper, we propose an efficient attention-free Single-Point Decoding Network (dubbed SPDN) for scene text recognition, which can replace the traditional attention-based decoding network. Specifically, we propose a Single-Point Sampling Module (SPSM) that efficiently samples one key point on the feature map for decoding one character. In this way, our method can not only precisely locate the key point of each character but also remove redundant computations. Based on SPSM, we design an efficient and novel single-point decoding network to replace the attention-based decoding network. Extensive experiments on publicly available benchmarks verify that SPDN can greatly improve decoding efficiency without sacrificing performance.
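The redundancy argument in the abstract can be illustrated with a small NumPy sketch. This is not the authors' SPSM: the function names, the tensor shapes, and the use of bilinear interpolation are assumptions made only for illustration. It shows that when an attention matrix is exactly one-hot, the weighted sum over the whole feature map returns the same vector as reading the feature map at that single point, while touching far fewer positions.

```python
import numpy as np

def attention_glimpse(features, attn):
    """Attention-style decoding input: weighted sum over all H*W positions."""
    # features: (H, W, C); attn: (H, W), non-negative and summing to 1
    return np.tensordot(attn, features, axes=([0, 1], [0, 1]))

def single_point_glimpse(features, y, x):
    """Hypothetical single-point read: bilinear sample at (y, x), at most 4 positions."""
    h, w, _ = features.shape
    y = float(np.clip(y, 0, h - 1))
    x = float(np.clip(x, 0, w - 1))
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * features[y0, x0] + dx * features[y0, x1]
    bottom = (1 - dx) * features[y1, x0] + dx * features[y1, x1]
    return (1 - dy) * top + dy * bottom

# Illustrative shapes only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 32, 16))

# A perfectly one-hot attention matrix focused on position (3, 10).
attn = np.zeros((8, 32))
attn[3, 10] = 1.0

glimpse_attn = attention_glimpse(features, attn)           # reads 8 * 32 positions
glimpse_point = single_point_glimpse(features, 3.0, 10.0)  # reads at most 4 positions
same = bool(np.allclose(glimpse_attn, glimpse_point))
```

In this sketch the point coordinates are supplied by hand; how SPDN predicts the key point for each character is described in the paper itself and is not reproduced here.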
