SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition

Paper title

SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition

Authors

Qiao, Zhi; Zhou, Yu; Yang, Dongbao; Zhou, Yucan; Wang, Weiping

Abstract

Scene text recognition is an active research topic in computer vision. Many recent recognition methods build on the encoder-decoder framework and can handle scene text with perspective distortion and curved shapes. Nevertheless, they still struggle with challenges such as image blur, uneven illumination, and incomplete characters. We argue that most encoder-decoder methods rely on local visual features without explicit global semantic information. In this work, we propose a semantics enhanced encoder-decoder framework to robustly recognize low-quality scene text. The semantic information is used both in the encoder module for supervision and in the decoder module for initialization. In particular, the state-of-the-art ASTER method is integrated into the proposed framework as an exemplar. Extensive experiments demonstrate that the proposed framework is more robust on low-quality text images and achieves state-of-the-art results on several benchmark datasets.
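The two uses of semantic information named in the abstract (supervision on the encoder side, initialization on the decoder side) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the mean pooling, the single linear layers, the cosine-distance loss, the random stand-in for a word embedding, and all names and dimensions are hypothetical.

```python
import math
import random

def linear(x, weights, bias):
    """Apply y = W x + b to one vector x (W is a list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def cosine_loss(pred, target):
    """Return 1 - cos(pred, target); 0 when the vectors are aligned."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = (math.sqrt(sum(p * p for p in pred))
            * math.sqrt(sum(t * t for t in target)))
    return 1.0 - dot / norm

def random_matrix(rows, cols, rng):
    """Small random weights, standing in for learned parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def semantic_branch(encoder_features, w_sem, b_sem):
    """Pool per-step encoder features and predict one global semantic vector.
    Mean pooling is an assumption of this sketch."""
    steps = len(encoder_features)
    dim = len(encoder_features[0])
    pooled = [sum(step[i] for step in encoder_features) / steps
              for i in range(dim)]
    return linear(pooled, w_sem, b_sem)

def init_decoder_state(semantic_vec, w_init, b_init):
    """Map the semantic vector to the decoder's initial hidden state."""
    return [math.tanh(v) for v in linear(semantic_vec, w_init, b_init)]

rng = random.Random(0)
FEAT_DIM, SEM_DIM, HIDDEN_DIM, STEPS = 8, 4, 6, 5  # hypothetical sizes

# Stand-ins for the visual encoder output and a word embedding of the label.
encoder_features = [[rng.uniform(-1, 1) for _ in range(FEAT_DIM)]
                    for _ in range(STEPS)]
word_embedding = [rng.uniform(-1, 1) for _ in range(SEM_DIM)]

w_sem, b_sem = random_matrix(SEM_DIM, FEAT_DIM, rng), [0.0] * SEM_DIM
w_init, b_init = random_matrix(HIDDEN_DIM, SEM_DIM, rng), [0.0] * HIDDEN_DIM

semantic_vec = semantic_branch(encoder_features, w_sem, b_sem)
# Supervision: pull the predicted semantics toward the embedding.
semantic_loss = cosine_loss(semantic_vec, word_embedding)
# Initialization: the decoder starts from the predicted semantics.
decoder_h0 = init_decoder_state(semantic_vec, w_init, b_init)
```

In a trained system the supervision term would presumably be added to the recognition loss, and `decoder_h0` would replace a zero initial state in the attention decoder. The sketch shows only the data flow, not a training loop.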
