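The Surprisingly Straightforward Scene Text Removal Method With Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis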

Paper title

The Surprisingly Straightforward Scene Text Removal Method With Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis

Authors

Lee, Hyeonsu; Choi, Chankyu

Abstract

Scene text removal (STR), a task of erasing text from natural scene images, has recently attracted attention as an important component of editing text or concealing private information such as ID, telephone, and license plate numbers. While there are a variety of different methods for STR actively being researched, it is difficult to evaluate superiority because previously proposed methods do not use the same standardized training/evaluation dataset. We use the same standardized training/testing dataset to evaluate the performance of several previous methods after standardized re-implementation. We also introduce a simple yet extremely effective Gated Attention (GA) and Region-of-Interest Generation (RoIG) methodology in this paper. GA uses attention to focus on the text stroke as well as the textures and colors of the surrounding regions to remove text from the input image much more precisely. RoIG is applied to focus on only the region with text instead of the entire image to train the model more efficiently. Experimental results on the benchmark dataset show that our method significantly outperforms existing state-of-the-art methods in almost all metrics with remarkably higher-quality results. Furthermore, because our model does not generate a text stroke mask explicitly, there is no need for additional refinement steps or sub-models, making our model extremely fast with fewer parameters. The dataset and code are available at https://github.com/naver/garnet.
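The abstract describes RoIG only at a high level: training focuses on the text region rather than the whole image. The sketch below shows one plausible reading of that idea, not the authors' implementation. It assumes a binary text-region mask is available, keeps generated pixels only inside that region, and averages an L1 loss over the masked pixels alone. The function names `composite_with_roi` and `roi_l1_loss`, the single-channel nested-list image layout, and the choice of L1 are all illustrative assumptions.

```python
# Hypothetical illustration of region-restricted generation and loss.
# Images are single-channel nested lists; mask entries are 0 or 1.
# None of these names come from the paper or its repository.

def composite_with_roi(inp, gen, mask):
    """Use generated pixels where mask == 1 and input pixels elsewhere."""
    return [
        [m * g + (1 - m) * x for x, g, m in zip(row_in, row_gen, row_mask)]
        for row_in, row_gen, row_mask in zip(inp, gen, mask)
    ]


def roi_l1_loss(pred, target, mask):
    """Mean absolute error over masked pixels only (0.0 for an empty mask)."""
    total = 0.0
    count = 0
    for row_p, row_t, row_m in zip(pred, target, mask):
        for p, t, m in zip(row_p, row_t, row_m):
            if m:
                total += abs(p - t)
                count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    inp = [[0.2, 0.9], [0.4, 0.8]]      # input image, text in the right column
    gen = [[0.5, 0.1], [0.6, 0.3]]      # raw generator output
    mask = [[0, 1], [0, 1]]             # assumed text-region mask
    target = [[0.2, 0.2], [0.4, 0.2]]   # text-free ground truth

    out = composite_with_roi(inp, gen, mask)
    print(out)
    print(roi_l1_loss(out, target, mask))
```

Under this reading, pixels outside the mask are copied from the input, so they contribute nothing to the loss and the model is penalized only for errors inside text regions.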
