Multi-label classification of promotions in digital leaflets using textual and visual information

Paper title

Multi-label classification of promotions in digital leaflets using textual and visual information

Authors

Roberto Arroyo, David Jiménez-Cabello, Javier Martínez-Cebrián

Abstract

Product descriptions in e-commerce platforms contain detailed and valuable information about retailers' assortments. In particular, coding promotions within digital leaflets is of great interest in e-commerce, as leaflets capture the attention of consumers by showing regular promotions for different products. However, this information is embedded in images, making it difficult to extract and process for downstream tasks. In this paper, we present an end-to-end approach that classifies promotions within digital leaflets into their corresponding product categories using both visual and textual information. Our approach can be divided into three key components: 1) region detection, 2) text recognition and 3) text classification. In many cases, a single promotion refers to multiple product categories, so we introduce a multi-label objective in the classification head. We demonstrate the effectiveness of our approach on two separate tasks: 1) image-based detection of the descriptions for each individual promotion and 2) multi-label classification of the product categories using the text from the product descriptions. We train and evaluate our models using a private dataset composed of images from digital leaflets obtained by Nielsen. Results show that we consistently outperform the proposed baseline by a large margin in all experiments.
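The abstract does not say how the multi-label objective is formulated. A common choice, assumed here purely for illustration, is one independent sigmoid per product category trained with binary cross-entropy, so that a single promotion can be assigned several categories at once. The category names, logits, and the 0.5 threshold below are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical category names, used only for illustration.
CATEGORIES = ["beverages", "snacks", "dairy"]


def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def multilabel_bce(logits, targets):
    """Mean binary cross-entropy over independent per-category sigmoids.

    Each category is treated as its own yes/no decision, which is what
    lets one promotion carry more than one label.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(logits)


def predict_labels(logits, threshold=0.5):
    """Return every category whose probability reaches the threshold."""
    return [name for name, z in zip(CATEGORIES, logits) if sigmoid(z) >= threshold]


# Assumed classifier outputs for one promotion text (made-up numbers).
logits = [2.0, -1.5, 0.3]
targets = [1, 0, 1]  # the promotion covers beverages and dairy

loss = multilabel_bce(logits, targets)
labels = predict_labels(logits)
print(labels, round(loss, 4))
```

Unlike a softmax head, which forces the categories to compete for a single label, the per-category sigmoid lets any subset of categories be active for the same promotion.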
