Paper Title
Multi-Modal Representation Learning with Self-Adaptive Threshold for Commodity Verification
Paper Authors
Paper Abstract
In this paper, we propose a method to identify identical commodities. In e-commerce scenarios, commodities are usually described by both images and text. By definition, identical commodities are those that have identical key attributes and are cognitively identical to consumers. There are two main challenges: 1) the extraction and fusion of multi-modal representations; 2) the ability to verify identical commodities by comparing the similarity between representations against a threshold. To address these problems, we propose an end-to-end multi-modal representation learning method with a self-adaptive threshold. We use a dual-stream network to extract multi-modal commodity embeddings and threshold embeddings separately, then concatenate them to obtain the commodity representation. Our method adaptively adjusts the threshold for different commodities while maintaining the indexability of the commodity representation space. We experimentally validate the advantages of the self-adaptive threshold and the effectiveness of multi-modal representation fusion. In addition, our method achieves third place with an F1 score of 0.8936 on the second task of the CCKS-2022 Knowledge Graph Evaluation for Digital Commerce Competition. Code and pretrained models are available at https://github.com/hanchenchen/CCKS2022-track2-solution.
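The abstract describes verifying a pair by comparing embedding similarity against a per-commodity threshold, while keeping the concatenated representation space indexable. A minimal NumPy sketch of one plausible formulation (the function names, the cosine-similarity choice, and the sign-flip trick are assumptions for illustration, not details given in the abstract): the decision score is cosine similarity of the commodity embeddings minus an inner product of the two threshold embeddings, and negating the threshold half of one concatenated vector turns that score into a single inner product, so standard vector indexes still apply.

```python
import numpy as np


def pair_score(c1, t1, c2, t2):
    """Hypothetical decision score: similarity minus a pair-adaptive threshold."""
    c1 = c1 / np.linalg.norm(c1)  # normalize so the dot product is cosine similarity
    c2 = c2 / np.linalg.norm(c2)
    return float(c1 @ c2 - t1 @ t2)  # threshold term depends on both commodities


def is_identical(c1, t1, c2, t2):
    # Verified as the same commodity when similarity exceeds the adaptive threshold.
    return pair_score(c1, t1, c2, t2) > 0.0


def representation(c, t, as_query=False):
    """Concatenated representation [c; t]; negating the query's threshold half
    makes pair_score a plain inner product, preserving indexability."""
    c = c / np.linalg.norm(c)
    return np.concatenate([c, -t if as_query else t])
```

Under this sketch, `representation(c1, t1) @ representation(c2, t2, as_query=True)` equals `pair_score(c1, t1, c2, t2)`, which is why the concatenated space remains searchable with ordinary inner-product indexes.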