Paper Title
Multimodal E-Commerce Product Classification Using Hierarchical Fusion
Paper Authors
Paper Abstract
In this work, we present a multimodal model for commercial product classification that combines features extracted by multiple neural network models from textual data (CamemBERT and FlauBERT) and visual data (SE-ResNeXt-50) using simple fusion techniques. The proposed method significantly outperformed both the unimodal models and the reported performance of similar models on our specific task. We experimented with multiple fusion techniques and found that the best-performing technique for combining the individual embeddings of the unimodal networks is based on combining concatenation and averaging of the feature vectors. Each modality complemented the shortcomings of the other modalities, demonstrating that increasing the number of modalities can be an effective method for improving the performance of multi-label and multimodal classification problems.
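The abstract does not spell out how concatenation and averaging are combined, so the following is only a minimal sketch of one plausible reading: the unimodal embeddings are concatenated, and their element-wise average is appended to the result. The function names, the toy vectors, and the assumption that all embeddings were already projected to a common dimension are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def concatenate(embeddings: Sequence[Vector]) -> Vector:
    """Join all embeddings end to end into one long feature vector."""
    fused: Vector = []
    for emb in embeddings:
        fused.extend(emb)
    return fused


def average(embeddings: Sequence[Vector]) -> Vector:
    """Element-wise mean; assumes every embedding has the same length."""
    if not embeddings:
        raise ValueError("at least one embedding is required")
    dim = len(embeddings[0])
    if any(len(emb) != dim for emb in embeddings):
        raise ValueError("all embeddings must share one dimension to be averaged")
    count = len(embeddings)
    return [sum(emb[i] for emb in embeddings) / count for i in range(dim)]


def fuse_concat_and_average(embeddings: Sequence[Vector]) -> Vector:
    """Hypothetical fusion: concatenated features followed by averaged features."""
    return concatenate(embeddings) + average(embeddings)


# Toy stand-ins for the three unimodal embeddings (real ones would come from
# CamemBERT, FlauBERT, and SE-ResNeXt-50, projected to a common size).
camembert_emb = [1.0, 2.0]
flaubert_emb = [3.0, 4.0]
image_emb = [5.0, 9.0]

fused = fuse_concat_and_average([camembert_emb, flaubert_emb, image_emb])
print(fused)  # [1.0, 2.0, 3.0, 4.0, 5.0, 9.0, 3.0, 5.0]
```

In this reading, the fused vector keeps each modality's features intact (the concatenated part) while also carrying a shared summary (the averaged part); a classifier head would then be trained on the fused vector.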