Paper title

RecipeSnap -- a lightweight image-to-recipe model

Authors

Chen, Jianfa; Yin, Yue; Xu, Yifan

Abstract

In this paper we address the problem of automatically recognizing photographed dishes and generating the corresponding recipes. Current image-to-recipe models are computationally expensive and require powerful GPUs for model training and implementation. This high computational cost prevents existing models from being deployed on portable devices such as smartphones. To solve this issue we introduce a lightweight image-to-recipe prediction model, RecipeSnap, that reduces memory cost and computational cost by more than 90% while still achieving 2.0 MedR, which is in line with the state-of-the-art model. A pre-trained recipe encoder is used to compute recipe embeddings. Recipes from the Recipe1M dataset and their corresponding recipe embeddings are collected into a recipe library, which is used for image encoder training and, later, for image queries. We use MobileNet-V2 as the image encoder backbone, which makes our model suitable for portable devices. The model can be further developed into a smartphone application with little effort. A comparison of the performance of this lightweight model with that of heavier models is presented in this paper. Code, data and models are publicly accessible on GitHub.
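The image query step and the MedR (median rank) metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not state the similarity measure or the evaluation protocol, so cosine similarity and one true recipe per image are assumptions here. The function names (`normalize`, `rank_recipes`, `median_rank`) and the toy embeddings are hypothetical. In the actual model the image embeddings would come from the MobileNet-V2 image encoder and the library embeddings from the pre-trained recipe encoder.

```python
import math
from statistics import median


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def rank_recipes(image_emb, recipe_library):
    """Return recipe indices sorted from most to least similar to the image.

    Assumption: similarity is cosine similarity between embeddings.
    """
    query = normalize(image_emb)
    scores = [
        sum(q * r for q, r in zip(query, normalize(recipe_emb)))
        for recipe_emb in recipe_library
    ]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def median_rank(image_embs, recipe_library):
    """MedR: median 1-based rank of the true recipe.

    Assumption: image i is paired with recipe i in the library.
    """
    ranks = [
        rank_recipes(image_emb, recipe_library).index(i) + 1
        for i, image_emb in enumerate(image_embs)
    ]
    return median(ranks)


# Toy data (made up for illustration): three recipe and three image embeddings.
recipe_library = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_embs = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.0]]
toy_medr = median_rank(image_embs, recipe_library)
```

A lower MedR is better: a value of 1 would mean the true recipe is the top result for at least half of the query images. Because the recipe embeddings are computed once and stored in the library, only the image encoder has to run at query time.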
