Paper Title
Multi-Modal Fusion by Meta-Initialization
Paper Authors
Paper Abstract
When experience is scarce, models may have insufficient information to adapt to a new task. In this case, auxiliary information, such as a textual description of the task, can enable improved task inference and adaptation. In this work, we propose an extension to the Model-Agnostic Meta-Learning algorithm (MAML) which allows the model to adapt using auxiliary information as well as task experience. Our method, Fusion by Meta-Initialization (FuMI), conditions the model initialization on auxiliary information using a hypernetwork, rather than learning a single, task-agnostic initialization. Furthermore, motivated by the shortcomings of existing multi-modal few-shot learning benchmarks, we constructed iNat-Anim, a large-scale image classification dataset with succinct and visually pertinent textual class descriptions. On iNat-Anim, FuMI significantly outperforms uni-modal baselines such as MAML in the few-shot regime. The code for this project and a dataset exploration tool for iNat-Anim are publicly available at https://github.com/s-a-malik/multi-few.
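The core idea stated in the abstract, conditioning the model initialization on auxiliary information via a hypernetwork instead of learning one task-agnostic initialization, can be illustrated with a minimal sketch. This is not the paper's actual implementation: the dimensions, the two-layer hypernetwork, and the linear classification head are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def hyper_init(text_emb, feat_dim, n_classes, W1, W2):
    """Hypernetwork sketch: map an embedding of the task's auxiliary
    text to the initial weights of a linear classification head.
    The initialization is thus task-conditioned, unlike MAML's single
    shared initialization. W1/W2 are the hypernetwork's own parameters
    (meta-learned in practice; random here for illustration)."""
    h = np.maximum(W1 @ text_emb, 0.0)             # hidden layer with ReLU
    flat = W2 @ h                                  # all head parameters, flattened
    w = flat[: feat_dim * n_classes].reshape(n_classes, feat_dim)
    b = flat[feat_dim * n_classes :]
    return w, b

# Illustrative sizes (not from the paper).
text_dim, hidden, feat_dim, n_classes = 32, 64, 16, 5
W1 = rng.standard_normal((hidden, text_dim)) * 0.1
W2 = rng.standard_normal((feat_dim * n_classes + n_classes, hidden)) * 0.1

text_emb = rng.standard_normal(text_dim)           # e.g. an embedded class description
w0, b0 = hyper_init(text_emb, feat_dim, n_classes, W1, W2)

support = rng.standard_normal((8, feat_dim))       # features of 8 support examples
logits = support @ w0.T + b0                       # head evaluated at its initialization
print(logits.shape)  # (8, 5)
```

From this task-conditioned starting point (`w0`, `b0`), adaptation would proceed with the usual inner-loop gradient steps on the support set, as in MAML.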