Paper Title

Multi-Granularity Modularized Network for Abstract Visual Reasoning

Authors

Xiangru Tang, Haoyuan Wang, Xiang Pan, Jiyang Qi

Abstract

Abstract visual reasoning connects mental abilities to the physical world and is a crucial factor in cognitive development. Most toddlers display sensitivity to this skill, but it remains difficult for machines. To address this gap, we focus on the Raven's Progressive Matrices test, which is designed to measure cognitive reasoning. Recent work has designed black-box models that solve it end to end, but these models are highly complex and difficult to explain. Inspired by cognitive studies, we propose a Multi-Granularity Modularized Network (MMoN) to bridge the gap between the processing of raw sensory information and symbolic reasoning. Specifically, it learns modularized reasoning functions that model the semantic rules from visual grounding in a neuro-symbolic, semi-supervised manner. To evaluate MMoN comprehensively, we conduct experiments on a dataset containing both seen and unseen reasoning rules. The results show that MMoN is well suited to abstract visual reasoning and remains explainable on the generalization test.
