Paper title
HAKE: A Knowledge Engine Foundation for Human Activity Understanding
Authors
Abstract
Human activity understanding is of widespread interest in artificial intelligence and spans diverse applications such as health care and behavior analysis. Despite advances in deep learning, it remains challenging. Solutions modeled on object recognition usually try to map pixels directly to semantics, but activity patterns differ substantially from object patterns, which hinders their success. In this work, we propose a novel paradigm that reformulates the task in two stages: first mapping pixels to an intermediate space spanned by atomic activity primitives, then programming the detected primitives with interpretable logic rules to infer semantics. To provide a representative primitive space, we build a knowledge base that includes more than 26 million primitive labels and logic rules derived from human priors or automatic discovery. Our framework, the Human Activity Knowledge Engine (HAKE), exhibits superior generalization ability and performance over canonical methods on challenging benchmarks. Code and data are available at http://hake-mvig.cn/.
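The second stage described in the abstract, inferring semantics from detected primitives with logic rules, can be illustrated with a toy sketch. This is not HAKE's implementation: the primitive names, the two rules, and the choice of min/max as fuzzy AND/OR are all assumptions made for illustration, and the input scores stand in for the output of a hypothetical first-stage detector.

```python
from typing import Callable, Dict

# Scores in [0, 1] for each primitive, as a hypothetical first-stage
# detector might produce them. All primitive names are illustrative.
PrimitiveScores = Dict[str, float]


def fuzzy_and(*values: float) -> float:
    """Conjunction under min semantics (an assumed choice, not the paper's)."""
    return min(values)


def fuzzy_or(*values: float) -> float:
    """Disjunction under max semantics (an assumed choice, not the paper's)."""
    return max(values)


def score(primitives: PrimitiveScores, name: str) -> float:
    """Look up a primitive score, treating undetected primitives as 0."""
    return primitives.get(name, 0.0)


# Hypothetical, hand-written rules mapping primitives to activities.
RULES: Dict[str, Callable[[PrimitiveScores], float]] = {
    "ride_bicycle": lambda p: fuzzy_and(
        score(p, "hand_hold"),
        fuzzy_or(score(p, "hip_sit_on"), score(p, "feet_step_on")),
        score(p, "object_bicycle"),
    ),
    "drink_from_cup": lambda p: fuzzy_and(
        score(p, "hand_hold"),
        score(p, "head_drink_with"),
        score(p, "object_cup"),
    ),
}


def infer_activities(primitives: PrimitiveScores) -> Dict[str, float]:
    """Apply every rule to the detected primitives and return activity scores."""
    return {activity: rule(primitives) for activity, rule in RULES.items()}


if __name__ == "__main__":
    detected = {
        "hand_hold": 0.9,
        "hip_sit_on": 0.8,
        "feet_step_on": 0.7,
        "object_bicycle": 0.95,
        "head_drink_with": 0.1,
        "object_cup": 0.05,
    }
    for activity, value in infer_activities(detected).items():
        print(f"{activity}: {value:.2f}")
```

Because each activity score is a readable expression over named primitives, it can be traced back to the primitives that produced it, which is the kind of interpretability the two-stage formulation aims for.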