Paper title
Artificial Open World for Evaluating AGI: a Conceptual Design
Paper authors
Abstract
How to evaluate Artificial General Intelligence (AGI) is a critical problem that has long been discussed and remains unsolved. In research on narrow AI, this does not seem to be a severe problem, since researchers in that field focus on specific problems and on one or a few aspects of cognition, and the evaluation criteria are explicitly defined. By contrast, an AGI agent should solve problems that neither the agent nor its developers have encountered before. However, once a developer tests and debugs the agent on a problem, the never-encountered problem becomes an encountered one; as a result, the problem is solved to some extent by the developers, exploiting their own experience, rather than by the agent. This conflict, which we call the trap of developers' experience, makes it hard for such problems to become an acknowledged criterion. In this paper, we propose an evaluation method named Artificial Open World, which aims to escape the trap. The intuition is that most experience from the actual world should not need to be applied in the artificial world, and that the world should be open in some sense, so that developers cannot perceive the world and solve the problems themselves before testing, though they are allowed to inspect all the data afterwards. The world is generated in a way similar to the actual world, and a general form of problems is proposed. A metric is proposed to quantify research progress. This paper describes the conceptual design of the Artificial Open World; the formalization and implementation are left to future work.