Paper Title
Characterizing Datasets for Social Visual Question Answering, and the New TinySocial Dataset
Paper Authors
Paper Abstract
Modern social intelligence includes the ability to watch videos and answer questions about social and theory-of-mind-related content, e.g., for a scene in Harry Potter, "Is the father really upset about the boys flying the car?" Social visual question answering (social VQA) is emerging as a valuable methodology for studying social reasoning in both humans (e.g., children with autism) and AI agents. However, this problem space spans enormous variations in both videos and questions. We discuss methods for creating and characterizing social VQA datasets, including 1) crowdsourcing versus in-house authoring, with sample comparisons of two new datasets that we created (TinySocial-Crowd and TinySocial-InHouse) and the previously existing Social-IQ dataset; 2) a new rubric for characterizing the difficulty and content of a given video; and 3) a new rubric for characterizing question types. We close by describing how well-characterized social VQA datasets will enhance the explainability of AI agents and can also inform assessments and educational interventions for people.