Knowledge Graph Extraction from Videos

Paper title

Knowledge Graph Extraction from Videos

Authors

Louis Mahon, Eleonora Giunchiglia, Bowen Li, Thomas Lukasiewicz

Abstract

Nearly all existing techniques for automated video annotation (or captioning) describe videos using natural language sentences. However, this has several shortcomings: (i) it is very hard to then further use the generated natural language annotations in automated data processing, (ii) generating natural language annotations requires solving the hard subtask of generating semantically precise and syntactically correct natural language sentences, which is actually unrelated to the task of video annotation, (iii) it is difficult to quantitatively measure performance, as standard metrics (e.g., accuracy and F1-score) are inapplicable, and (iv) annotations are language-specific. In this paper, we propose the new task of knowledge graph extraction from videos, i.e., producing a description in the form of a knowledge graph of the contents of a given video. Since no datasets exist for this task, we also include a method to automatically generate them, starting from datasets where videos are annotated with natural language. We then describe an initial deep-learning model for knowledge graph extraction from videos, and report results on MSVD* and MSR-VTT*, two datasets obtained from MSVD and MSR-VTT using our method.
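To illustrate shortcoming (iii), the sketch below shows why a graph-shaped annotation can be scored with standard metrics while a free-form sentence cannot. It is not the authors' method: the abstract does not specify how facts are encoded, so the (subject, predicate, object) triples, the example facts, and the `f1_score` helper are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Assumption: a video annotation is a set of (subject, predicate, object) facts.
Fact = Tuple[str, str, str]


def f1_score(predicted: Set[Fact], reference: Set[Fact]) -> float:
    """F1 between two sets of facts, counting exact matches only."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical facts for one video; not taken from MSVD* or MSR-VTT*.
reference = {("man", "play", "guitar"), ("man", "sit_on", "chair")}
predicted = {
    ("man", "play", "guitar"),
    ("man", "sit_on", "chair"),
    ("man", "hold", "guitar"),
}

score = f1_score(predicted, reference)
print(f"F1 = {score:.2f}")
```

Because both annotations are sets of discrete facts, precision and recall follow directly from set intersection, with no need to compare sentence wording.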
