The VISIONE Video Search System: Exploiting Off-the-Shelf Text Search Engines for Large-Scale Video Retrieval

Paper title

The VISIONE Video Search System: Exploiting Off-the-Shelf Text Search Engines for Large-Scale Video Retrieval

Authors

Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Franca Debole, Fabrizio Falchi, Claudio Gennaro, Lucia Vadicamo, Claudio Vairo

Abstract

In this paper, we describe in detail VISIONE, a video search system that lets users search for videos by textual keywords, by the occurrence of objects and their spatial relationships, by the occurrence of colors and their spatial relationships, and by image similarity. These modalities can be combined to express complex queries and satisfy user needs. The distinctive feature of our approach is that all the information extracted from the keyframes, such as visual deep features, tags, colors, and object locations, is encoded using a convenient textual encoding and indexed in a single text retrieval engine. This offers great flexibility when results corresponding to different parts of the query (visual, text, and locations) have to be merged. In addition, we report an extensive analysis of the system's retrieval performance, using the query logs generated during the Video Browser Showdown (VBS) 2019 competition. This allowed us to fine-tune the system by choosing the best parameters and strategies among those we tested.
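
The abstract does not spell out the textual encoding itself, so the following is only a minimal sketch of the general idea, not the authors' actual scheme. It assumes that a feature vector can be approximated by repeating a synthetic term in proportion to each component's value, and that object locations can be expressed as one token per overlapped grid cell. All function names, the token formats (`f3`, `r0c1_dog`), the `scale` factor, and the grid size are hypothetical choices made for illustration.

```python
def encode_features(vector, scale=4, prefix="f"):
    """Turn a non-negative feature vector into a list of synthetic words.

    Assumption: component i with value v yields the token f"{prefix}{i}"
    repeated int(v * scale) times, so term frequency approximates v.
    """
    tokens = []
    for i, value in enumerate(vector):
        repeats = int(max(value, 0.0) * scale)  # negatives are clipped to 0
        tokens.extend([f"{prefix}{i}"] * repeats)
    return tokens


def encode_object_boxes(boxes, rows=7, cols=7):
    """Emit one token per grid cell that a labelled bounding box overlaps.

    boxes: iterable of (label, x0, y0, x1, y1) in normalized [0, 1] coordinates.
    The 7x7 default grid is an arbitrary choice for this sketch.
    """
    tokens = []
    for label, x0, y0, x1, y1 in boxes:
        for r in range(rows):
            for c in range(cols):
                overlaps_x = x0 < (c + 1) / cols and x1 > c / cols
                overlaps_y = y0 < (r + 1) / rows and y1 > r / rows
                if overlaps_x and overlaps_y:
                    tokens.append(f"r{r}c{c}_{label}")
    return tokens


def keyframe_document(tags, vector, boxes, rows=7, cols=7):
    """Concatenate all encodings into one whitespace-separated text document."""
    parts = list(tags)
    parts += encode_features(vector)
    parts += encode_object_boxes(boxes, rows, cols)
    return " ".join(parts)


doc = keyframe_document(
    tags=["dog", "park"],
    vector=[0.5, 0.0, 0.25],
    boxes=[("dog", 0.0, 0.0, 0.5, 0.5)],
    rows=2,
    cols=2,
)
print(doc)
```

A document built this way could be indexed by any full-text engine with a whitespace tokenizer, and a query could be encoded with the same functions, so that keyword, similarity, and location constraints all become ordinary term matches whose scores the engine can combine.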
