Paper Title
Deep Multimodality Learning for UAV Video Aesthetic Quality Assessment
Authors
Abstract
Despite the growing number of unmanned aerial vehicles (UAVs) and aerial videos, few studies focus on the aesthetics of aerial videos, even though such studies can provide valuable information for improving the aesthetic quality of aerial photography. In this article, we present a deep multimodality learning method for UAV video aesthetic quality assessment. More specifically, a multistream framework is designed to exploit aesthetic attributes from multiple modalities, including spatial appearance, drone camera motion, and scene structure. A novel, specially designed motion stream network is proposed for this multistream framework. We construct a dataset of 6,000 UAV video shots captured by drone cameras. Our model can judge whether a UAV video was shot by a professional photographer or an amateur, and it also classifies the scene type. The experimental results show that our method outperforms video classification methods and traditional SVM-based methods for video aesthetics. In addition, we present three application examples using the proposed method: UAV video grading, professional segment detection, and aesthetic-based UAV path planning.
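The multistream idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the abstract does not say how the streams are fused, so simple feature concatenation is assumed here. The feature sizes, the number of scene types, the random weights, and every function name are hypothetical placeholders. The sketch only shows how features from three modalities (spatial appearance, camera motion, scene structure) might feed two outputs: professional versus amateur, and scene type.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """Apply one fully connected layer: weights @ features + bias."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def random_layer(rng, n_out, n_in):
    """Build an untrained layer (placeholder for learned parameters)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_classify(appearance, motion, structure, aesthetic_head, scene_head):
    """Concatenate per-stream features (an assumed fusion rule) and
    run two classification heads on the fused vector."""
    fused = list(appearance) + list(motion) + list(structure)
    aesthetic_probs = softmax(linear(*aesthetic_head, fused))
    scene_probs = softmax(linear(*scene_head, fused))
    return aesthetic_probs, scene_probs


rng = random.Random(0)

# Hypothetical per-stream feature vectors; in practice each would come
# from a separate network applied to the video shot.
appearance = [rng.random() for _ in range(8)]
motion = [rng.random() for _ in range(4)]
structure = [rng.random() for _ in range(4)]

NUM_SCENE_TYPES = 5  # hypothetical; the abstract does not give the count
FUSED_SIZE = len(appearance) + len(motion) + len(structure)

aesthetic_head = random_layer(rng, 2, FUSED_SIZE)  # professional vs. amateur
scene_head = random_layer(rng, NUM_SCENE_TYPES, FUSED_SIZE)

aesthetic_probs, scene_probs = fuse_and_classify(
    appearance, motion, structure, aesthetic_head, scene_head)
print("professional/amateur probabilities:", aesthetic_probs)
print("scene type probabilities:", scene_probs)
```

Because the weights are random, the printed probabilities carry no meaning. The point is the data flow: three independent feature streams, one fused representation, two task outputs.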