AutoLV: Automatic Lecture Video Generator

Paper title

AutoLV: Automatic Lecture Video Generator

Authors

Wang, Wenbin; Song, Yang; Jha, Sanjay

Abstract

We propose an end-to-end lecture video generation system that can generate realistic and complete lecture videos directly from annotated slides, the instructor's reference voice, and the instructor's reference portrait video. Our system is primarily composed of a speech synthesis module with few-shot speaker adaptation and an adversarial learning-based talking-head generation module. It is capable of not only reducing instructors' workload but also changing the language and accent, which can help students follow the lecture more easily and enable a wider dissemination of lecture contents. Our experimental results show that the proposed model outperforms other current approaches in terms of authenticity, naturalness and accuracy. Here is a video demonstration of how our system works, and the outcomes of the evaluation and comparison: https://youtu.be/cY6TYkI0cog.
