textless-lib: a Library for Textless Spoken Language Processing

Paper title

textless-lib: a Library for Textless Spoken Language Processing

Authors

Eugene Kharitonov, Jade Copet, Kushal Lakhotia, Tu Anh Nguyen, Paden Tomasello, Ann Lee, Ali Elkahky, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi

Abstract

Textless spoken language processing research aims to extend the applicability of the standard NLP toolset to spoken language and to languages with few or no textual resources. In this paper, we introduce textless-lib, a PyTorch-based library aimed at facilitating research in this area. We describe the building blocks that the library provides and demonstrate its usability by discussing three use-case examples: (i) speaker probing, (ii) speech resynthesis and compression, and (iii) speech continuation. We believe that textless-lib substantially simplifies research in the textless setting and will be useful not only for speech researchers but also for the NLP community at large. The code, documentation, and pre-trained models are available at https://github.com/facebookresearch/textlesslib/ .
