Deep Sound Change: Deep and Iterative Learning, Convolutional Neural Networks, and Language Change

Paper Title

Deep Sound Change: Deep and Iterative Learning, Convolutional Neural Networks, and Language Change

Author

Beguš, Gašper

Abstract

This paper proposes a framework for modeling sound change that combines deep learning and iterative learning. Acquisition and transmission of speech are modeled by training generations of Generative Adversarial Networks (GANs) on unannotated raw speech data. The paper argues that several properties of sound change emerge from the proposed architecture. GANs (Goodfellow et al. 2014 arXiv:1406.2661, Donahue et al. 2019 arXiv:1705.07904) are uniquely appropriate for modeling language change because the networks are trained on raw unsupervised acoustic data, contain no language-specific features and, as argued in Beguš (2020 arXiv:2006.03965), encode phonetic and phonological representations in their latent space and generate linguistically informative innovative data. The first generation of networks is trained on the relevant sequences in human speech from TIMIT. The subsequent generations are not trained on TIMIT, but on generated outputs from the previous generation and thus start learning from each other in an iterative learning task. The initial allophonic distribution is progressively lost with each generation, likely due to pressures from the global distribution of aspiration in the training data. The networks show signs of a gradual shift in phonetic targets characteristic of a gradual phonetic sound change. At endpoints, the outputs superficially resemble a phonological change: rule loss.
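
The iterative learning setup described in the abstract (the first generation trained on human data, each later generation trained only on the previous generation's outputs) can be sketched as a simple data-flow loop. This is a toy illustration, not the paper's implementation: the learner here is a one-dimensional Gaussian fit standing in for a GAN, the seed data are synthetic numbers rather than TIMIT audio, and the names `train`, `generate`, and `iterated_learning` are hypothetical.

```python
import random
import statistics


def train(data):
    """Stand-in learner (assumption): fit a Gaussian to 1-D data.

    The paper trains GANs on raw audio; this toy model only shows the
    train-then-generate data flow between generations.
    """
    return statistics.fmean(data), statistics.stdev(data)


def generate(model, n, rng):
    """Sample n outputs from a trained stand-in model."""
    mean, sd = model
    return [rng.gauss(mean, sd) for _ in range(n)]


def iterated_learning(seed_data, n_generations, n_samples, rng):
    """Train a chain of learners.

    Generation 0 sees the seed ("human") data. Every later generation
    sees only the outputs generated by the generation before it.
    """
    models = []
    data = seed_data
    for _ in range(n_generations):
        model = train(data)
        models.append(model)
        # The next generation never sees seed_data, only generated outputs.
        data = generate(model, n_samples, rng)
    return models


rng = random.Random(0)
# Hypothetical seed data: synthetic values, not real speech measurements.
seed_data = [rng.gauss(50.0, 10.0) for _ in range(200)]
models = iterated_learning(seed_data, n_generations=5, n_samples=200, rng=rng)
```

Because each generation re-estimates its parameters from a finite sample of the previous generation's outputs, the fitted values can drift across the chain. That is only a loose analogy for the gradual shift in phonetic targets the abstract reports, not a reproduction of it.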
