Paper Title

Developing Universal Dependency Treebanks for Magahi and Braj

Authors

Mohit Raj, Shyam Ratan, Deepak Alok, Ritesh Kumar, Atul Kr. Ojha

Abstract

In this paper, we discuss the development of treebanks for two low-resource Indian languages, Magahi and Braj, based on the Universal Dependencies framework. The Magahi treebank contains 945 sentences and the Braj treebank around 500 sentences, annotated with lemmas, part-of-speech tags, morphological features, and universal dependency relations. The paper describes the different dependency relations found in the two languages and gives some statistics for the two treebanks. The dataset will be made publicly available in the Universal Dependencies (UD) repository (https://github.com/UniversalDependencies/UD_Magahi-MGTB/tree/master) in the next (v2.10) release.
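UD treebanks are distributed in the CoNLL-U format, so statistics of the kind the abstract mentions (sentence counts, part-of-speech and dependency-relation frequencies) can be computed with a few lines of code. The following is a minimal sketch, not the authors' tooling: the function name `treebank_stats` is hypothetical, and the sample sentences are invented English placeholders rather than data from the Magahi or Braj treebanks.

```python
from collections import Counter

# Invented two-sentence sample in CoNLL-U layout (10 tab-separated columns).
# It is NOT taken from the Magahi or Braj treebanks.
SAMPLE = """\
# sent_id = demo-1
# text = Dogs bark
1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_
2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_

# sent_id = demo-2
# text = She reads books .
1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\t_
2\treads\tread\tVERB\t_\t_\t0\troot\t_\t_
3\tbooks\tbook\tNOUN\t_\tNumber=Plur\t2\tobj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def treebank_stats(conllu_text):
    """Count sentences, tokens, UPOS tags, and dependency relations."""
    sentences = 0
    tokens = 0
    upos = Counter()
    deprel = Counter()
    in_sentence = False

    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comment such as sent_id or text.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            # Skip anything that is not a well-formed token line.
            continue
        token_id = cols[0]
        if "-" in token_id or "." in token_id:
            # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        tokens += 1
        upos[cols[3]] += 1    # column 4: universal part-of-speech tag
        deprel[cols[7]] += 1  # column 8: dependency relation to the head

    return {"sentences": sentences, "tokens": tokens,
            "upos": upos, "deprel": deprel}


if __name__ == "__main__":
    stats = treebank_stats(SAMPLE)
    print(stats["sentences"], "sentences,", stats["tokens"], "tokens")
    print(stats["deprel"].most_common())
```

To apply the same function to a released treebank, one would read the contents of a `.conllu` file from the repository and pass the text to `treebank_stats`; the exact file names in the release are not given in the abstract.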
