Paper Title

Utilizing coarse-grained data in low-data settings for event extraction

Author

Mutlu, Osman

Abstract

Annotating text data for event information extraction systems is hard, expensive, and error-prone. We investigate the feasibility of integrating coarse-grained data (document or sentence labels), which is far more feasible to obtain, instead of annotating more documents. We utilize a multi-task model with two auxiliary tasks, document and sentence binary classification, in addition to the main task of token classification. We perform a series of experiments with varying data regimes for the aforementioned integration. Results show that while introducing extra coarse-grained data offers greater improvement and robustness, a gain is still possible with only the addition of negative documents that have no information on any event.
