Paper Title

Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks

Authors

Tristan Thrush, Kushal Tirumala, Anmol Gupta, Max Bartolo, Pedro Rodriguez, Tariq Kane, William Gaviria Rojas, Peter Mattson, Adina Williams, Douwe Kiela

Abstract


We introduce Dynatask: an open-source system for setting up custom NLP tasks that aims to greatly lower the technical knowledge and effort required for hosting and evaluating state-of-the-art NLP models, as well as for conducting model-in-the-loop data collection with crowdworkers. Dynatask is integrated with Dynabench, a research platform for rethinking benchmarking in AI that facilitates human- and model-in-the-loop data collection and evaluation. To create a task, users only need to write a short task configuration file from which the relevant web interfaces and model hosting infrastructure are automatically generated. The system is available at https://dynabench.org/ and the full library can be found at https://github.com/facebookresearch/dynabench.
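
To make the configuration-file workflow concrete, here is a minimal sketch of what such a task configuration might look like for a hypothetical sentiment-classification task. All field names and values below are illustrative assumptions, not the actual Dynatask schema; consult the GitHub repository above for the real format.

```json
{
  "_comment": "Hypothetical sketch only; field names are assumptions, not the real Dynatask schema.",
  "task_name": "sentiment-analysis-demo",
  "annotation_config": {
    "context": [{ "name": "passage", "type": "string" }],
    "input": [{ "name": "statement", "type": "string" }],
    "output": [
      {
        "name": "label",
        "type": "multiclass",
        "labels": ["positive", "negative", "neutral"]
      }
    ],
    "metrics": ["accuracy"]
  },
  "model_in_the_loop": true
}
```

In a sketch like this, a declarative spec would be enough for the platform to render an annotation interface (the context and input fields) and to evaluate hosted models (the output labels and metrics), consistent with the paper's claim that task owners need not write any task-specific frontend or model-hosting code.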
