DataPerf: Benchmarks for Data-Centric AI Development

Paper Title

DataPerf: Benchmarks for Data-Centric AI Development

Authors

Mark Mazumder, Colby Banbury, Xiaozhe Yao, Bojan Karlaš, William Gaviria Rojas, Sudnya Diamos, Greg Diamos, Lynn He, Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Douwe Kiela, David Jurado, David Kanter, Rafael Mosquera, Juan Ciro, Lora Aroyo, Bilge Acun, Lingjiao Chen, Mehul Smriti Raje, Max Bartolo, Sabri Eyuboglu, Amirata Ghorbani, Emmett Goodman, Oana Inel, Tariq Kane, Christine R. Kirkpatrick, Tzu-Sheng Kuo, Jonas Mueller, Tristan Thrush, Joaquin Vanschoren, Margaret Warren, Adina Williams, Serena Yeung, Newsha Ardalani, Praveen Paritosh, Lilith Bat-Leah, Ce Zhang, James Zou, Carole-Jean Wu, Cody Coleman, Andrew Ng, Peter Mattson, Vijay Janapa Reddi

Abstract

Machine learning research has long focused on models rather than datasets, and prominent datasets are used for common ML tasks without regard to the breadth, difficulty, and faithfulness of the underlying problems. Neglecting the fundamental importance of data has given rise to inaccuracy, bias, and fragility in real-world applications, and research is hindered by saturation across existing dataset benchmarks. In response, we present DataPerf, a community-led benchmark suite for evaluating ML datasets and data-centric algorithms. We aim to foster innovation in data-centric AI through competition, comparability, and reproducibility. We enable the ML community to iterate on datasets, instead of just architectures, and we provide an open, online platform with multiple rounds of challenges to support this iterative development. The first iteration of DataPerf contains five benchmarks covering a wide spectrum of data-centric techniques, tasks, and modalities in vision, speech, acquisition, debugging, and diffusion prompting, and we support hosting new contributed benchmarks from the community. The benchmarks, online evaluation platform, and baseline implementations are open source, and the MLCommons Association will maintain DataPerf to ensure long-term benefits to academia and industry.
