PRBoost: Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning

Paper Title

PRBoost: Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning

Authors

Rongzhi Zhang, Yue Yu, Pranav Shetty, Le Song, Chao Zhang

Abstract

Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity on many NLP tasks, but manually designing a comprehensive, high-quality labeling rule set is tedious and difficult. We study interactive weakly-supervised learning: the problem of iteratively and automatically discovering novel labeling rules from data to improve the WSL model. Our proposed model, named PRBoost, achieves this goal via iterative prompt-based rule discovery and model boosting. It uses boosting to identify large-error instances and then discovers candidate rules from them by prompting pre-trained LMs with rule templates. The candidate rules are judged by human experts, and the accepted rules are used to generate complementary weak labels and strengthen the current model. Experiments on four tasks show that PRBoost outperforms state-of-the-art WSL baselines by up to 7.1% and narrows the gap with fully supervised models. Our implementation is available at https://github.com/rz-zhang/PRBoost.
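The loop described in the abstract (reweight instances by error, pick the hardest ones, turn them into prompts) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the binary AdaBoost-style weight update, and the prompt template are all assumptions made here for illustration, and the abstract does not specify any of them.

```python
import math


def boosting_round(weights, y_true, y_pred):
    """One AdaBoost-style reweighting step (assumed update rule, 0/1 loss)."""
    # 1 where the current model is wrong, 0 where it is right.
    miss = [int(t != p) for t, p in zip(y_true, y_pred)]
    total = sum(weights)
    # Weighted error of the current model, clipped away from 0 and 1.
    err = sum(w * m for w, m in zip(weights, miss)) / total
    err = min(max(err, 1e-10), 1 - 1e-10)
    # Model coefficient: larger when the weighted error is smaller.
    alpha = math.log((1 - err) / err)
    # Up-weight misclassified instances, then renormalize to sum to 1.
    new_w = [w * math.exp(alpha * m) for w, m in zip(weights, miss)]
    norm = sum(new_w)
    return [w / norm for w in new_w], alpha


def top_k_large_error(weights, k):
    """Indices of the k instances with the largest boosting weights."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]


def build_prompt(text, template="{text} It was [MASK]."):
    """Fill a hypothetical rule template; a pre-trained LM would predict [MASK]."""
    return template.format(text=text)


if __name__ == "__main__":
    texts = ["Great movie.", "Dull plot.", "Not bad at all.", "Boring."]
    y_true = [1, 0, 1, 0]
    y_pred = [1, 0, 0, 0]  # the third instance is misclassified
    weights, alpha = boosting_round([0.25] * 4, y_true, y_pred)
    for i in top_k_large_error(weights, k=1):
        print(build_prompt(texts[i]))
```

In this sketch, the words a language model proposes for the mask position would become candidate rules shown to a human expert; that review step and the weak-label generation that follows it are left out.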
