Paper title
Position-based Prompting for Health Outcome Generation
Paper authors
Paper abstract
Probing pre-trained language models (PLMs) with prompts has indirectly implied that language models (LMs) can be treated as knowledge bases. This approach has been effective, especially when the LMs are fine-tuned not just on data from a specific domain but also on the style or linguistic pattern of the prompts themselves. We observe that satisfying a particular linguistic pattern in prompts is an unsustainable constraint that unnecessarily lengthens the probing task, especially because prompts are often manually designed and the range of possible prompt template patterns can vary depending on the prompting objective and domain. We therefore explore the idea of using a position-attention mechanism to capture the positional information of each word in a prompt relative to the mask to be filled, hence avoiding the need to reconstruct prompts when the prompt's linguistic pattern changes. Using our approach, we demonstrate the ability to elicit answers to rare prompt templates (in a case study on health outcome generation) such as Postfix and Mixed patterns, whose missing information is at the start of the prompt and in multiple random places of the prompt, respectively. Moreover, using various biomedical PLMs, our approach consistently outperforms a baseline in which the default masked language model (MLM) representation is used to predict masked tokens.
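To make the idea of position relative to the mask concrete, here is a toy sketch. It is not the authors' implementation: the function names, the example prompts, and the fixed distance-decay scoring are all assumptions made for illustration. A trained model would presumably learn its attention scores from position embeddings rather than use a hand-set decay.

```python
import math

MASK = "[MASK]"  # assumed mask token; real tokenizers may use a different string


def mask_indices(tokens, mask_token=MASK):
    """Return the index of every mask token in the prompt."""
    return [i for i, tok in enumerate(tokens) if tok == mask_token]


def relative_positions(tokens, mask_index):
    """Signed offset of each token from the mask at mask_index."""
    return [i - mask_index for i in range(len(tokens))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def position_attention(hidden, offsets, decay=0.5):
    """Pool token vectors with weights that depend only on distance to the mask.

    Assumption: score = -decay * |offset|. This stands in for scores that a
    real model would learn from relative-position embeddings.
    """
    weights = softmax([-decay * abs(o) for o in offsets])
    dim = len(hidden[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, hidden)) for d in range(dim)]
    return weights, pooled


# Hypothetical prompts: the mask is at the end (prefix), at the start (postfix),
# or in several places (mixed).
prefix = "treatment reduced the risk of [MASK]".split()
postfix = "[MASK] was reduced after treatment".split()
mixed = "[MASK] improved in patients with [MASK]".split()

prefix_offsets = relative_positions(prefix, mask_indices(prefix)[0])
postfix_offsets = relative_positions(postfix, mask_indices(postfix)[0])
# One offset vector per mask for the mixed pattern.
mixed_offsets = [relative_positions(mixed, m) for m in mask_indices(mixed)]
```

Because the offsets are computed from wherever the mask happens to sit, the same code handles all three template patterns without rewriting the prompt, which is the property the abstract emphasizes.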