Paper Title
Context-Aware Abbreviation Expansion Using Large Language Models
Paper Authors
Paper Abstract
Motivated by the need for accelerating text entry in augmentative and alternative communication (AAC) for people with severe motor impairments, we propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters. Our approach is to expand the abbreviations into full-phrase options by leveraging conversation context with the power of pretrained large language models (LLMs). Through zero-shot, few-shot, and fine-tuning experiments on four public conversation datasets, we show that for replies to the initial turn of a dialog, an LLM with 64B parameters is able to exactly expand over 70% of phrases with abbreviation length up to 10, leading to an effective keystroke saving rate of up to about 77% on these exact expansions. Including a small amount of context in the form of a single conversation turn more than doubles abbreviation expansion accuracies compared to having no context, an effect that is more pronounced for longer phrases. Additionally, the robustness of models against typo noise can be enhanced through fine-tuning on noisy data.
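The abbreviation scheme and the keystroke-saving metric described in the abstract can be sketched as follows. This is an illustrative assumption, not the paper's exact implementation: the abbreviation function takes only word-initial letters, and keystroke saving rate (KSR) is approximated here as the fraction of characters saved; the paper's precise KSR definition may account for selection keystrokes and other costs.

```python
def abbreviate(phrase: str) -> str:
    """Word-initial-letter abbreviation, e.g. 'good morning' -> 'gm'.

    A minimal sketch of the aggressive abbreviation paradigm: each
    word contributes only its first letter.
    """
    return "".join(word[0] for word in phrase.lower().split())


def keystroke_saving_rate(phrase: str, abbreviation: str) -> float:
    """Approximate fraction of keystrokes saved by typing the
    abbreviation instead of the full phrase (hypothetical simplification
    of the KSR metric reported in the abstract)."""
    return 1.0 - len(abbreviation) / len(phrase)


phrase = "see you tomorrow then"
abbr = abbreviate(phrase)
print(abbr)  # "sytt" — 4 keystrokes instead of 21
print(f"{keystroke_saving_rate(phrase, abbr):.2f}")  # 0.81
```

Under this simplification, an LLM that exactly expands the abbreviation back to the intended phrase realizes the full saving; failed expansions would require fallback typing, which is why the abstract reports KSR only over exact expansions.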