Paper Title

Knowledge-Based Learning through Feature Generation

Authors

Michal Badian, Shaul Markovitch

Abstract

Machine learning algorithms have difficulty generalizing from a small set of examples. Humans can perform such a task by exploiting the vast amount of background knowledge they possess. One method for enhancing learning algorithms with external knowledge is feature generation. In this paper, we introduce a new algorithm for generating features based on a collection of auxiliary datasets. We assume that, in addition to the training set, we have access to additional datasets. Unlike the transfer learning setup, we do not assume that the auxiliary datasets represent learning tasks that are similar to our original one. The algorithm finds features that are common to the training set and the auxiliary datasets. Based on these features and examples from the auxiliary datasets, it induces predictors for new features from the auxiliary datasets. The induced predictors are then added to the original training set as generated features. Our method was tested on a variety of learning tasks, including text classification and medical prediction, and showed a significant improvement over using just the given features.
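The three steps described in the abstract (find common features, induce predictors for the remaining auxiliary features, add the predictions as new features) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the dict-based dataset layout, the 1-nearest-neighbour learner, and the toy data are all assumptions made for the example.

```python
from math import dist


def common_features(train, aux):
    """Names of features present in both the training set and an auxiliary dataset."""
    return sorted(set(train[0]) & set(aux[0]))


def induce_predictor(aux, inputs, target):
    """Return a predictor of `target` from `inputs`, learned on auxiliary examples.

    1-nearest-neighbour is an assumed stand-in learner, chosen only to keep
    the sketch dependency-free; the abstract does not name a learner.
    """
    points = [([row[f] for f in inputs], row[target]) for row in aux]

    def predict(example):
        query = [example[f] for f in inputs]
        _, value = min(points, key=lambda p: dist(p[0], query))
        return value

    return predict


def generate_features(train, aux_datasets):
    """Extend each training example with predicted auxiliary features.

    `train` holds feature dicts only; class labels are assumed to be kept
    separately so they are never treated as common features.
    """
    extended = [dict(row) for row in train]
    for name, aux in aux_datasets.items():
        shared = common_features(train, aux)
        if not shared:
            continue  # nothing links this auxiliary dataset to the training set
        for target in sorted(set(aux[0]) - set(shared)):
            predictor = induce_predictor(aux, shared, target)
            for row, original in zip(extended, train):
                row[f"{name}.{target}"] = predictor(original)
    return extended


# Made-up toy data, for illustration only.
train = [{"age": 30, "bmi": 22.0}, {"age": 65, "bmi": 31.0}]
aux_datasets = {
    "clinic": [
        {"age": 28, "bmi": 21.0, "smoker": 0},
        {"age": 70, "bmi": 30.0, "smoker": 1},
    ]
}
extended = generate_features(train, aux_datasets)
```

Here each training example gains a `clinic.smoker` feature predicted from `age` and `bmi`, even though smoking status was never recorded in the training set. The extended examples would then be passed to whatever learner handles the original task.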
