Paper Title

FiNCAT: Financial Numeral Claim Analysis Tool

Authors

Sohom Ghosh, Sudip Kumar Naskar

Abstract

When making investment decisions based on financial documents, investors need to differentiate between in-claim and out-of-claim numerals. In this paper, we present a tool that does this automatically. It extracts context embeddings of the numerals using BERT, a transformer-based pre-trained language model. It then uses a Logistic Regression based model to detect whether each numeral is in-claim or out-of-claim. We use the FinNum-3 (English) dataset to train our model. After conducting rigorous experiments, we achieve a Macro F1 score of 0.8223 on the validation set. We have open-sourced this tool, and it can be accessed from https://github.com/sohomghosh/FiNCAT_Financial_Numeral_Claim_Analysis_Tool
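The abstract reports its result as a Macro F1 score, which is the unweighted mean of the per-class F1 scores, so the in-claim and out-of-claim classes count equally regardless of how often each occurs. The following is a minimal sketch of that computation in plain Python. It is not the authors' evaluation code; the function name `macro_f1`, the label strings, and the toy labels are illustrative assumptions and are not taken from FinNum-3.

```python
def macro_f1(y_true, y_pred, labels=("in-claim", "out-of-claim")):
    """Return the unweighted mean of per-class F1 scores.

    The label strings are hypothetical placeholders chosen for this sketch.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    per_class_f1 = []
    for label in labels:
        # Count true positives, false positives, and false negatives
        # while treating `label` as the positive class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)

        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class
        # appears in neither the gold labels nor the predictions.
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)

    return sum(per_class_f1) / len(per_class_f1)


# Toy data (made up for illustration): 3 in-claim and 5 out-of-claim numerals.
gold = ["in-claim"] * 3 + ["out-of-claim"] * 5
pred = ["in-claim", "in-claim", "out-of-claim",
        "out-of-claim", "out-of-claim", "out-of-claim", "out-of-claim", "in-claim"]

score = macro_f1(gold, pred)
print(f"Macro F1 on toy data: {score:.4f}")
```

On this toy data the in-claim F1 is 2/3 and the out-of-claim F1 is 0.8, so the macro average is about 0.7333 even though the out-of-claim class is more frequent.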
