Paper Title

Online Similarity Learning with Feedback for Invoice Line Item Matching

Authors

Maurya, Chandresh Kumar; Gantayat, Neelamadhav; Dechu, Sampath; Horvath, Tomas

Abstract

The procure-to-pay (P2P) process in large enterprises is a back-end business process that deals with the procurement of products and services for enterprise operations. Procurement is done by issuing purchase orders to empaneled vendors, and invoices submitted by vendors are paid after they go through a rigorous validation process. Agents orchestrating the P2P process often face the problem of matching product or service descriptions in the invoice to those in the purchase order and verifying whether the ordered items are what has been supplied or serviced. For example, the descriptions in the invoice and purchase order could be "TRES 739mL CD KER Smooth" and "TRES 0.739L CD KER Smth", which look different at the word level but refer to the same item. In a typical P2P process, agents are asked to manually select the similar products before invoices are posted for payment. This step in the business process is manual, repetitive, cumbersome, and costly. Since the descriptions are not well-formed sentences, existing semantic and syntactic text similarity approaches cannot be applied directly. In this paper, we present two approaches that solve the above problem using the different types of recorded agent feedback that are available. If the agent's feedback takes the form of a relative ranking between descriptions, we use a similarity ranking algorithm. If the agent's feedback is absolute, such as match or no-match, we use a classification similarity algorithm. We also discuss the threats to the validity of our approach and present a possible remedy that makes use of a product taxonomy and catalog. We showcase the comparative effectiveness and efficiency of the proposed approaches on many benchmark and real-world data sets.
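To make the two feedback modes concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a linear similarity score over three hand-picked pair features (character-trigram Jaccard, token Jaccard, and a bias term) and perceptron-style online updates. The class name `OnlineSimilarity`, the features, the learning rate, the margin, and the non-matching item "ACME 500mL SHAMPOO" are all hypothetical choices made for illustration.

```python
def char_ngrams(text, n=3):
    """Set of lowercase character n-grams; robust to abbreviations like 'Smth'."""
    s = text.lower()
    return {s[i:i + n] for i in range(max(len(s) - n + 1, 1))}


def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(x, y):
    """Hypothetical feature vector for a pair of item descriptions."""
    return [
        jaccard(char_ngrams(x), char_ngrams(y)),                  # character-level overlap
        jaccard(set(x.lower().split()), set(y.lower().split())),  # word-level overlap
        1.0,                                                      # bias term
    ]


class OnlineSimilarity:
    """Linear similarity model updated one feedback event at a time (illustrative)."""

    def __init__(self, lr=0.5, margin=1.0):
        self.w = [0.0, 0.0, 0.0]
        self.lr = lr
        self.margin = margin

    def score(self, x, y):
        return sum(wi * fi for wi, fi in zip(self.w, pair_features(x, y)))

    def _step(self, direction):
        self.w = [wi + self.lr * di for wi, di in zip(self.w, direction)]

    def update_absolute(self, x, y, label):
        """Absolute feedback: label is +1 (match) or -1 (no match)."""
        if label * self.score(x, y) < self.margin:
            self._step([label * fi for fi in pair_features(x, y)])

    def update_relative(self, query, better, worse):
        """Relative feedback: the agent ranked `better` above `worse` for `query`."""
        if self.score(query, better) - self.score(query, worse) < self.margin:
            fb = pair_features(query, better)
            fw = pair_features(query, worse)
            self._step([b - w for b, w in zip(fb, fw)])


invoice = "TRES 739mL CD KER Smooth"
po_match = "TRES 0.739L CD KER Smth"
po_other = "ACME 500mL SHAMPOO"  # hypothetical non-matching purchase-order line

# Relative feedback: one ranking judgment from an agent.
ranker = OnlineSimilarity()
ranker.update_relative(invoice, po_match, po_other)

# Absolute feedback: one match and one no-match judgment.
classifier = OnlineSimilarity()
classifier.update_absolute(invoice, po_match, +1)
classifier.update_absolute(invoice, po_other, -1)
```

After these updates, both models score the matching purchase-order line above the unrelated one for this invoice description. A realistic system would need richer features and many more feedback events than this toy example uses.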
