Paper Title

MP-CodeCheck: Evolving Logical Expression Code Anomaly Learning with Iterative Self-Supervision

Authors

Muff, Urs C.; Lee, Celine; Gottschlich, Paul; Gottschlich, Justin

Abstract

Machine programming (MP) is concerned with automating software development. According to studies, software engineers spend upwards of 50% of their development time debugging software. To help accelerate debugging, we present MP-CodeCheck (MPCC). MPCC is an MP system that attempts to identify anomalous code patterns within logical program expressions. In designing MPCC, we developed two novel programming language representations, the formations of which are critical in its ability to exhaustively and efficiently process the billions of lines of code that are used in its self-supervised training. To quantify MPCC's performance, we compare it against ControlFlag, a state-of-the-art self-supervised code anomaly detection system; we find that MPCC is more spatially and temporally efficient. We demonstrate MPCC's anomalous code detection capabilities by exercising it on a variety of open-source GitHub repositories and one proprietary code base. We also provide a brief qualitative study on some of the different classes of code anomalies that MPCC can detect to provide an abbreviated insight into its capabilities.
