Paper Title

RegCLR: A Self-Supervised Framework for Tabular Representation Learning in the Wild

Paper Authors

Weiyao Wang, Byung-Hak Kim, Varun Ganapathi

Paper Abstract

Recent advances in self-supervised learning (SSL) using large models to learn visual representations from natural images are rapidly closing the gap between the results produced by fully supervised learning and those produced by SSL on downstream vision tasks. Inspired by this advancement and primarily motivated by the emergence of tabular and structured document image applications, we investigate which self-supervised pretraining objectives, architectures, and fine-tuning strategies are most effective. To address these questions, we introduce RegCLR, a new self-supervised framework that combines contrastive and regularized methods and is compatible with the standard Vision Transformer architecture. Then, RegCLR is instantiated by integrating masked autoencoders as a representative example of a contrastive method and enhanced Barlow Twins as a representative example of a regularized method with configurable input image augmentations in both branches. Several real-world table recognition scenarios (e.g., extracting tables from document images), ranging from standard Word and LaTeX documents to even more challenging electronic health records (EHR) computer screen images, have been shown to benefit greatly from the representations learned from this new framework, with detection average precision (AP) improving relatively by 4.8% for Table, 11.8% for Column, and 11.1% for GUI objects over a previous fully supervised baseline on real-world EHR screen images.
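The abstract describes pairing a masked-autoencoder reconstruction objective with a Barlow Twins-style redundancy-reduction objective on top of a Vision Transformer. The sketch below is only a minimal illustration of how two such objectives can be combined into a single training loss; the specific loss formulations, the simple weighted sum, and all names and shapes here (e.g. `barlow_twins_loss`, `mae_reconstruction_loss`, `alpha`) are assumptions for exposition, not the paper's actual implementation or its "enhanced" Barlow Twins variant.

```python
import torch

def barlow_twins_loss(z1, z2, lambd=5e-3):
    """Redundancy-reduction loss on two batches of projected embeddings (N, D)."""
    n, _ = z1.shape
    # Standardize each embedding dimension across the batch.
    z1 = (z1 - z1.mean(dim=0)) / (z1.std(dim=0) + 1e-6)
    z2 = (z2 - z2.mean(dim=0)) / (z2.std(dim=0) + 1e-6)
    c = (z1.T @ z2) / n                                   # (D, D) cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()        # pull diagonal toward 1 (invariance)
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()  # push off-diagonal toward 0
    return on_diag + lambd * off_diag

def mae_reconstruction_loss(pred_patches, target_patches, mask):
    """Mean-squared error computed only on masked patches.
    pred_patches/target_patches: (N, L, P); mask: (N, L) with 1 where a patch was masked."""
    per_patch = (pred_patches - target_patches).pow(2).mean(dim=-1)   # (N, L)
    return (per_patch * mask).sum() / mask.sum().clamp(min=1)

# Toy usage with random tensors standing in for ViT encoder/decoder outputs.
z1, z2 = torch.randn(32, 256), torch.randn(32, 256)       # projections of two augmented views
pred, target = torch.randn(32, 196, 768), torch.randn(32, 196, 768)
mask = (torch.rand(32, 196) < 0.75).float()                # e.g. 75% of patches masked

alpha = 1.0  # hypothetical weight balancing the two branches
total_loss = mae_reconstruction_loss(pred, target, mask) + alpha * barlow_twins_loss(z1, z2)
```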
