Paper Title

LILA-BOTI: Leveraging Isolated Letter Accumulations By Ordering Teacher Insights for Bangla Handwriting Recognition

Authors

Hossain, Md. Ismail; Rakib, Mohammed; Mollah, Sabbir; Rahman, Fuad; Mohammed, Nabeel

Abstract

Word-level handwritten optical character recognition (OCR) remains a challenge for morphologically rich languages like Bangla. The complexity arises from the large number of letters, the presence of several diacritic forms, and the appearance of complex conjuncts. The difficulty is exacerbated by the fact that some graphemes occur infrequently but remain indispensable, so addressing the class imbalance is required for satisfactory results. This paper addresses this issue by introducing two knowledge distillation (KD) methods: Leveraging Isolated Letter Accumulations By Ordering Teacher Insights (LILA-BOTI) and Super Teacher LILA-BOTI. In both cases, a Convolutional Recurrent Neural Network (CRNN) student model is trained with the dark knowledge gained from a printed isolated character recognition teacher model. We conducted inter-dataset testing on BN-HTRd and BanglaWriting as our evaluation protocol, thus setting up a challenging problem where the results would better reflect the performance on unseen data. Our evaluations achieved up to a 3.5% increase in the F1-Macro score for the minor classes and up to a 4.5% increase in the overall word recognition rate when compared with the base model (No KD) and conventional KD.
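For readers unfamiliar with the "conventional KD" baseline named in the abstract, the following is a minimal sketch of a standard soft-target distillation loss for a single classification step. It is not the LILA-BOTI or Super Teacher LILA-BOTI procedure, which the abstract does not specify. The function names `softmax` and `kd_loss`, the temperature of 4.0, and the mixing weight `alpha` of 0.5 are illustrative assumptions, not values from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, true_index,
            temperature=4.0, alpha=0.5):
    """Conventional distillation loss (assumed form, hypothetical defaults).

    Mixes hard-label cross-entropy with the KL divergence between the
    teacher's and the student's temperature-softened distributions.
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    hard = -math.log(softmax(student_logits)[true_index])

    # Soft-target term: KL(teacher || student) at the given temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return (1 - alpha) * hard + alpha * temperature ** 2 * soft


if __name__ == "__main__":
    # Toy logits over three grapheme classes; class 0 is the true label.
    print(kd_loss([3.0, 1.0, 0.0], [2.5, 1.5, 0.2], true_index=0))
```

The softened teacher distribution is what carries the "dark knowledge": it assigns non-zero probability to classes other than the top prediction, which gives the student a training signal for infrequent classes as well.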
