Paper Title
Rethinking the Detection Head Configuration for Traffic Object Detection
Paper Authors
Paper Abstract
Multi-scale detection plays an important role in object detection models. However, researchers are often unsure how to configure detection heads so that they combine multi-scale features appropriately at different input resolutions. We find that the matching relationship between the object distribution and the detection heads differs across input resolutions. Based on this finding, we propose MHD-Net, a lightweight traffic object detection network built on matching between detection heads and object distribution. It consists of three main parts. The first is a detection head and object distribution matching strategy, which guides the configuration of detection heads so that multi-scale features can be leveraged to detect objects at vastly different scales. The second is a cross-scale detection head configuration guideline, which replaces multiple detection heads with only two detection heads that have rich feature representations, achieving an excellent balance among detection accuracy, model parameters, FLOPs, and detection speed. The third is a receptive field enlargement method, which combines a dilated convolution module with shallow backbone features to further improve detection accuracy at the cost of a very slight increase in model parameters. The proposed model achieves more competitive performance than other models on the BDD100K dataset and our proposed ETFOD-v2 dataset. The code will be made available.
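The abstract's third component relies on a general property of dilated convolution: dilation widens the receptive field without adding kernel weights. The sketch below illustrates that property only and is not the MHD-Net module itself. The function names, the three-layer stacks, and the dilation rates (1, 2, 4) are assumptions chosen for illustration and do not come from the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels) covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of a stack of convolution layers.

    layers: iterable of (kernel_size, stride, dilation) tuples,
    ordered from the input side to the output side.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between adjacent output positions
    for kernel_size, stride, dilation in layers:
        k_eff = effective_kernel_size(kernel_size, dilation)
        rf += (k_eff - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks: both use three 3x3 kernels, so the weight count is equal.
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
```

Under these assumed settings the dilated stack covers a 15-pixel span against 7 for the plain stack, with the same number of 3x3 kernel weights. This is consistent with the abstract's claim that the receptive field can be enlarged with only a very slight increase in model parameters.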