Paper title
Hyperparameter Analysis for Image Captioning
Paper authors
Paper abstract
In this paper, we perform a thorough sensitivity analysis on state-of-the-art image captioning approaches using two different architectures: CNN+LSTM and CNN+Transformer. Experiments were carried out using the Flickr8k dataset. The biggest takeaway from the experiments is that fine-tuning the CNN encoder outperforms the baseline and all other experiments carried out for both architectures.