Generative Adversarial Networks for Anonymous Acneic Face Dataset Generation

Paper title

Generative Adversarial Networks for anonymous Acneic face dataset generation

Authors

Hazem Zein, Samer Chantaf, Régis Fournier, Amine Nait-Ali

Abstract

It is well known that a classification model performs well when the datasets used for training and testing satisfy certain requirements. In other words, the larger, more balanced, and more representative the dataset, the more one can trust the proposed model's effectiveness and, consequently, the obtained results. Unfortunately, large anonymous datasets are generally not publicly available in biomedical applications, especially those dealing with pathological human face images. This scarcity makes deep-learning-based approaches challenging to deploy and makes some published results difficult to reproduce or verify. In this paper, we suggest an efficient method to generate a realistic, anonymous synthetic dataset of human faces with the attributes of acne disorders corresponding to three levels of severity (i.e., Mild, Moderate, and Severe). To this end, we consider a specific hierarchical StyleGAN-based algorithm trained at distinct levels. To evaluate the performance of the proposed scheme, we consider a CNN-based classification system trained on the generated synthetic acneic face images and tested on authentic face images. We show that an accuracy of 97.6% is achieved using InceptionResNetv2. As a result, this work allows the scientific community to employ the generated synthetic dataset for any data processing application without restrictions arising from legal or ethical concerns. Moreover, this approach can be extended to other applications requiring the generation of synthetic medical images. We can make the code and the generated dataset accessible to the scientific community.
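The evaluation protocol described in the abstract (fit a classifier on synthetic images only, then measure accuracy on authentic images) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a nearest-centroid classifier stands in for the CNN, the two-dimensional feature vectors are toy values rather than image data, and the names `fit_centroids`, `predict`, and `accuracy` are hypothetical. It does not reproduce the authors' pipeline or their reported result.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# The three severity levels named in the abstract.
SEVERITY_LEVELS = ("Mild", "Moderate", "Severe")

# (feature vector, severity label); toy stand-in for an image and its class.
Sample = Tuple[Sequence[float], str]


def fit_centroids(train: List[Sample]) -> Dict[str, List[float]]:
    """Average the feature vectors of each class (stand-in for CNN training)."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for features, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids: Dict[str, List[float]], features: Sequence[float]) -> str:
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids: Dict[str, List[float]], test: List[Sample]) -> float:
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(predict(centroids, f) == y for f, y in test)
    return correct / len(test)


# Hypothetical toy data: "synthetic" samples for fitting, "real" ones for testing.
synthetic_train: List[Sample] = [
    ([0.1, 0.2], "Mild"), ([0.2, 0.1], "Mild"),
    ([0.5, 0.5], "Moderate"), ([0.6, 0.4], "Moderate"),
    ([0.9, 0.8], "Severe"), ([0.8, 0.9], "Severe"),
]
real_test: List[Sample] = [
    ([0.15, 0.15], "Mild"), ([0.55, 0.5], "Moderate"),
    ([0.85, 0.9], "Severe"), ([0.6, 0.6], "Severe"),
]

if __name__ == "__main__":
    model = fit_centroids(synthetic_train)
    print(f"accuracy on held-out real samples: {accuracy(model, real_test):.2f}")
```

The point of the split is that no authentic sample is ever seen during fitting, so the accuracy measures how well what was learned from synthetic data transfers to real data.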
