Paper Title
Semantic Image Manipulation Using Scene Graphs
Paper Authors
Paper Abstract
Image manipulation can be considered a special case of image generation where the image to be produced is a modification of an existing image. Image generation and manipulation have been, for the most part, tasks that operate on raw pixels. However, the remarkable progress in learning rich image and object representations has opened the way for tasks such as text-to-image or layout-to-image generation that are mainly driven by semantics. In our work, we address the novel problem of image manipulation from scene graphs, in which a user can edit images by merely applying changes in the nodes or edges of a semantic graph that is generated from the image. Our goal is to encode image information in a given constellation and from there on generate new constellations, such as replacing objects or even changing relationships between objects, while respecting the semantics and style from the original image. We introduce a spatio-semantic scene graph network that does not require direct supervision for constellation changes or image edits. This makes it possible to train the system from existing real-world datasets with no additional annotation effort.
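The kind of edit the abstract describes, changing a node or an edge of a scene graph, can be illustrated with a small data-structure sketch. This is only an illustration under stated assumptions: the class SceneGraph, its methods, and the example labels are hypothetical and do not come from the paper, and the network that would turn the edited graph back into an image is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class SceneGraph:
    """Hypothetical scene graph: nodes are object labels, edges are predicates."""
    objects: list    # node labels, e.g. "girl"
    relations: list  # edges as (subject index, predicate, object index)

    def replace_object(self, idx, new_label):
        # Node edit: swap the object category, keep all edges intact.
        self.objects[idx] = new_label

    def change_relation(self, subj, obj, new_predicate):
        # Edge edit: rewrite the predicate between two existing nodes.
        self.relations = [
            (s, new_predicate if (s, o) == (subj, obj) else p, o)
            for (s, p, o) in self.relations
        ]

    def triples(self):
        # Human-readable (subject, predicate, object) view of the graph.
        return [(self.objects[s], p, self.objects[o]) for s, p, o in self.relations]


# Illustrative graph, as it might be predicted from a source image.
graph = SceneGraph(
    objects=["girl", "horse", "grass"],
    relations=[(0, "riding", 1), (1, "standing on", 2)],
)

graph.change_relation(0, 1, "beside")  # change a relationship
graph.replace_object(1, "zebra")       # replace an object
print(graph.triples())
```

In a full system, the edited graph together with the features of the original image would be passed to a generator; that step is outside the scope of this sketch.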