This repository contains the implementation of the AAAI 2024 paper:
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks [Paper] [AAAI]
Siyu Zou1, Jiji Tang2, Yiyi Zhou1, Jing He1, Chaoyi Zhao2, Rongsheng Zhang2, Zhipeng Hu2, Xiaoshuai Sun1
1Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University
2Fuxi AI Lab, NetEase Inc., Hangzhou
There are four parts in the code.
- model: the implementation files for InstDiffEdit, DiffEdit and SDEdit.
- dataset_txt: the data splits for the Imagen, ImageNet and Editing-Mask datasets.
- dataset: the images and masks of the Editing-Mask dataset.
- .sh files: the inference scripts for InstDiffEdit.
- Python 3.8
- PyTorch == 1.13.1
- Transformers == 4.25.1
- diffusers == 0.8.0
- NumPy
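The pinned versions above can be installed with pip; a minimal setup sketch (the environment name is illustrative, not prescribed by the repository):

```shell
# Create an isolated environment (name is illustrative) and install
# the pinned dependencies listed above.
conda create -n instdiffedit python=3.8 -y
conda activate instdiffedit
pip install torch==1.13.1 transformers==4.25.1 diffusers==0.8.0 numpy
```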
- All experiments are performed with one A30 GPU.
We use three datasets.
- ImageNet: We follow the evaluation protocol of FlexIT (https://github.com/facebookresearch/semanticimagetranslation). We obtain 1,092 test images and change the image category.
- Imagen: We use the 360 images with structured text prompts generated by Imagen (https://imagen.research.google/).
- Editing-Mask: 200 images with annotated masks, provided in the dataset directory.
To start sampling:
bash sample_begin.sh
To run on Imagen, ImageNet or Editing-Mask:
bash run.sh
Note:
- DiffEdit and SDEdit can be run with the same .sh files after changing some parameters.
- You can open the .sh files to modify the parameters.
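For intuition, the DiffEdit-style mask extraction that InstDiffEdit accelerates can be sketched as follows. This is an illustrative NumPy sketch, not the repository's code: the function name and threshold are assumptions, and in practice the two noise estimates would come from the diffusion model's denoiser under the source and target prompts.

```python
import numpy as np

def estimate_edit_mask(noise_src, noise_tgt, threshold=0.5):
    """Illustrative DiffEdit-style mask: regions where the denoiser's
    noise estimates under two prompts disagree are likely to be edited.
    noise_src / noise_tgt: arrays of shape (C, H, W)."""
    diff = np.abs(noise_src - noise_tgt).mean(axis=0)          # average over channels
    diff = (diff - diff.min()) / (diff.max() - diff.min() + 1e-8)  # normalize to [0, 1]
    return (diff > threshold).astype(np.float32)               # binary editing mask

# Toy example: the two "noise estimates" differ only inside a square region,
# so the recovered mask should cover exactly that square.
rng = np.random.default_rng(0)
base = rng.standard_normal((4, 64, 64))
changed = base.copy()
changed[:, 16:48, 16:48] += 3.0   # large disagreement inside the square
mask = estimate_edit_mask(base, changed)
```

The actual InstDiffEdit method replaces this noise-difference comparison with instant attention masks drawn from the model's cross-attention, avoiding the extra denoising passes.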