FISEdit

This repository contains the original code for "Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference" (AAAI 2024).

Prepare the Environment

Build Hetu

Hetu is our self-developed sparse inference engine. You have to build it first, following the README.md at https://github.com/Hankpipi/Hetu/tree/diffusers.
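
A minimal sketch of that step, assuming the build happens on the diffusers branch of that repository (the actual build commands live in Hetu's own README.md and are authoritative):

git clone -b diffusers https://github.com/Hankpipi/Hetu.git
cd Hetu
# ...then follow the build steps described in Hetu's README.md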

Build Diffusers

Under the virtual environment created above, you can then install the Hetu version of diffusers with the following command.

pip install -e .
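
A plausible full sequence, assuming the Hetu version of diffusers lives in the diffusers-hetu directory referenced later in this README (an assumption; run the install from wherever the package's setup actually sits):

cd diffusers-hetu   # directory name assumed from the script paths below
pip install -e .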

Note that if you want to run some of the benchmarks (SDTP, SDIP, SDEdit, and InstructPix2Pix), you need the original diffusers; create a new environment named diffusers and install it with the following command.

conda install -c conda-forge diffusers
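
For example, a complete setup might look like the following (the Python version is an assumption; only the install command above comes from this README):

conda create -n diffusers python=3.9   # Python version assumed
conda activate diffusers
conda install -c conda-forge diffusers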

Run the Scripts

Run Our Pipelines

You can move diffusers-hetu/normal_pipeline.py and diffusers-hetu/edit_pipeline.py to your workspace, then run the following commands under the hetu_diffusers virtual environment.

python normal_pipeline.py
python edit_pipeline.py --run_sample 1

The former is the normal text-to-image pipeline, and the latter is the image-editing pipeline. The editing pipeline can run under different settings (single editing, continuous editing, or running the whole dataset); for more details, dig into edit_pipeline.py. The two flags used in this README are recapped below.
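
Quick reference (only these two flags appear in this README; other settings, such as continuous editing, are selected inside edit_pipeline.py, so check the script):

python edit_pipeline.py --run_sample 1    # single-editing sample run, as above
python edit_pipeline.py --run_dataset 1   # whole-dataset mode, see "Generate the Dataset" below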

Generate the Dataset

Our supplemental data is stored in the 101nobody/FISEdit-Data repository on GitHub. It consists of the initial random seeds and the masks generated by our method, which can reproduce part of the results in our paper. To use it, first create a data directory under your workspace (mkdir data), then download the repo and move its mask and random_seed directories into data. It is also recommended to build your own dataset and generate the masks using the method described in our paper.
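
A minimal sketch of that setup, assuming git access to the repository (the directory names come from the paragraph above):

mkdir data
git clone https://github.com/101nobody/FISEdit-Data.git
mv FISEdit-Data/mask FISEdit-Data/random_seed data/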

On top of that, you need to download the text pairs from the directory index hosted at berkeley.edu, and put gpt-generated-prompts.jsonl in your workspace.

Then, after creating a dataset directory under your workspace, you can run edit_pipeline.py in run_dataset mode, which will generate pairs of images and store them in the dataset directory.

mkdir dataset
cd dataset && mkdir hetu_origin && mkdir hetu_edit && cd ..
python edit_pipeline.py --run_dataset 1

Benchmarks

All benchmarks live in the benchmark directory. benchmark/diff contains scripts for generating masks from difference maps and analyzing mask rates; benchmark/baseline contains scripts for the different image-editing pipelines (run them under the original diffusers virtual environment); and the evaluation scripts are in benchmark/eval.

Citation

@inproceedings{yu2024fisedit,
  title={Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference},
  author={Yu, Zihao and Li, Haoyang and Fu, Fangcheng and Miao, Xupeng and Cui, Bin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={15},
  pages={16605--16613},
  year={2024}
}