PromptGIP [ICML2024]

Unifying Image Processing as Visual Prompting Question Answering

This paper was accepted by The Forty-first International Conference on Machine Learning (ICML2024). [Arxiv]

We propose a universal model for general image processing that covers image restoration, image enhancement, image feature extraction tasks, etc. PromptGIP can undertake diverse cross-domain tasks using provided visual prompts, eliminating the need for task-specific finetuning. Capable of handling up to 15 different image processing tasks, PromptGIP represents a versatile and adaptive approach to general image processing.

News

[2024/6/17] ⚡ We have released the codes! Please refer to the following instructions.

Preparation

Datasets

Image Restoration

For Gaussian noise, Gaussian blur, Poisson noise, salt & pepper noise, jpeg compression, ringing artifacts, R-L algorithm, inpainting, we directly synthesize the corresponding distortions on the ImageNet dataset to create degraded-clean pairs. We collect a composed dataset (Common528) for testing, which consists of commonly-used datasets: Set5, Set14, BSDS100, Manga109, Urban100, General100, and DIV2K-Valid.
For dehazing, we utilize the ITS training set of RESIDE dataset for training and SOTS-indoor for testing.
For rain removal, we employ two types of rain addition models: Simple Rain Model and Complex Rain Model. The former is a simple additive rain model synthesized on the ImageNet dataset, and we use Common528 for testing. The latter utilizes Rain13K, including an assortment of diverse rain models, while we adopt Test100 dataset for testing.

Image Enhancement

For low-light image enhancement (LLE) task, the LOL dataset is adopted for training and testing.
For local Laplacian filtering (LLF), we apply local Laplacian filter on the expert-C retouched images of Adobe-MIT Fivek dataset for training and testing, forming the requisite input-output pairs.

Image Edge Detection

Two acknowledged image edge detection operators, the Canny and Laplacian operators, are investigated. The ImageNet dataset forms the basis for creating input-output training pairs. For testing, we adopt Common528 dataset.

❗ Note that all the test datasets are resized to 256 × 256 for convenience.

Test Datasets (256 x 256)	Tasks	Link
Common528	Gaussian noise, Gaussian blur, Poisson noise, salt & pepper noise, jpeg compression, ringing artifacts, R-L algorithm, inpainting, simple deraining, Canny operator, Laplacian operator	Baidu Disk (code: ca3y)
SOTS-indoor	Dehazing	Baidu Disk (code: thgt)
Test100	Complex deraining	Baidu Disk (code: m28i)
LOL	Low-light image enhancement	Baidu Disk (code: jzi6)
LLF	Local Laplacian filtering	Baidu Disk (code: 6sme)

Pretrained Models

Please download the pretrained model at Baidu Disk (code: ycjg). Put the checkpoint file in the pretrained_models folder.

Quick Inference

Modify the dataset paths in test_PromptGIP_customized.py.
To quickly reproduce the results reported in the paper, you can directly execute run_test_batch_PromptGIP_reproduce.sh.

bash run_test_batch_PromptGIP_reproduce.sh

In prompt learning, the provided prompt may influence the results. To explore the effects of different prompts, you can change different prompts for testing. We provide a simple bash to help try different prompts.

bash run_test_batch_PromptGIP_loop.sh

Training

If you want to train PromptGIP, you can execute the follow script.

bash run_train_PromptGIP.sh

You may prepare the training datasets as described in the paper.

Citation

If you find our work is useful, please kindly cite it.

@article{liu2023unifying,
  title={Unifying image processing as visual prompting question answering},
  author={Liu, Yihao and Chen, Xiangyu and Ma, Xianzheng and Wang, Xintao and Zhou, Jiantao and Qiao, Yu and Dong, Chao},
  journal={arXiv preprint arXiv:2310.10513},
  year={2023}
}

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
__pycache__		__pycache__
dataset		dataset
evaluate		evaluate
experiments		experiments
pretrained_models		pretrained_models
util		util
.gitignore		.gitignore
CNN_Head.py		CNN_Head.py
README.md		README.md
engine_pretrain.py		engine_pretrain.py
evaluation_prompt.py		evaluation_prompt.py
loss.py		loss.py
models_mae_PromptGIP_CNN_Head.py		models_mae_PromptGIP_CNN_Head.py
run_test_batch_PromptGIP_loop.sh		run_test_batch_PromptGIP_loop.sh
run_test_batch_PromptGIP_reproduce.sh		run_test_batch_PromptGIP_reproduce.sh
run_train_PromptGIP.sh		run_train_PromptGIP.sh
teaser.jpg		teaser.jpg
test_PromptGIP_customized.py		test_PromptGIP_customized.py
train_PromptGIP.py		train_PromptGIP.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PromptGIP [ICML2024]

Unifying Image Processing as Visual Prompting Question Answering

News

Preparation

Datasets

Pretrained Models

Quick Inference

Training

Citation

About

Releases

Packages

Languages

lyh-18/PromptGIP

Folders and files

Latest commit

History

Repository files navigation

PromptGIP [ICML2024]

Unifying Image Processing as Visual Prompting Question Answering

News

Preparation

Datasets

Pretrained Models

Quick Inference

Training

Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages