Object-aware Inversion and Reassembly for Image Editing

Zhen Yang* · Ganggui Ding* · Wen Wang* · Hao Chen* · Bohan Zhuang† · Chunhua Shen*
*Zhejiang University      †Monash University

Paper PDF · Project Page · OIR-Bench · Video

Setup

This code was tested with Python 3.9 and PyTorch 2.0.1, using pre-trained models from Hugging Face diffusers. Specifically, we implemented our method on top of Stable Diffusion 1.4. Additional required packages are listed in the requirements file. The code was tested on an NVIDIA GeForce RTX 3090 but should work on other cards.
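As a quick sanity check before running anything, the tested versions above can be compared with the local environment. The following is a minimal sketch, not part of this repository: `installed_version` and `environment_report` are hypothetical helper names, and only the standard library is used, so it runs even when PyTorch is not installed yet.

```python
import sys
from importlib import metadata

# Versions this README reports testing with; other versions may also work.
TESTED_PACKAGES = {"torch": "2.0.1"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect tested vs. found versions for Python and the listed packages."""
    report = {
        "python": {"tested": "3.9", "found": "%d.%d" % sys.version_info[:2]}
    }
    for name, tested in TESTED_PACKAGES.items():
        report[name] = {"tested": tested, "found": installed_version(name)}
    return report


if __name__ == "__main__":
    for name, info in environment_report().items():
        found = info["found"] or "not installed"
        # PyTorch builds often carry a local suffix such as "+cu118",
        # so the comparison is by prefix rather than exact equality.
        status = "ok" if found.startswith(info["tested"]) else "differs"
        print(f"{name}: tested {info['tested']}, found {found} ({status})")
```

A "differs" line is only a hint, since the code may well run on other versions.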

Getting Started

  1. Download OIR-Bench.
  2. Create the environment and install the dependencies by running:
conda create -n oir python=3.9
conda activate oir
pip install -r requirements.txt
  3. Edit basic_config.py in configs/ to set the model path and hyperparameters.
  4. Modify multi_object_edit.yaml or single_object_edit.yaml in configs/ to match multi_object.yaml and single_object.yaml in OIR-Bench/.
  5. Run single_object_edit.py (the search metric in the paper) or multi_object_edit.py (OIR in the paper) to edit images.
  6. Optional: adjust reassembly_step and repeat the steps above to get better results.

TODO

  1. Using prompt_change as a dict key may lead to errors.
  2. The masks of different editing pairs must not overlap.
  3. The search metric can serve as an ensemble tool. For example, we can edit an image with several methods (PnP, P2P, OIR, ...) and use the search metric to select the best editing result.
  4. The approach in TODO 3 can also be used to build a high-quality dataset for training instruction-based image editing methods.
  5. Deploy our method on different foundation models (SDXL, LCM, ...).
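Two of the items above can be illustrated with a short sketch. It is not code from this repository: `overlap_pixels`, `find_overlapping_pairs`, and `select_best_edit` are hypothetical helpers, masks are assumed to be 2D binary grids of equal shape, and `score_fn` stands in for whatever scoring function is used (for example the search metric), assumed here to return a higher-is-better number.

```python
from itertools import combinations


def overlap_pixels(mask_a, mask_b):
    """Count pixels that are nonzero in both 2D masks of equal shape."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks have different heights")
    shared = 0
    for row_a, row_b in zip(mask_a, mask_b):
        if len(row_a) != len(row_b):
            raise ValueError("masks have different widths")
        shared += sum(1 for a, b in zip(row_a, row_b) if a and b)
    return shared


def find_overlapping_pairs(masks):
    """Return (name_a, name_b, shared_pixels) for every pair of masks that overlap.

    `masks` maps an editing-pair name to its 2D binary mask.
    """
    conflicts = []
    for (name_a, mask_a), (name_b, mask_b) in combinations(masks.items(), 2):
        shared = overlap_pixels(mask_a, mask_b)
        if shared:
            conflicts.append((name_a, name_b, shared))
    return conflicts


def select_best_edit(candidates, score_fn):
    """Pick the candidate with the highest score.

    `candidates` maps a method name to its edited result; `score_fn` is a
    hypothetical callable returning a higher-is-better score for a result.
    """
    if not candidates:
        raise ValueError("no candidates to choose from")
    scores = {name: score_fn(result) for name, result in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    masks = {
        "pair_a": [[1, 1, 0], [0, 0, 0]],
        "pair_b": [[0, 1, 1], [0, 0, 0]],
        "pair_c": [[0, 0, 0], [1, 1, 1]],
    }
    print(find_overlapping_pairs(masks))  # [('pair_a', 'pair_b', 1)]
```

The loops accept nested lists as well as 2D NumPy arrays; with NumPy, `np.logical_and(a, b).sum()` gives the same count faster. Running the overlap check before a multi-object edit catches violations of TODO 2 early.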

Results

OIR results

Visualization of the search metric

Acknowledgment

Many thanks to Minghan Li for generous help in building the project website.

License

For non-commercial academic use, this project is licensed under the 2-clause BSD License. For commercial use, please contact Chunhua Shen.

BibTeX

@inproceedings{yang2024objectaware,
title     = {Object-Aware Inversion and Reassembly for Image Editing},
author    = {Zhen Yang and Ganggui Ding and Wen Wang and Hao Chen and Bohan Zhuang and Chunhua Shen},
booktitle = {The Twelfth International Conference on Learning Representations},
year      = {2024},
url       = {https://openreview.net/forum?id=dpcVXiMlcv}
}