GitHub - zxhxgithub/MML_Final: MML Final Project: Reproduce INITNO and some Attempts

Multimodal Final Project
Boosting Text-to-Image Diffusion Models via Initial Noise Optimization

In this project we reproduce the results of the InitNO paper, which improves text-to-image diffusion models by optimizing the initial noise, and we propose several modifications that attempt to improve its generation quality and speed up generation. Details can be found in our course report.

Getting started

Python libraries: You can use the following commands to create and activate your InitNO Python environment:

# Create conda environment
conda env create -f environment.yaml
# Activate conda environment
conda activate initno_env

Generating images: Run the following command to generate images.

python run_sd_initno.py

You can specify the following arguments in run_sd_initno.py:

SEEDS: a list of random seeds
PROMPT: text prompt for image generation
token_indices: a list of target token indices
result_root: path to save generated results
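As an illustration of how these arguments fit together, here is a minimal sketch with hypothetical values. The helper `approximate_token_indices` is not part of the repository; it assumes each word in the prompt maps to exactly one token and that index 0 is reserved for a start-of-text token, which may not hold for every prompt or tokenizer.

```python
# Hypothetical values; the variable names mirror the arguments listed above.
SEEDS = [0, 1, 2]
PROMPT = "a cat and a rabbit"
result_root = "results"


def approximate_token_indices(prompt, target_words):
    """Return 1-based word positions of target_words in prompt.

    Assumption: every word maps to exactly one token and index 0 is
    reserved for a start-of-text token. A real tokenizer may split a word
    into several tokens, so verify against the tokenizer the script uses.
    """
    words = prompt.lower().split()
    indices = []
    for target in target_words:
        # list.index raises ValueError if the word is absent from the prompt.
        indices.append(words.index(target.lower()) + 1)
    return indices


token_indices = approximate_token_indices(PROMPT, ["cat", "rabbit"])
print(token_indices)  # [2, 5]
```

For prompts with rare or compound words, it is safer to inspect the tokenized prompt directly than to rely on word positions.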

For our improvements, we provide the following additional arguments:

USE_CROSS_ATTN_CONFLICT_LOSS: whether to use the cross-attention conflict loss
OPT: the optimizer used for the initial noise optimization; one of adam, adamw, rmsprop, sgd
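One way an option such as OPT could be mapped to a PyTorch optimizer is sketched below. This is an assumption for illustration, not the repository's actual code: the function names `normalize_opt` and `build_optimizer` and the default learning rate are hypothetical. Only the four option names come from the list above.

```python
SUPPORTED_OPTIMIZERS = ("adam", "adamw", "rmsprop", "sgd")


def normalize_opt(name):
    """Validate an optimizer name and return it in lowercase."""
    key = name.strip().lower()
    if key not in SUPPORTED_OPTIMIZERS:
        raise ValueError(
            f"Unknown optimizer {name!r}; expected one of {SUPPORTED_OPTIMIZERS}"
        )
    return key


def build_optimizer(name, params, lr=1e-2):
    """Create a torch.optim optimizer over params (hypothetical helper).

    The learning rate default is a placeholder, not a value from the project.
    """
    import torch  # imported lazily so the name check works without PyTorch

    classes = {
        "adam": torch.optim.Adam,
        "adamw": torch.optim.AdamW,
        "rmsprop": torch.optim.RMSprop,
        "sgd": torch.optim.SGD,
    }
    return classes[normalize_opt(name)](params, lr=lr)
```

With PyTorch installed, `build_optimizer("adamw", [latents])` would return an AdamW optimizer over `latents`, assuming `latents` is a tensor created with `requires_grad=True`.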

Acknowledgments

The code is built upon InitNO, and we adopt the official evaluation prompts from Attend and Excite. We thank the authors for open-sourcing their work.

