Official PyTorch implementation of our paper "Semantic Distance Adversarial Learning for Text-to-Image Synthesis".
- Python 3.8
- PyTorch 1.9
- transformers 4.8.1
Clone this repo.
git clone https://github.com/yuanrr/SEMA
cd SEMA
conda create -n SEMA python=3.8
conda activate SEMA
pip install -r requirements.txt
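To confirm that the environment matches the versions listed above, a quick check along the following lines can help. This is a minimal sketch and not part of the repository: the helper name `installed_versions` is hypothetical, and it assumes the packages are installed under the PyPI distribution names `torch` and `transformers`.

```python
import sys
from importlib import metadata  # standard library, Python 3.8+


def installed_versions(packages):
    """Return {package: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Print the interpreter version, then each package version for comparison
    # against the requirements listed in this README.
    print("python", sys.version.split()[0])
    for name, version in installed_versions(["torch", "transformers"]).items():
        print(name, version or "not installed")
```

The script only reports versions; it does not enforce them, so a mismatch is left for you to resolve.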
- Download the preprocessed metadata for birds and coco and extract it to
data/
- Download the birds image data and extract it to
data/birds/
- Download the coco2014 dataset and extract the images to
data/coco/images/
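Before running the scripts, it can be useful to verify that the folders above exist. The following is a minimal sketch, assuming the repository root is the working directory; the helper `missing_data_dirs` is hypothetical and only checks the three paths named in this README, not the files inside them.

```python
from pathlib import Path

# Directories named in the data preparation steps of this README.
EXPECTED_DIRS = ["data", "data/birds", "data/coco/images"]


def missing_data_dirs(root="."):
    """Return the expected data directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected data directories are present.")
```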
Training code for SEMA will be released soon. Stay tuned.
- SEMA w/o BERT for coco (password: guvx)
We synthesize about 30k images from the test descriptions and evaluate the FID between synthesized images and test images of each dataset.
- Synthesize images with the provided pretrained model
python sampling.py
- Evaluate the FID score
python test_fid.py
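For readers unfamiliar with the metric, FID is the Fréchet distance between two Gaussians fitted to image features (typically Inception-v3 activations): ||μ1 − μ2||² + Tr(Σ1 + Σ2 − 2(Σ1Σ2)^(1/2)). The sketch below illustrates this standard definition with NumPy on already-extracted feature arrays. It is an assumption-laden illustration, not the code in `test_fid.py`; the function name `frechet_distance` is hypothetical, and feature extraction is left out.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (n, d) feature arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative in exact arithmetic.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 8))             # stand-in for test-image features
    fake = rng.normal(loc=0.1, size=(500, 8))    # stand-in for synthesized-image features
    print(frechet_distance(real, fake))
```

Lower values mean the two feature distributions are closer, which is why the table reports FID with a downward arrow.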
The released SEMA w/o BERT model achieves a lower (better) FID on COCO than the SEMA w/o BERT result reported in the paper.
| Model | COCO-FID↓ |
|---|---|
| SEMA w/o BERT (paper) | 17.51 |
| SEMA w/o BERT (released model) | ~16.5 |
| SEMA (paper) | 16.31 |
The code is released for academic research use only. Please contact us if you have any questions. Bowen Yuan
Reference