This is the official PyTorch implementation of Vermouth: Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors, which was accepted at IJCAI 2024.

Vermouth: Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors | IJCAI2024

This is the official PyTorch implementation of Vermouth, which was accepted at IJCAI 2024.

Vermouth is a simple yet effective framework for transferring diffusion models to non-generative tasks. It comprises a pre-trained Stable Diffusion (SD) model containing rich generative priors, a unified head (U-head) capable of integrating hierarchical representations, and an adapted expert providing discriminative priors.
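To make the idea of a head that integrates hierarchical representations more concrete, here is a minimal PyTorch sketch. It is not the released Vermouth code: the class name `ToyUnifiedHead`, the channel widths, the feature-map sizes, and the fuse-by-summation design are all assumptions chosen for illustration. It projects several feature maps of different resolutions to a common width, upsamples them to the finest resolution, and adds a projected feature map standing in for the expert.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyUnifiedHead(nn.Module):
    """Hypothetical stand-in for a U-head; NOT the authors' implementation."""

    def __init__(self, in_channels, expert_channels, out_channels=64):
        super().__init__()
        # One 1x1 projection per hierarchy level so all levels share a width.
        self.proj = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )
        # Separate projection for the (assumed) expert feature map.
        self.expert_proj = nn.Conv2d(expert_channels, out_channels, kernel_size=1)
        # Light 3x3 convolution applied to the summed features.
        self.fuse = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, hierarchical_feats, expert_feat):
        # The first level is assumed to have the highest spatial resolution.
        target = hierarchical_feats[0].shape[-2:]
        fused = 0
        for proj, feat in zip(self.proj, hierarchical_feats):
            fused = fused + F.interpolate(
                proj(feat), size=target, mode="bilinear", align_corners=False
            )
        expert = F.interpolate(
            self.expert_proj(expert_feat), size=target, mode="bilinear", align_corners=False
        )
        return self.fuse(fused + expert)


def demo():
    """Run the toy head on random tensors and return the output shape."""
    torch.manual_seed(0)
    # Illustrative shapes only: three levels of decreasing resolution.
    feats = [
        torch.randn(2, 320, 32, 32),
        torch.randn(2, 640, 16, 16),
        torch.randn(2, 1280, 8, 8),
    ]
    expert = torch.randn(2, 768, 16, 16)
    head = ToyUnifiedHead([320, 640, 1280], expert_channels=768)
    return head(feats, expert).shape
```

Calling `demo()` returns the shape of the fused map, which matches the spatial size of the first (finest) input level. Summation is only one possible fusion choice; concatenation or attention would fit the same interface.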

Results

ZS-SBIR (zero-shot sketch-based image retrieval)

Model Sketchy TU_Berlin QuickDraw
MAE-L 39.23 41.99 11.71
BeiTv3-G 54.54 50.93 13.67
Swinv2-L 43.39 45.51 12.08
DINO-B 38.51 25.49
Vermouth 56.8 52.83 15.11

OV (open-vocabulary) semantic segmentation

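| Model | ADE-150 | PC-59 | VOC20 | ADE-847 | PC-459 |
| --- | --- | --- | --- | --- | --- |
| MAE-L | 17.5 | 53.27 | 93.51 | 3.42 | 8.82 |
| ConvNeXt-L | 18.65 | 53.42 | 94.62 | 3.53 | 9.53 |
| Swinv2-L | 18.8 | 53.37 | 94.76 | 3.8 | 9.42 |
| DINO-B | 17.13 | 47.84 | 92.44 | 3.16 | 7.75 |
| Vermouth | 19.0 | 52.88 | 92.87 | 3.7 | 9.0 |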

Few-shot classification

| Model | OxfordPets | Flowers102 | FGVCAircraft | DTD | EuroSAT | StanfordCars | Food101 | SUN397 | Caltech101 | UCF101 | ImageNet |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MAE-L | 91.87 | 92.04 | 36.51 | 63.74 | 87.39 | 24.15 | 59.31 | 62.08 | 94.45 | 76.55 | 39.74 |
| BeiTv3-G | 93.79 | 97.84 | 38.34 | 72.41 | 86.11 | 62.58 | 74.42 | 71.57 | 96.9 | 84.38 | 86.95 |
| Swinv2-L | 89.65 | 99.61 | 29.13 | 73.1 | 86.9 | 37.75 | 77.41 | 72.63 | 97.01 | 81.06 | 78.84 |
| DINO-B | 89.32 | 97.82 | 48.3 | 69 | 91.15 | 57.17 | 58.5 | 62.44 | 95.57 | 76.97 | 67.66 |
| Vermouth | 66.13 | 92.35 | 42.52 | 66.62 | 88.93 | 51.05 | 45.78 | 58.09 | 95.83 | 70.49 | 55.89 |

Quick Start

Clone this repo and create a Python environment:

git clone https://github.com/develop-productivity/Vermouth.git
cd Vermouth
conda create -n env_name python=3.10
conda activate env_name
pip install -r requirments.txt
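After installing, it can be handy to confirm that the packages listed in the requirements file are actually present in the active environment. The sketch below uses only the Python standard library. The helper names `parse_requirement_names`, `missing_packages`, and `check_requirements` are hypothetical and not part of this repo, and the parser handles only simple `name==version` style lines, which is an assumption about the file's contents.

```python
from importlib import metadata
from pathlib import Path


def parse_requirement_names(text):
    """Extract bare package names from simple requirements-style text."""
    names = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        # Cut at the first version specifier, extras bracket, or marker.
        for sep in ("==", ">=", "<=", "~=", "!=", ">", "<", "[", ";", "@", " "):
            line = line.split(sep, 1)[0]
        line = line.strip()
        if line:
            names.append(line)
    return names


def missing_packages(names):
    """Return the names that have no installed distribution."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def check_requirements(path):
    """Read a requirements file and report which packages are not installed."""
    return missing_packages(parse_requirement_names(Path(path).read_text()))
```

For example, `check_requirements("requirments.txt")` run from the repo root returns an empty list when every listed package is installed. Note that the check uses distribution names, which can differ from import names.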

Datasets

Follow these repos to download the datasets.

Organize your files as shown in the datasets directory.
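A quick way to confirm the layout before training is to check that each expected dataset folder exists under the datasets root. This is a minimal sketch: the function name `missing_dataset_dirs` is hypothetical, and the folder names in the demo are placeholders, since the exact directory names expected by the code are not listed here.

```python
import tempfile
from pathlib import Path


def missing_dataset_dirs(root, expected):
    """Return the expected sub-directory names that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


def demo():
    """Build a throwaway layout with one folder and check for two."""
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder folder names; the real layout may differ.
        (Path(tmp) / "dataset_a").mkdir()
        return missing_dataset_dirs(tmp, ["dataset_a", "dataset_b"])
```

In practice you would pass your own datasets root and the list of folder names your configuration expects, then fix any names the function returns.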

TODO

  • release core code
  • release the web page
  • release pre-trained model weights

Cite

If you find our work useful in your research, please consider citing:

@article{dong2024bridging,
  title={Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors},
  author={Dong, Shiyin and Zhu, Mingrui and Cheng, Kun and Wang, Nannan and Gao, Xinbo},
  journal={arXiv preprint arXiv:2401.16459},
  year={2024}
}
