Paddle Multimodal Integration and eXploration, supporting mainstream multimodal tasks, including end-to-end large-scale multimodal pretrained models and a diffusion model toolbox. Equipped with high performance and flexibility.
image-to-text
clip
text-to-image
dit
multimodal
sora
text-to-video
aigc
stable-diffusion
controlnet
llava
minigpt4
sd-xl
ppdiffusers
eva-clip
stablevideodiffusion
qwen-vl
internvl2
unidiffuser
qwen2-vl
Updated Nov 4, 2024 - Python