
SD3.5-large (8B) support #442

Closed
stduhpf opened this issue Oct 22, 2024 · 12 comments
stduhpf (Contributor) commented Oct 22, 2024

Stable Diffusion 3.5 Large and Large Turbo just got released publicly.
https://huggingface.co/stabilityai/stable-diffusion-3.5-large
https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo

Inference code here (warning: weird licence): https://github.com/Stability-AI/sd3.5

It's a model that should perform fairly well (SD3-Large is ranked slightly above Flux Schnell on the Artificial Analysis arena leaderboard, and this is an upgraded version of SD3-Large) while being smaller than Flux (it has 8B parameters).

Right now, these two models are not supported by sdcpp (I tried).

What's required:

Sidenote: SD3.5 Medium (2B) is also going to be released soon, hopefully it will work as a drop-in replacement for SD3 2B

Edit: About quantization: the majority of tensors in SD3.5 Large do not fit evenly into a whole number of 256-element blocks, so they are skipped when quantizing to q3_K, q4_K, and so on.
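The block-size issue above can be illustrated with a small check (a sketch; `QK_K = 256` is ggml's k-quant super-block size, and the 2432 hidden size for SD3.5-large is taken from public model configs):

```python
# Minimal sketch of the k-quant block-size constraint described above.
# ggml's k-quants (q3_K, q4_K, q5_K, q6_K) pack weights in super-blocks
# of QK_K = 256 elements, so a tensor row must hold a whole number of
# blocks to qualify.
QK_K = 256

def can_k_quantize(row_size: int) -> bool:
    """True if a row of this length fits a whole number of 256-element blocks."""
    return row_size % QK_K == 0

# SD3.5-large's MMDiT hidden size is reportedly 2432 (38 heads x 64),
# and 2432 / 256 = 9.5, so those tensors fall back to a non-k quant
# type or stay unquantized.
print(can_k_quantize(4096))  # True
print(can_k_quantize(2432))  # False
```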

@stduhpf stduhpf changed the title SD3.5-large support SD3.5-large (8B) support Oct 22, 2024
stduhpf (Contributor, Author) commented Oct 23, 2024

I noticed there are some slight differences between sd3 and sd3.5 architecture diagrams. Not sure if this can cause problems.

[Architecture diagrams: SD3 (2B) MMDiT vs. SD3.5 (8B) MMDiT]

Text embeddings are now 77+77/256 tokens instead of 77+77 tokens (not sure what "/256" means here; it's probably not a division).
And the RMS norm before attention in the DiT block is no longer optional.
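For reference, the RMS norm in question can be sketched in a few lines (a minimal NumPy version; the epsilon value is an assumption):

```python
import numpy as np

def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # Scale each vector by the inverse of its root-mean-square along the
    # last axis, then apply a per-channel gain. Unlike LayerNorm, there is
    # no mean subtraction and no bias.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight

x = np.array([[3.0, 4.0]])
out = rms_norm(x, np.ones(2))
# RMS of [3, 4] is sqrt(12.5) ~= 3.5355, so out is roughly [0.8485, 1.1314]
```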

leejet (Owner) commented Oct 24, 2024

It's currently supported, see #445.

stduhpf (Contributor, Author) commented Oct 24, 2024

@razvanab it should work. I don't see a reason why they wouldn't be supported.

@stduhpf stduhpf closed this as completed Oct 24, 2024
razvanab commented

It does nothing; it just goes to the cmd prompt again.

➜ .\sd.exe -m  "J:\LLM_MODELS\SD\sd3.5_large_turbo-Q5_0.gguf" --clip_l "J:\LLM_MODELS\SD\clip\clip_l.safetensors" --clip_g "J:\LLM_MODELS\SD\clip\clip_vision_g.safetensors" --t5xxl "J:\LLM_MODELS\SD\clip\t5-v1_1-xxl-encoder-Q5_K_M.gguf"  -H 1024 -W 1024 -p "a lovely cat " --cfg-scale 4.5 --sampling-method euler --verbose
Option:
    n_threads:         8
    mode:              txt2img
    model_path:        J:\LLM_MODELS\SD\sd3.5_large_turbo-Q5_0.gguf
    wtype:             unspecified
    clip_l_path:       J:\LLM_MODELS\SD\clip\clip_l.safetensors
    clip_g_path:       J:\LLM_MODELS\SD\clip\clip_vision_g.safetensors
    t5xxl_path:        J:\LLM_MODELS\SD\clip\t5-v1_1-xxl-encoder-Q5_K_M.gguf
    diffusion_model_path:
    vae_path:
    taesd_path:
    esrgan_path:
    controlnet_path:
    embeddings_path:
    stacked_id_embeddings_path:
    input_id_images_path:
    style ratio:       20.00
    normalize input image :  false
    output_path:       output.png
    init_img:
    control_image:
    clip on cpu:       false
    controlnet cpu:    false
    vae decoder on cpu:false
    strength(control): 0.90
    prompt:            a lovely cat
    negative_prompt:
    min_cfg:           1.00
    cfg_scale:         4.50
    guidance:          3.50
    clip_skip:         -1
    width:             1024
    height:            1024
    sample_method:     euler
    schedule:          default
    sample_steps:      20
    strength(img2img): 0.75
    rng:               cuda
    seed:              42
    batch_count:       1
    vae_tiling:        false
    upscale_repeats:   1
System Info:
    BLAS = 1
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
[DEBUG] stable-diffusion.cpp:159  - Using CUDA backend
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce GTX 1060 6GB, compute capability 6.1, VMM: yes
[INFO ] stable-diffusion.cpp:197  - loading model from 'J:\LLM_MODELS\SD\sd3.5_large_turbo-Q5_0.gguf'
[INFO ] model.cpp:801  - load J:\LLM_MODELS\SD\sd3.5_large_turbo-Q5_0.gguf using gguf format
[DEBUG] model.cpp:818  - init from 'J:\LLM_MODELS\SD\sd3.5_large_turbo-Q5_0.gguf'
[INFO ] stable-diffusion.cpp:204  - loading clip_l from 'J:\LLM_MODELS\SD\clip\clip_l.safetensors'
[INFO ] model.cpp:804  - load J:\LLM_MODELS\SD\clip\clip_l.safetensors using safetensors format
[DEBUG] model.cpp:872  - init from 'J:\LLM_MODELS\SD\clip\clip_l.safetensors'
[INFO ] stable-diffusion.cpp:211  - loading clip_g from 'J:\LLM_MODELS\SD\clip\clip_vision_g.safetensors'
[INFO ] model.cpp:804  - load J:\LLM_MODELS\SD\clip\clip_vision_g.safetensors using safetensors format
[DEBUG] model.cpp:872  - init from 'J:\LLM_MODELS\SD\clip\clip_vision_g.safetensors'
[INFO ] stable-diffusion.cpp:218  - loading t5xxl from 'J:\LLM_MODELS\SD\clip\t5-v1_1-xxl-encoder-Q5_K_M.gguf'
[INFO ] model.cpp:801  - load J:\LLM_MODELS\SD\clip\t5-v1_1-xxl-encoder-Q5_K_M.gguf using gguf format
[DEBUG] model.cpp:818  - init from 'J:\LLM_MODELS\SD\clip\t5-v1_1-xxl-encoder-Q5_K_M.gguf'
[INFO ] stable-diffusion.cpp:244  - Version: SD3.5 8B
[INFO ] stable-diffusion.cpp:275  - Weight type:                 q5_0
[INFO ] stable-diffusion.cpp:276  - Conditioner weight type:     f16

stduhpf (Contributor, Author) commented Oct 24, 2024

@razvanab I can confirm, it doesn't work. You'll have to quantize it yourself with sdcpp, or wait for someone else to do it and upload the models to Huggingface.

razvanab commented

I see.
Thanks

razvanab commented Oct 24, 2024

Ok, now I get this error. I should probably wait for someone who knows what they're doing to quantize it.

Error.txt

I quantized t5xxl too, and that got rid of some errors. But I still get a lot of errors for:

clip_g.safetensors

Never mind, my mistake: I hadn't grabbed the correct clip_g.safetensors file.

Sorry about this.

stduhpf (Contributor, Author) commented Oct 25, 2024

@razvanab I'm uploading some here if you want: https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp

razvanab commented

I did the same last night, but I forgot to post it here.
If you want, you can take the t5xxl model from there and post it on your repo.

https://huggingface.co/razvanab/SDCpp

stduhpf (Contributor, Author) commented Oct 25, 2024

@razvanab Btw you can find more compatible t5xxl quants and clip-l quants here: https://huggingface.co/Green-Sky/flux.1-schnell-GGUF/tree/main.

razvanab commented

Oh, nice, a t5xxl q8_0 under 6 GB.
For some reason, my q8_0 ended up being over 6 GB.

Thanks.
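The size difference is easy to sanity-check with back-of-envelope arithmetic (a sketch; the ~4.7B parameter count for the encoder-only T5-XXL is an assumption, and the q8_0 layout of 32 int8 weights plus one f16 scale per block is ggml's):

```python
# Rough expected file size for an encoder-only t5xxl at q8_0.
params = 4.7e9            # encoder-only T5-XXL parameter count (assumption)
bytes_per_block = 32 + 2  # q8_0 block: 32 int8 weights + one f16 scale
weights_per_block = 32

size_gb = params * bytes_per_block / weights_per_block / 1e9
print(f"~{size_gb:.2f} GB")  # ~4.99 GB, comfortably under 6 GB
```

If a q8_0 quant comes out well over that, some tensors were likely kept at a wider type such as f16 (embeddings and norms often are).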
