
Update nodes.py so VAEDecode falls back to a tiled method if GPU runs out of memory #5427

Open · wants to merge 4 commits into master

Conversation

@traugdor

Issue

As the title suggests, if the GPU runs out of VRAM during a VAEDecode operation, the entire queued prompt fails. Given how many enhancements ComfyUI has received over the past months, this failure mode should no longer exist.

Solution

I have introduced a small change to the VAEDecode node so that it falls back to a tiled decode process should the GPU run out of memory.
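
For reference, a minimal sketch of the node-level fallback (the actual diff is in the commits below). It assumes `decode_tiled` with its default tile parameters, which is the existing tiled decoder on ComfyUI's `VAE` class; the broad `RuntimeError` catch is deliberate because DirectML does not raise `torch.cuda.OutOfMemoryError`:

```python
import logging

class VAEDecode:
    # ... INPUT_TYPES / RETURN_TYPES / CATEGORY unchanged ...

    def decode(self, vae, samples):
        try:
            images = vae.decode(samples["samples"])
        except RuntimeError as e:
            # DirectML surfaces OOM as a plain RuntimeError, so a broad
            # catch is needed; retry with the existing tiled decoder.
            logging.warning("VAE decode ran out of memory, retrying tiled: %s", e)
            images = vae.decode_tiled(samples["samples"])
        return (images,)
```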

How to test

Run a sufficiently large image generation on a GPU with 8 GB of VRAM or less. The VAEDecode node will fail due to lack of GPU memory, especially on an AMD GPU that does not support ROCm.

Potential side effects or concerns

The VAEDecode node has changed significantly to include batch support, which is fine; however, the tiled decoding method does not include such support, so there is a possibility that the tiled method may fail here. I haven't been able to find any instance where it fails with simple image generations. In those cases, the user is likely trying to do too much and should instead switch to a device with a GPU that has more VRAM.

patched VAEDecode if fails due to lack of VRAM fallback to tiled decode method
@comfyanonymous (Owner)

There is already logic to handle this here: https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/sd.py#L344

Which OS and which pytorch version are you using?

@traugdor (Author) commented Nov 1, 2024

> There is already logic to handle this here: https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/sd.py#L344
>
> Which OS and which pytorch version are you using?

OS: Windows 10 Pro
PyTorch version:

torch==2.4.1
torch-directml==0.2.5.dev240914
torchsde==0.2.6
torchvision==0.19.1

My patch contains logic similar to what you linked, but the existing handler never works, at least not on my device.

@ltdrdata (Collaborator) commented Nov 1, 2024

In your patch, can you show what kind of exception occurs when the exception handler is removed?

@traugdor (Author) commented Nov 1, 2024

> In your patch, can you show what kind of exception occurs when the exception handler is removed?

Here is the full error report when I remove my patch from the latest available version of ComfyUI.

ComfyUI Error Report

Error Details

  • Node Type: VAEDecode
  • Exception Type: RuntimeError
  • Exception Message: Could not allocate tensor with 2025000000 bytes. There is not enough GPU video memory available!

Stack Trace

  File "D:\Stable Diffusion\ComfyUI\execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "D:\Stable Diffusion\ComfyUI\execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "D:\Stable Diffusion\ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)

  File "D:\Stable Diffusion\ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))

  File "D:\Stable Diffusion\ComfyUI\nodes.py", line 284, in decode
    images = vae.decode(samples["samples"])

  File "D:\Stable Diffusion\ComfyUI\comfy\sd.py", line 340, in decode
    out = self.process_output(self.first_stage_model.decode(samples).to(self.output_device).float())

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\models\autoencoder.py", line 200, in decode
    dec = self.decoder(dec, **decoder_kwargs)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 629, in forward
    h = self.mid.attn_1(h, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 287, in forward
    h_ = self.optimized_attention(q, k, v)

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 206, in normal_attention
    r1 = slice_attention(q, k, v)

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 182, in slice_attention
    s2 = torch.nn.functional.softmax(s1, dim=2).permute(0,2,1)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\functional.py", line 1888, in softmax
    ret = input.softmax(dim)

System Information

  • ComfyUI Version: v0.2.6-6-gcc9cf6d
  • Arguments: main.py --listen --directml
  • OS: nt
  • Python Version: 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
  • Embedded Python: false
  • PyTorch Version: 2.4.1+cpu

Devices

(A PowerColor 6600 XT with 8 GB will be reported incorrectly with DirectML. A1111 has the same issue.)

  • Name: privateuseone
    • Type: privateuseone
    • VRAM Total: 1073741824
    • VRAM Free: 1073741824
    • Torch VRAM Total: 1073741824
    • Torch VRAM Free: 1073741824

Logs

2024-11-01 10:19:24,401 - root - INFO - Using directml with device: 
2024-11-01 10:19:24,409 - root - INFO - Total VRAM 1024 MB, total RAM 32691 MB
2024-11-01 10:19:24,409 - root - INFO - pytorch version: 2.4.1+cpu
2024-11-01 10:19:24,411 - root - INFO - Set vram state to: NORMAL_VRAM
2024-11-01 10:19:24,411 - root - INFO - Device: privateuseone
2024-11-01 10:19:25,159 - root - INFO - Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
2024-11-01 10:19:26,148 - root - INFO - [Prompt Server] web root: D:\Stable Diffusion\ComfyUI\web
2024-11-01 10:19:26,151 - root - INFO - Adding extra search path checkpoints ../stable-diffusion-webui-directml\models/Stable-diffusion
2024-11-01 10:19:26,151 - root - INFO - Adding extra search path configs ../stable-diffusion-webui-directml\models/Stable-diffusion
2024-11-01 10:19:26,151 - root - INFO - Adding extra search path vae ../stable-diffusion-webui-directml\models/VAE
2024-11-01 10:19:26,152 - root - INFO - Adding extra search path loras ../stable-diffusion-webui-directml\models/Lora
2024-11-01 10:19:26,152 - root - INFO - Adding extra search path loras ../stable-diffusion-webui-directml\models/LyCORIS
2024-11-01 10:19:26,153 - root - INFO - Adding extra search path upscale_models ../stable-diffusion-webui-directml\models/ESRGAN
2024-11-01 10:19:26,153 - root - INFO - Adding extra search path upscale_models ../stable-diffusion-webui-directml\models/RealESRGAN
2024-11-01 10:19:26,154 - root - INFO - Adding extra search path upscale_models ../stable-diffusion-webui-directml\models/SwinIR
2024-11-01 10:19:26,154 - root - INFO - Adding extra search path embeddings ../stable-diffusion-webui-directml\embeddings
2024-11-01 10:19:26,155 - root - INFO - Adding extra search path hypernetworks ../stable-diffusion-webui-directml\models/hypernetworks
2024-11-01 10:19:26,155 - root - INFO - Adding extra search path controlnet ../stable-diffusion-webui-directml\models/ControlNet
2024-11-01 10:19:27,753 - root - INFO - Using directml with device: 
2024-11-01 10:19:27,762 - root - INFO - Total VRAM 1024 MB, total RAM 32691 MB
2024-11-01 10:19:27,762 - root - INFO - pytorch version: 2.4.1+cpu
2024-11-01 10:19:27,764 - root - INFO - Set vram state to: NORMAL_VRAM
2024-11-01 10:19:27,764 - root - INFO - Device: privateuseone
2024-11-01 10:19:29,689 - root - INFO - 
Import times for custom nodes:
2024-11-01 10:19:29,689 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\websocket_image_save.py
2024-11-01 10:19:29,689 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-dimension-node-modusCell
2024-11-01 10:19:29,690 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_GradientDeepShrink
2024-11-01 10:19:29,690 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Embedding_Picker
2024-11-01 10:19:29,691 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\Harronode
2024-11-01 10:19:29,691 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\stability-ComfyUI-nodes
2024-11-01 10:19:29,691 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\cg-image-picker
2024-11-01 10:19:29,692 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfyuiLoopbackNodes_v01
2024-11-01 10:19:29,692 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_TiledKSampler
2024-11-01 10:19:29,692 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-stable-wildcards
2024-11-01 10:19:29,693 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\FreeU_Advanced
2024-11-01 10:19:29,693 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfyui-previewlatent
2024-11-01 10:19:29,694 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\cg_custom_core
2024-11-01 10:19:29,694 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Loopchain
2024-11-01 10:19:29,694 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-quadMoons-nodes
2024-11-01 10:19:29,695 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-N-Sidebar
2024-11-01 10:19:29,695 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-OpenPose-Editor
2024-11-01 10:19:29,695 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Custom-Scripts
2024-11-01 10:19:29,696 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfy-image-saver
2024-11-01 10:19:29,696 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyMath
2024-11-01 10:19:29,697 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\Comfy_KepListStuff
2024-11-01 10:19:29,697 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUi_NNLatentUpscale
2024-11-01 10:19:29,697 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_UltimateSDUpscale
2024-11-01 10:19:29,698 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\WAS_Extras
2024-11-01 10:19:29,698 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_NestedNodeBuilder
2024-11-01 10:19:29,699 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\rgthree-comfy
2024-11-01 10:19:29,699 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\efficiency-nodes-comfyui
2024-11-01 10:19:29,699 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfyui-dynamicprompts
2024-11-01 10:19:29,700 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus
2024-11-01 10:19:29,700 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfyui_controlnet_aux
2024-11-01 10:19:29,701 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\facerestore_cf
2024-11-01 10:19:29,701 - root - INFO -    0.1 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Inspire-Pack
2024-11-01 10:19:29,701 - root - INFO -    0.1 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_node_Lilly
2024-11-01 10:19:29,702 - root - INFO -    0.2 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_smZNodes
2024-11-01 10:19:29,702 - root - INFO -    0.3 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\facedetailer
2024-11-01 10:19:29,703 - root - INFO -    0.4 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Manager
2024-11-01 10:19:29,705 - root - INFO -    0.4 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Impact-Pack
2024-11-01 10:19:29,706 - root - INFO -    1.3 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\was-node-suite-comfyui
2024-11-01 10:19:29,706 - root - INFO - 
2024-11-01 10:19:29,719 - root - INFO - Starting server

2024-11-01 10:19:29,720 - root - INFO - To see the GUI go to: http://0.0.0.0:8188
2024-11-01 10:19:29,720 - root - INFO - To see the GUI go to: http://[::]:8188
2024-11-01 10:19:40,323 - root - INFO - got prompt
2024-11-01 10:19:40,413 - root - INFO - model weight dtype torch.float32, manual cast: None
2024-11-01 10:19:40,414 - root - INFO - model_type EPS
2024-11-01 10:19:41,152 - root - INFO - Using split attention in VAE
2024-11-01 10:19:41,153 - root - INFO - Using split attention in VAE
2024-11-01 10:19:41,463 - root - INFO - Requested to load SD1ClipModel
2024-11-01 10:19:41,464 - root - INFO - Loading 1 new model
2024-11-01 10:19:41,469 - root - INFO - loaded completely 0.0 235.84423828125 True
2024-11-01 10:19:42,010 - root - INFO - Requested to load BaseModel
2024-11-01 10:19:42,010 - root - INFO - Loading 1 new model
2024-11-01 10:19:45,723 - root - INFO - loaded completely 0.0 3278.812271118164 True
2024-11-01 10:20:07,643 - root - INFO - Requested to load AutoencoderKL
2024-11-01 10:20:07,643 - root - INFO - Loading 1 new model
2024-11-01 10:20:09,624 - root - INFO - loaded completely 0.0 319.11416244506836 True
2024-11-01 10:20:10,862 - root - ERROR - !!! Exception during processing !!! Could not allocate tensor with 2025000000 bytes. There is not enough GPU video memory available!
2024-11-01 10:20:10,863 - root - ERROR - Traceback (most recent call last):
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "D:\Stable Diffusion\ComfyUI\nodes.py", line 284, in decode
    images = vae.decode(samples["samples"])
  File "D:\Stable Diffusion\ComfyUI\comfy\sd.py", line 340, in decode
    out = self.process_output(self.first_stage_model.decode(samples).to(self.output_device).float())
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\models\autoencoder.py", line 200, in decode
    dec = self.decoder(dec, **decoder_kwargs)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 629, in forward
    h = self.mid.attn_1(h, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 287, in forward
    h_ = self.optimized_attention(q, k, v)
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 206, in normal_attention
    r1 = slice_attention(q, k, v)
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 182, in slice_attention
    s2 = torch.nn.functional.softmax(s1, dim=2).permute(0,2,1)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\functional.py", line 1888, in softmax
    ret = input.softmax(dim)
RuntimeError: Could not allocate tensor with 2025000000 bytes. There is not enough GPU video memory available!

2024-11-01 10:20:10,865 - root - INFO - Prompt executed in 30.54 seconds

Attached Workflow

Please make sure that workflow does not contain any sensitive information such as API keys or passwords.

{"last_node_id":9,"last_link_id":9,"nodes":[{"id":7,"type":"CLIPTextEncode","pos":{"0":413,"1":389},"size":{"0":425.27801513671875,"1":180.6060791015625},"flags":{},"order":3,"mode":0,"inputs":[{"name":"clip","type":"CLIP","link":5}],"outputs":[{"name":"CONDITIONING","type":"CONDITIONING","links":[6],"slot_index":0}],"properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":["text, watermark"]},{"id":6,"type":"CLIPTextEncode","pos":{"0":415,"1":186},"size":{"0":422.84503173828125,"1":164.31304931640625},"flags":{},"order":2,"mode":0,"inputs":[{"name":"clip","type":"CLIP","link":3}],"outputs":[{"name":"CONDITIONING","type":"CONDITIONING","links":[4],"slot_index":0}],"properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":["beautiful scenery nature glass bottle landscape, , purple galaxy bottle,"]},{"id":8,"type":"VAEDecode","pos":{"0":1209,"1":188},"size":{"0":210,"1":46},"flags":{},"order":5,"mode":0,"inputs":[{"name":"samples","type":"LATENT","link":7},{"name":"vae","type":"VAE","link":8}],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[9],"slot_index":0}],"properties":{"Node name for S&R":"VAEDecode"},"widgets_values":[]},{"id":9,"type":"SaveImage","pos":{"0":1451,"1":189},"size":{"0":210,"1":58},"flags":{},"order":6,"mode":0,"inputs":[{"name":"images","type":"IMAGE","link":9}],"outputs":[],"properties":{},"widgets_values":["ComfyUI"]},{"id":4,"type":"CheckpointLoaderSimple","pos":{"0":26,"1":474},"size":{"0":315,"1":98},"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[{"name":"MODEL","type":"MODEL","links":[1],"slot_index":0},{"name":"CLIP","type":"CLIP","links":[3,5],"slot_index":1},{"name":"VAE","type":"VAE","links":[8],"slot_index":2}],"properties":{"Node name for S&R":"CheckpointLoaderSimple"},"widgets_values":["moonmixHolidayMad_v05-fp16-no-ema.safetensors"]},{"id":3,"type":"KSampler","pos":{"0":863,"1":186},"size":[320,470],"flags":{},"order":4,"mode":0,"inputs":[{"name":"model","type":"MODEL","link":1},{"name":"positive","type":"CONDITIONING","link":4},{"name":"negative","type":"CONDITIONING","link":6},{"name":"latent_image","type":"LATENT","link":2}],"outputs":[{"name":"LATENT","type":"LATENT","links":[7],"slot_index":0}],"properties":{"Node name for S&R":"KSampler"},"widgets_values":[1024848246727402,"randomize",2,8,"euler","normal",1]},{"id":5,"type":"EmptyLatentImage","pos":{"0":473,"1":609},"size":{"0":315,"1":106},"flags":{},"order":1,"mode":0,"inputs":[],"outputs":[{"name":"LATENT","type":"LATENT","links":[2],"slot_index":0}],"properties":{"Node name for S&R":"EmptyLatentImage"},"widgets_values":[1200,1200,1]}],"links":[[1,4,0,3,0,"MODEL"],[2,5,0,3,3,"LATENT"],[3,4,1,6,0,"CLIP"],[4,6,0,3,1,"CONDITIONING"],[5,4,1,7,0,"CLIP"],[6,7,0,3,2,"CONDITIONING"],[7,3,0,8,0,"LATENT"],[8,4,2,8,1,"VAE"],[9,8,0,9,0,"IMAGE"]],"groups":[],"config":{},"extra":{"ds":{"scale":1,"offset":[50,-16]}},"version":0.4}

Additional Context

none


As you can see, it's a significant error when an AMD GPU runs out of memory. This error is specific to AMD GPUs that do not support ROCm; I have no way of testing whether ROCm-enabled devices handle or report out-of-memory errors differently. I hope this is enough information to support the need for this change. The exception is a RuntimeError rather than the OOM exception from model_management.py, because torch was not compiled with CUDA enabled, and it cannot be for AMD devices without ROCm.

See the definition of OOM_EXCEPTION in model_management.py for more information:

OOM_EXCEPTION = torch.cuda.OutOfMemoryError
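
Because torch-directml ships a CPU build of PyTorch, this exception type never matches on such devices; allocation failures surface as plain RuntimeErrors instead. A hedged sketch of a broader check (the helper name and the message substring are illustrative, not ComfyUI API):

```python
import torch

def looks_like_oom(exc: Exception) -> bool:
    """Illustrative helper: also treat DirectML's plain RuntimeError as OOM."""
    if isinstance(exc, torch.cuda.OutOfMemoryError):
        return True  # CUDA/ROCm builds raise this dedicated subclass
    # DirectML reports e.g. "Could not allocate tensor with N bytes.
    # There is not enough GPU video memory available!"
    return isinstance(exc, RuntimeError) and "GPU video memory" in str(exc)
```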

@ltdrdata (Collaborator) commented Nov 1, 2024

> There is already logic to handle this here: https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/sd.py#L344
>
> Which OS and which pytorch version are you using?

The original code only catches torch.cuda.OutOfMemoryError, but it looks like we need to enhance it to catch RuntimeError as well.

revert change to VAEDecode node.
Move catching of RuntimeError and MemoryError to sd.py
Remove unnecessary parameter to decode method
@traugdor (Author) commented Nov 1, 2024

> > There is already logic to handle this here: https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/sd.py#L344
> > Which OS and which pytorch version are you using?
>
> The original code only catches torch.cuda.OutOfMemoryError, but it looks like we need to enhance it to catch RuntimeError as well.

That is what my PR does: it catches RuntimeError as well as any other MemoryError in the decoding process itself. I have modified the code to reflect this, moving the handling out of the node and into sd.py itself. I tested with image sizes of 1200x1200 on my GPU, and it worked perfectly with this addition.
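
A simplified sketch of what the sd.py-level fallback looks like once broadened (assumptions: `decode_tiled_` is the internal tiled decoder already present in sd.py, `logging` is already imported there, and the real method also computes memory requirements and decodes in batches):

```python
# comfy/sd.py -- simplified sketch of VAE.decode with a broadened fallback
def decode(self, samples_in):
    try:
        samples = samples_in.to(self.device).to(self.vae_dtype)
        out = self.process_output(
            self.first_stage_model.decode(samples).to(self.output_device).float())
    except (RuntimeError, MemoryError) as e:
        # torch.cuda.OutOfMemoryError subclasses RuntimeError, so the
        # original CUDA-only path is still covered by this broader catch.
        logging.warning("Regular VAE decoding ran out of memory, "
                        "retrying with tiled VAE decoding: %s", e)
        out = self.decode_tiled_(samples_in)
    return out
```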

See screenshots: [attached images: workflow(5), ComfyUI_temp_ognju_00001_, image]

@traugdor
Copy link
Author

traugdor commented Nov 1, 2024

I believe that, at this point, my list of potential side effects or concerns has been reduced to none.

@huchenlei added the AMD label ("Issue related to AMD driver support.") on Dec 16, 2024
@traugdor (Author) commented Jan 6, 2025

Just checking in to see why this hasn't been merged yet.
