
Update nodes.py so VAEDecode falls back to a tiled method if GPU runs out of memory #5427

Open · wants to merge 4 commits into master

Conversation

@traugdor

Issue

As the title suggests, if the GPU runs out of VRAM during a VAEDecode operation, the entire queued prompt fails. Given how many enhancements ComfyUI has received over the past months, this failure mode should no longer exist.

Solution

I have introduced a small change to the VAEDecode node so that it falls back to a tiled decode process should the GPU run out of memory.
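
For reference, a minimal sketch of the node-level fallback (the actual diff is in the commits below). It assumes `decode_tiled` with its default tile parameters, which is the existing tiled decoder on ComfyUI's `VAE` class; the broad `RuntimeError` catch is deliberate because DirectML does not raise `torch.cuda.OutOfMemoryError`:

```python
import logging

class VAEDecode:
    # ... INPUT_TYPES / RETURN_TYPES / CATEGORY unchanged ...

    def decode(self, vae, samples):
        try:
            images = vae.decode(samples["samples"])
        except RuntimeError as e:
            # DirectML surfaces OOM as a plain RuntimeError, so a broad
            # catch is needed; retry with the existing tiled decoder.
            logging.warning("VAE decode ran out of memory, retrying tiled: %s", e)
            images = vae.decode_tiled(samples["samples"])
        return (images,)
```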

How to test

Run a sufficiently large image generation on a GPU with 8 GB of VRAM or less. The VAEDecode node will fail due to lack of GPU memory, especially on an AMD GPU that does not support ROCm.

Potential side effects or concerns

The VAEDecode node has changed significantly to include batch support, which is fine; however, the tiled decoding method does not include such support, so there is a possibility that the tiled method may fail here. I haven't been able to find any instance where it fails with simple image generations. In those cases, the user is likely trying to do too much and should instead switch to a device with a GPU that has more VRAM.

patched VAEDecode if fails due to lack of VRAM fallback to tiled decode method
@comfyanonymous (Owner)

There is already logic to handle this here: https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/sd.py#L344

Which OS and which pytorch version are you using?

@traugdor (Author) commented Nov 1, 2024

> There is already logic to handle this here: https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/sd.py#L344
>
> Which OS and which pytorch version are you using?

OS: Windows 10 Pro
PyTorch version:

torch==2.4.1
torch-directml==0.2.5.dev240914
torchsde==0.2.6
torchvision==0.19.1

My patch contains logic similar to what you linked, but the existing handler never works, at least not on my device.

@ltdrdata (Collaborator) commented Nov 1, 2024

In your patch, can you show what kind of exception occurs when the exception handler is removed?

@traugdor (Author) commented Nov 1, 2024

> In your patch, can you show what kind of exception occurs when the exception handler is removed?

Here is the full error report when I remove my patch from the latest available version of ComfyUI.

ComfyUI Error Report

Error Details

  • Node Type: VAEDecode
  • Exception Type: RuntimeError
  • Exception Message: Could not allocate tensor with 2025000000 bytes. There is not enough GPU video memory available!

Stack Trace

  File "D:\Stable Diffusion\ComfyUI\execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "D:\Stable Diffusion\ComfyUI\execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "D:\Stable Diffusion\ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)

  File "D:\Stable Diffusion\ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))

  File "D:\Stable Diffusion\ComfyUI\nodes.py", line 284, in decode
    images = vae.decode(samples["samples"])

  File "D:\Stable Diffusion\ComfyUI\comfy\sd.py", line 340, in decode
    out = self.process_output(self.first_stage_model.decode(samples).to(self.output_device).float())

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\models\autoencoder.py", line 200, in decode
    dec = self.decoder(dec, **decoder_kwargs)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 629, in forward
    h = self.mid.attn_1(h, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 287, in forward
    h_ = self.optimized_attention(q, k, v)

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 206, in normal_attention
    r1 = slice_attention(q, k, v)

  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 182, in slice_attention
    s2 = torch.nn.functional.softmax(s1, dim=2).permute(0,2,1)

  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\functional.py", line 1888, in softmax
    ret = input.softmax(dim)

System Information

  • ComfyUI Version: v0.2.6-6-gcc9cf6d
  • Arguments: main.py --listen --directml
  • OS: nt
  • Python Version: 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
  • Embedded Python: false
  • PyTorch Version: 2.4.1+cpu

Devices

(A PowerColor 6600 XT with 8 GB will be reported incorrectly with DirectML. A1111 has the same issue.)

  • Name: privateuseone
    • Type: privateuseone
    • VRAM Total: 1073741824
    • VRAM Free: 1073741824
    • Torch VRAM Total: 1073741824
    • Torch VRAM Free: 1073741824

Logs

2024-11-01 10:19:24,401 - root - INFO - Using directml with device: 
2024-11-01 10:19:24,409 - root - INFO - Total VRAM 1024 MB, total RAM 32691 MB
2024-11-01 10:19:24,409 - root - INFO - pytorch version: 2.4.1+cpu
2024-11-01 10:19:24,411 - root - INFO - Set vram state to: NORMAL_VRAM
2024-11-01 10:19:24,411 - root - INFO - Device: privateuseone
2024-11-01 10:19:25,159 - root - INFO - Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
2024-11-01 10:19:26,148 - root - INFO - [Prompt Server] web root: D:\Stable Diffusion\ComfyUI\web
2024-11-01 10:19:26,151 - root - INFO - Adding extra search path checkpoints ../stable-diffusion-webui-directml\models/Stable-diffusion
2024-11-01 10:19:26,151 - root - INFO - Adding extra search path configs ../stable-diffusion-webui-directml\models/Stable-diffusion
2024-11-01 10:19:26,151 - root - INFO - Adding extra search path vae ../stable-diffusion-webui-directml\models/VAE
2024-11-01 10:19:26,152 - root - INFO - Adding extra search path loras ../stable-diffusion-webui-directml\models/Lora
2024-11-01 10:19:26,152 - root - INFO - Adding extra search path loras ../stable-diffusion-webui-directml\models/LyCORIS
2024-11-01 10:19:26,153 - root - INFO - Adding extra search path upscale_models ../stable-diffusion-webui-directml\models/ESRGAN
2024-11-01 10:19:26,153 - root - INFO - Adding extra search path upscale_models ../stable-diffusion-webui-directml\models/RealESRGAN
2024-11-01 10:19:26,154 - root - INFO - Adding extra search path upscale_models ../stable-diffusion-webui-directml\models/SwinIR
2024-11-01 10:19:26,154 - root - INFO - Adding extra search path embeddings ../stable-diffusion-webui-directml\embeddings
2024-11-01 10:19:26,155 - root - INFO - Adding extra search path hypernetworks ../stable-diffusion-webui-directml\models/hypernetworks
2024-11-01 10:19:26,155 - root - INFO - Adding extra search path controlnet ../stable-diffusion-webui-directml\models/ControlNet
2024-11-01 10:19:27,753 - root - INFO - Using directml with device: 
2024-11-01 10:19:27,762 - root - INFO - Total VRAM 1024 MB, total RAM 32691 MB
2024-11-01 10:19:27,762 - root - INFO - pytorch version: 2.4.1+cpu
2024-11-01 10:19:27,764 - root - INFO - Set vram state to: NORMAL_VRAM
2024-11-01 10:19:27,764 - root - INFO - Device: privateuseone
2024-11-01 10:19:29,689 - root - INFO - 
Import times for custom nodes:
2024-11-01 10:19:29,689 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\websocket_image_save.py
2024-11-01 10:19:29,689 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-dimension-node-modusCell
2024-11-01 10:19:29,690 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_GradientDeepShrink
2024-11-01 10:19:29,690 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Embedding_Picker
2024-11-01 10:19:29,691 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\Harronode
2024-11-01 10:19:29,691 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\stability-ComfyUI-nodes
2024-11-01 10:19:29,691 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\cg-image-picker
2024-11-01 10:19:29,692 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfyuiLoopbackNodes_v01
2024-11-01 10:19:29,692 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_TiledKSampler
2024-11-01 10:19:29,692 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-stable-wildcards
2024-11-01 10:19:29,693 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\FreeU_Advanced
2024-11-01 10:19:29,693 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfyui-previewlatent
2024-11-01 10:19:29,694 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\cg_custom_core
2024-11-01 10:19:29,694 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Loopchain
2024-11-01 10:19:29,694 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-quadMoons-nodes
2024-11-01 10:19:29,695 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-N-Sidebar
2024-11-01 10:19:29,695 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-OpenPose-Editor
2024-11-01 10:19:29,695 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Custom-Scripts
2024-11-01 10:19:29,696 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfy-image-saver
2024-11-01 10:19:29,696 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyMath
2024-11-01 10:19:29,697 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\Comfy_KepListStuff
2024-11-01 10:19:29,697 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUi_NNLatentUpscale
2024-11-01 10:19:29,697 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_UltimateSDUpscale
2024-11-01 10:19:29,698 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\WAS_Extras
2024-11-01 10:19:29,698 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_NestedNodeBuilder
2024-11-01 10:19:29,699 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\rgthree-comfy
2024-11-01 10:19:29,699 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\efficiency-nodes-comfyui
2024-11-01 10:19:29,699 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfyui-dynamicprompts
2024-11-01 10:19:29,700 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus
2024-11-01 10:19:29,700 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\comfyui_controlnet_aux
2024-11-01 10:19:29,701 - root - INFO -    0.0 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\facerestore_cf
2024-11-01 10:19:29,701 - root - INFO -    0.1 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Inspire-Pack
2024-11-01 10:19:29,701 - root - INFO -    0.1 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_node_Lilly
2024-11-01 10:19:29,702 - root - INFO -    0.2 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI_smZNodes
2024-11-01 10:19:29,702 - root - INFO -    0.3 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\facedetailer
2024-11-01 10:19:29,703 - root - INFO -    0.4 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Manager
2024-11-01 10:19:29,705 - root - INFO -    0.4 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\ComfyUI-Impact-Pack
2024-11-01 10:19:29,706 - root - INFO -    1.3 seconds: D:\Stable Diffusion\ComfyUI\custom_nodes\was-node-suite-comfyui
2024-11-01 10:19:29,706 - root - INFO - 
2024-11-01 10:19:29,719 - root - INFO - Starting server

2024-11-01 10:19:29,720 - root - INFO - To see the GUI go to: http://0.0.0.0:8188
2024-11-01 10:19:29,720 - root - INFO - To see the GUI go to: http://[::]:8188
2024-11-01 10:19:40,323 - root - INFO - got prompt
2024-11-01 10:19:40,413 - root - INFO - model weight dtype torch.float32, manual cast: None
2024-11-01 10:19:40,414 - root - INFO - model_type EPS
2024-11-01 10:19:41,152 - root - INFO - Using split attention in VAE
2024-11-01 10:19:41,153 - root - INFO - Using split attention in VAE
2024-11-01 10:19:41,463 - root - INFO - Requested to load SD1ClipModel
2024-11-01 10:19:41,464 - root - INFO - Loading 1 new model
2024-11-01 10:19:41,469 - root - INFO - loaded completely 0.0 235.84423828125 True
2024-11-01 10:19:42,010 - root - INFO - Requested to load BaseModel
2024-11-01 10:19:42,010 - root - INFO - Loading 1 new model
2024-11-01 10:19:45,723 - root - INFO - loaded completely 0.0 3278.812271118164 True
2024-11-01 10:20:07,643 - root - INFO - Requested to load AutoencoderKL
2024-11-01 10:20:07,643 - root - INFO - Loading 1 new model
2024-11-01 10:20:09,624 - root - INFO - loaded completely 0.0 319.11416244506836 True
2024-11-01 10:20:10,862 - root - ERROR - !!! Exception during processing !!! Could not allocate tensor with 2025000000 bytes. There is not enough GPU video memory available!
2024-11-01 10:20:10,863 - root - ERROR - Traceback (most recent call last):
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "D:\Stable Diffusion\ComfyUI\execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "D:\Stable Diffusion\ComfyUI\nodes.py", line 284, in decode
    images = vae.decode(samples["samples"])
  File "D:\Stable Diffusion\ComfyUI\comfy\sd.py", line 340, in decode
    out = self.process_output(self.first_stage_model.decode(samples).to(self.output_device).float())
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\models\autoencoder.py", line 200, in decode
    dec = self.decoder(dec, **decoder_kwargs)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 629, in forward
    h = self.mid.attn_1(h, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\modules\module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 287, in forward
    h_ = self.optimized_attention(q, k, v)
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 206, in normal_attention
    r1 = slice_attention(q, k, v)
  File "D:\Stable Diffusion\ComfyUI\comfy\ldm\modules\diffusionmodules\model.py", line 182, in slice_attention
    s2 = torch.nn.functional.softmax(s1, dim=2).permute(0,2,1)
  File "D:\Stable Diffusion\ComfyUI\venv\lib\site-packages\torch\nn\functional.py", line 1888, in softmax
    ret = input.softmax(dim)
RuntimeError: Could not allocate tensor with 2025000000 bytes. There is not enough GPU video memory available!

2024-11-01 10:20:10,865 - root - INFO - Prompt executed in 30.54 seconds

Attached Workflow

Please make sure that workflow does not contain any sensitive information such as API keys or passwords.

{"last_node_id":9,"last_link_id":9,"nodes":[{"id":7,"type":"CLIPTextEncode","pos":{"0":413,"1":389},"size":{"0":425.27801513671875,"1":180.6060791015625},"flags":{},"order":3,"mode":0,"inputs":[{"name":"clip","type":"CLIP","link":5}],"outputs":[{"name":"CONDITIONING","type":"CONDITIONING","links":[6],"slot_index":0}],"properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":["text, watermark"]},{"id":6,"type":"CLIPTextEncode","pos":{"0":415,"1":186},"size":{"0":422.84503173828125,"1":164.31304931640625},"flags":{},"order":2,"mode":0,"inputs":[{"name":"clip","type":"CLIP","link":3}],"outputs":[{"name":"CONDITIONING","type":"CONDITIONING","links":[4],"slot_index":0}],"properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":["beautiful scenery nature glass bottle landscape, , purple galaxy bottle,"]},{"id":8,"type":"VAEDecode","pos":{"0":1209,"1":188},"size":{"0":210,"1":46},"flags":{},"order":5,"mode":0,"inputs":[{"name":"samples","type":"LATENT","link":7},{"name":"vae","type":"VAE","link":8}],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[9],"slot_index":0}],"properties":{"Node name for S&R":"VAEDecode"},"widgets_values":[]},{"id":9,"type":"SaveImage","pos":{"0":1451,"1":189},"size":{"0":210,"1":58},"flags":{},"order":6,"mode":0,"inputs":[{"name":"images","type":"IMAGE","link":9}],"outputs":[],"properties":{},"widgets_values":["ComfyUI"]},{"id":4,"type":"CheckpointLoaderSimple","pos":{"0":26,"1":474},"size":{"0":315,"1":98},"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[{"name":"MODEL","type":"MODEL","links":[1],"slot_index":0},{"name":"CLIP","type":"CLIP","links":[3,5],"slot_index":1},{"name":"VAE","type":"VAE","links":[8],"slot_index":2}],"properties":{"Node name for S&R":"CheckpointLoaderSimple"},"widgets_values":["moonmixHolidayMad_v05-fp16-no-ema.safetensors"]},{"id":3,"type":"KSampler","pos":{"0":863,"1":186},"size":[320,470],"flags":{},"order":4,"mode":0,"inputs":[{"name":"model","type":"MODEL","link":1},{"name":"positive","type":"CONDITIONING","link":4},{"name":"negative","type":"CONDITIONING","link":6},{"name":"latent_image","type":"LATENT","link":2}],"outputs":[{"name":"LATENT","type":"LATENT","links":[7],"slot_index":0}],"properties":{"Node name for S&R":"KSampler"},"widgets_values":[1024848246727402,"randomize",2,8,"euler","normal",1]},{"id":5,"type":"EmptyLatentImage","pos":{"0":473,"1":609},"size":{"0":315,"1":106},"flags":{},"order":1,"mode":0,"inputs":[],"outputs":[{"name":"LATENT","type":"LATENT","links":[2],"slot_index":0}],"properties":{"Node name for S&R":"EmptyLatentImage"},"widgets_values":[1200,1200,1]}],"links":[[1,4,0,3,0,"MODEL"],[2,5,0,3,3,"LATENT"],[3,4,1,6,0,"CLIP"],[4,6,0,3,1,"CONDITIONING"],[5,4,1,7,0,"CLIP"],[6,7,0,3,2,"CONDITIONING"],[7,3,0,8,0,"LATENT"],[8,4,2,8,1,"VAE"],[9,8,0,9,0,"IMAGE"]],"groups":[],"config":{},"extra":{"ds":{"scale":1,"offset":[50,-16]}},"version":0.4}

Additional Context

none


As you can see, it's a significant error when an AMD GPU runs out of memory. This error is specific to AMD GPUs that do not support ROCm; I have no way of testing whether ROCm-enabled devices handle or report out-of-memory errors differently. I hope this is enough information to support the need for this change. The exception is a RuntimeError rather than the OOM exception from model_management.py, because torch was not compiled with CUDA enabled, and it cannot be for AMD devices without ROCm.

See the definition of OOM_EXCEPTION in model_management.py for more information:

OOM_EXCEPTION = torch.cuda.OutOfMemoryError
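
Because torch-directml ships a CPU build of PyTorch, this exception type never matches on such devices; allocation failures surface as plain RuntimeErrors instead. A hedged sketch of a broader check (the helper name and the message substring are illustrative, not ComfyUI API):

```python
import torch

def looks_like_oom(exc: Exception) -> bool:
    """Illustrative helper: also treat DirectML's plain RuntimeError as OOM."""
    if isinstance(exc, torch.cuda.OutOfMemoryError):
        return True  # CUDA/ROCm builds raise this dedicated subclass
    # DirectML reports e.g. "Could not allocate tensor with N bytes.
    # There is not enough GPU video memory available!"
    return isinstance(exc, RuntimeError) and "GPU video memory" in str(exc)
```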

@ltdrdata (Collaborator) commented Nov 1, 2024

> There is already logic to handle this here: https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/sd.py#L344
>
> Which OS and which pytorch version are you using?

The original code only catches torch.cuda.OutOfMemoryError, but it looks like we need to enhance it to catch RuntimeError as well.

revert change to VAEDecode node.
Move catching of RuntimeError and MemoryError to sd.py
Remove unnecessary parameter to decode method
@traugdor (Author) commented Nov 1, 2024

> > There is already logic to handle this here: https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/sd.py#L344
> > Which OS and which pytorch version are you using?
>
> The original code only catches torch.cuda.OutOfMemoryError, but it looks like we need to enhance it to catch RuntimeError as well.

That is what my PR does: it catches RuntimeError as well as any other MemoryError in the decoding process itself. I have modified the code to reflect this, moving the handling out of the node and into sd.py itself. I tested with image sizes of 1200x1200 on my GPU, and it worked perfectly with this addition.
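
A simplified sketch of what the sd.py-level fallback looks like once broadened (assumptions: `decode_tiled_` is the internal tiled decoder already present in sd.py, `logging` is already imported there, and the real method also computes memory requirements and decodes in batches):

```python
# comfy/sd.py -- simplified sketch of VAE.decode with a broadened fallback
def decode(self, samples_in):
    try:
        samples = samples_in.to(self.device).to(self.vae_dtype)
        out = self.process_output(
            self.first_stage_model.decode(samples).to(self.output_device).float())
    except (RuntimeError, MemoryError) as e:
        # torch.cuda.OutOfMemoryError subclasses RuntimeError, so the
        # original CUDA-only path is still covered by this broader catch.
        logging.warning("Regular VAE decoding ran out of memory, "
                        "retrying with tiled VAE decoding: %s", e)
        out = self.decode_tiled_(samples_in)
    return out
```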

See screenshots: [attached images: workflow(5), ComfyUI_temp_ognju_00001_, image]

@traugdor
Copy link
Author

traugdor commented Nov 1, 2024

I believe that, at this point, my list of potential side effects or concerns has been reduced to none.

@huchenlei added the AMD label ("Issue related to AMD driver support.") on Dec 16, 2024
@traugdor (Author) commented Jan 6, 2025

Just checking in to see why this hasn't been merged yet.
