
RuntimeError: CUDA out of memory #113

Open
pedrogcmartin opened this issue Feb 7, 2023 · 3 comments

@pedrogcmartin
When I try to train a Tanks and Temples scene, my GPU runs out of memory, even with a batch size of 1. Has anyone had this problem, or does anyone know how to work around it? I am using a GPU with 8192 MiB of memory (GeForce RTX 3060 Ti Lite Hash Rate).

Defaulting to extended NSVF dataset
LOAD NSVF DATA data/TanksAndTempleBG/Truck split train
100%|█████████████████████████████████████████| 226/226 [00:02<00:00, 88.77it/s]
NORMALIZE BY? camera
scene_scale 1.4712800415356706
 intrinsics (loaded reso) Intrin(fx=581.7877197265625, fy=581.7877197265625, cx=490.25, cy=272.75)
 Generating rays, scaling factor 1
/home/pedro/anaconda3/envs/plenoxel/lib/python3.8/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Defaulting to extended NSVF dataset
LOAD NSVF DATA data/TanksAndTempleBG/Truck split test
100%|███████████████████████████████████████████| 25/25 [00:00<00:00, 90.10it/s]
NORMALIZE BY? camera
scene_scale 1.4712800415356706
 intrinsics (loaded reso) Intrin(fx=581.7877197265625, fy=581.7877197265625, cx=490.25, cy=272.75)
Render options RenderOptions(backend='cuvol', background_brightness=1.0, step_size=0.5, sigma_thresh=1e-08, stop_thresh=1e-07, last_sample_opaque=False, near_clip=0.0, use_spheric_clip=False, random_sigma_std=0.0, random_sigma_std_background=0.0)
 Selecting random rays
Eval step
100%|█████████████████████████████████████████████| 5/5 [00:01<00:00,  3.03it/s]
eval stats: {'psnr': 6.178773452430829, 'mse': 0.24226271510124206}
Train step
epoch 0 psnr=19.43: 100%|████████████████| 12800/12800 [00:35<00:00, 361.73it/s]
 Selecting random rays
Eval step
100%|███████████████████████████████████████████| 25/25 [00:06<00:00,  3.64it/s]
eval stats: {'psnr': 13.061549072084956, 'mse': 0.049624104797840116}
Train step
epoch 1 psnr=19.68: 100%|████████████████| 12800/12800 [00:32<00:00, 394.01it/s]
 Selecting random rays
Eval step
100%|███████████████████████████████████████████| 25/25 [00:06<00:00,  3.82it/s]
eval stats: {'psnr': 13.122755637793855, 'mse': 0.04888920813798905}
Train step
epoch 2 psnr=16.00: 100%|████████████████| 12800/12800 [00:31<00:00, 403.39it/s]
* Upsampling from [256, 256, 256] to [512, 512, 512]
turning off TV regularization
Pass 1/2 (density)
100%|███████████████████████████████████████| 187/187 [00:00<00:00, 9483.18it/s]
 Grid weight render torch.Size([512, 512, 512])
Pass 2/2 (color), eval 27227553 sparse pts
100%|██████████████████████████████████████████| 38/38 [00:00<00:00, 877.03it/s]
Traceback (most recent call last):
  File "opt.py", line 631, in <module>
    grid.resample(reso=reso_list[reso_id],
  File "/media/pedro/5e563db5-5ede-4fd6-8484-570f2f48099a/models/svox2/svox2/svox2.py", line 1394, in resample
    sample_vals_sh = torch.cat(all_sample_vals_sh, dim=0) if len(all_sample_vals_sh) else torch.empty_like(self.sh_data[:0])
RuntimeError: CUDA out of memory. Tried to allocate 2.74 GiB (GPU 0; 7.79 GiB total capacity; 4.27 GiB already allocated; 386.88 MiB free; 5.96 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
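As a sanity check, the failed allocation size matches the sparse-point count printed just before the crash: concatenating the SH values for all 27,227,553 points into one float32 tensor needs about 2.74 GiB, assuming the default 9 SH basis functions per color channel (27 coefficients per point). A minimal check:

```python
# Hedged sanity check: the 2.74 GiB allocation is consistent with one
# float32 tensor of SH coefficients for every sparse point in the log.
sparse_pts = 27_227_553   # "eval 27227553 sparse pts" from the log above
sh_coeffs = 27            # assumed: 9 SH basis functions x 3 color channels
bytes_per_f32 = 4

gib = sparse_pts * sh_coeffs * bytes_per_f32 / 2**30
print(f"{gib:.2f} GiB")   # -> 2.74 GiB
```

Since only ~386 MiB is free at that point, the `max_split_size_mb` hint in the error message is unlikely to help: it mitigates fragmentation, not a single allocation larger than the remaining memory.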
@pedrogcmartin (Author)

I managed to solve it by using a grid resolution of [256, 256, 256] without any upsampling. Now the problem is the rendered video quality, which is very bad (PSNR of ~13).
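For anyone hitting the same limit, the resolution schedule can be capped in the config rather than in code. A minimal sketch, assuming the schedule lives under a `reso` key stored as a string, as in the shipped configs (the output filename is made up):

```python
# Hedged sketch: write a copy of the config whose resolution schedule stops
# at 256^3, so opt.py never attempts the 512^3 upsample.
import json

with open("configs/tnt.json") as f:
    cfg = json.load(f)

cfg["reso"] = "[[256, 256, 256]]"  # assumed key/format; single-stage schedule

with open("configs/tnt_lowmem.json", "w") as f:
    json.dump(cfg, f, indent=2)
```

This only addresses the memory side; the low PSNR turned out to have a separate cause (see the reply below).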

@pedrogcmartin (Author)

test_renders.mp4

@sarafridov (Collaborator)

Maybe you are using a config that is intended for bounded scenes? From the video it looks like the model is trying to squeeze everything into the foreground, which it shouldn't have to do if you allow a background. Try configs/tnt.json if you haven't already.
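For reference, assuming the `launch.sh` wrapper from the repo README, that would be invoked roughly as `./launch.sh truck 0 data/TanksAndTempleBG/Truck -c configs/tnt.json`, where the experiment name (`truck`) and GPU id (`0`) are placeholders.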
