
Error in inference - not enough values to unpack (expected 2, got 0) #115

Open
FlorinM25 opened this issue Nov 20, 2023 · 11 comments

@FlorinM25

Hello,
Firstly, thank you very much for this amazing project!

When I run the demos with the commands presented in the README file, I always get an error at this line:
ii, jj = torch.as_tensor(es, device=self.device).unbind(dim=-1)

The terminal looks like this:

  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
220it [00:19, 11.21it/s]
################################
  File "envsvenv\vis\DROID-SLAM\droid_slam\droid.py", line 96, in terminate
    self.backend(7) # Run the backend process with argument 7
  File "envsvenv\vis\droidvenvvis\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "droid_slam\droid_backend.py", line 66, in __call__
    graph.add_proximity_factors(rad=self.backend_radius,
  File "droid_slam\factor_graph.py", line 437, in add_proximity_factors
    ii, jj = torch.as_tensor(es, device=self.device).unbind(dim=-1)
ValueError: not enough values to unpack (expected 2, got 0)
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

When the demo is running, when the images are iterated, the Open3d window opens but nothing appears on it.

After some debugging in the factor_graph.py file, I noticed that the tensors ii and jj stay [0] for the whole run, and that the es array is always empty.
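The failure mode can be reproduced in isolation (a minimal sketch, not the project's code): when es is empty, unbind(dim=-1) returns an empty tuple, and unpacking an empty tuple into two names raises exactly this ValueError.

```python
# Stand-alone reproduction (no torch needed). factor_graph.py does:
#   ii, jj = torch.as_tensor(es, device=self.device).unbind(dim=-1)
# On an empty es, unbind yields an empty tuple, so the two-name
# unpack fails before any edges are processed.
es = []                     # no proximity edges were generated
columns = tuple(zip(*es))   # empty input -> empty tuple, like unbind on an empty tensor

try:
    ii, jj = columns
except ValueError as e:
    print(e)  # not enough values to unpack (expected 2, got 0)
```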

I tried using the --reconstruction_path flag to save the reconstruction files. I get disps.npy, images.npy, intrinsics.npy, poses.npy, and tstamps.npy. The .npy files contain some values, but I doubt they are correct, because disps.npy looks like this:

[[[0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  ...
  [0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]]]
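A quick way to sanity-check a saved reconstruction (a hypothetical helper, not part of DROID-SLAM) is to test whether the disparity volume is all zeros, which would mean no depth was ever estimated:

```python
import numpy as np

def reconstruction_looks_valid(disps):
    """Heuristic check: an all-zero disparity array means no depth was estimated."""
    return bool(np.any(disps))

# Example with a file saved via --reconstruction_path (path is illustrative):
# disps = np.load("reconstructions/<name>/disps.npy")
# print(reconstruction_looks_valid(disps))
```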

I also tried disabling visualization as suggested in issue #76, using the --disable_vis flag, but the process just crashes after some iterations:

  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
10it [01:52,  3.51s/it]
Process finished with exit code -1073741819 (0xC0000005)

In issue #13 a datapath is mentioned, but I am not sure what it refers to.

I am working on Windows in a virtualenv in which I installed PyTorch 2.1.1 with CUDA 11.8 (I also tried PyTorch 1.10 with CUDA 11.3, but the same error occurred). The GPU I tested on was a 3080 Ti with 12 GB of VRAM.

I assume this is a CUDA-related issue, but I am not sure in what way.

I hope someone can help me fix my errors. Thank you!

@Sebastian-Garcia

Were you able to solve this? I am also using a 3080 Ti and facing the same issue when running with CUDA 11.3.

@FlorinM25
Author

Hello, I wasn't able to solve this on Windows, but I managed to make it work on Ubuntu 22.04 (I don't think DROID-SLAM works on Windows). I installed CUDA 12.2 from the NVIDIA website and PyTorch built for CUDA 12.1 via pip3, following the official PyTorch website. For the environment I used the virtualenv package from pip instead of conda, with Python 3.8, and installed the remaining packages with pip. Additionally, I installed ninja (pip install ninja) so that python setup.py install runs faster. I hope this helps!

@robofar

robofar commented Mar 9, 2024

@FlorinM25 I tried with pytorch=2.1.1, cuda=12.1, and python=3.8 but I got a libcudart error.
@Sebastian-Garcia I also tried with pytorch=1.10.1, cuda=11.3, and python=3.9 but I am getting this unpack error. Did you figure it out in the end? Which PyTorch, CUDA, and Python versions did you use?

@andrewnc

I'm getting the same unpack error: the distance comparison at https://github.com/princeton-vl/DROID-SLAM/blob/main/droid_slam/factor_graph.py#L322 comes back close to zero and gets set to inf, which is then skipped at https://github.com/princeton-vl/DROID-SLAM/blob/main/droid_slam/factor_graph.py#L352, so no edges are ever added.
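In other words (a simplified sketch of the selection logic, not the repository's exact code), if every candidate distance ends up as inf, the filter keeps nothing and es comes out empty:

```python
import math

# Hypothetical stand-in for the proximity-edge selection: candidate pairs
# whose distance exceeds the threshold (or was forced to inf) are skipped.
def select_edges(distances, max_distance):
    return [pair for pair, d in distances.items() if d <= max_distance]

all_inf = {(0, 1): math.inf, (1, 2): math.inf, (2, 3): math.inf}
es = select_edges(all_inf, max_distance=16.0)
print(es)  # [] -> the later two-name unpack then fails
```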

@Dong09

Dong09 commented Aug 12, 2024

Were you able to solve this?

@estaudere

Would also like an update on this!

@Ysc-shark

I initially encountered this issue as well, but later discovered that it was indeed due to a problem with the datapath. For example, the script provided by the author uses the path 'TUM-RGBD', but in my case the folder was actually named 'TUM_RGBD'. I wonder if anyone else is facing a similar issue?

My running environment is: Ubuntu 20.04, RTX 3090, Python 3.9, PyTorch 1.10, CUDA 11.3. I installed the environment using the yaml file content provided by Yaxun-Yang in #28.
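One cheap guard against this kind of silent failure (a hypothetical helper, not part of the repo) is to verify the datapath before launching the demo:

```python
import os

def check_datapath(datapath):
    """Return True if the dataset directory exists and contains files."""
    return os.path.isdir(datapath) and bool(os.listdir(datapath))

# e.g. check_datapath("datasets/TUM-RGBD")  # note the hyphen vs underscore
```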

@FlorinM25
Author

Hello!
You can see this txt file: https://github.com/FlorinM25/DROID-SLAM/blob/main/working-with-DROID-SLAM.txt.
Here are all the steps and information I gathered while working with DROID-SLAM regarding setup. I hope you find them helpful.

@XichongLing

In my case it was caused by video.counter.value == 1 when the DROID backend was invoked. The reason is that the camera pose shifts in my dataset are so minor that they fall below the motion filter threshold (args.filter_thresh), so no frames were added during tracking.

@YuxinYao620

> In my case it was caused by video.counter.value == 1 when the DROID backend was invoked. The reason is that the camera pose shifts in my dataset are so minor that they fall below the motion filter threshold (args.filter_thresh), so no frames were added during tracking.

Hello Xichong,
May I ask how you resolved it? I am also dealing with a dataset with minor shifts. Thank you in advance!

@XichongLing

> In my case it was caused by video.counter.value == 1 when the DROID backend was invoked. The reason is that the camera pose shifts in my dataset are so minor that they fall below the motion filter threshold (args.filter_thresh), so no frames were added during tracking.

> Hello Xichong, May I ask how you resolved it? I am also dealing with a dataset with minor shifts. Thank you in advance!

Setting args.filter_thresh to a smaller number can get the program running. If you know your sequence is monocular, you can skip this program and manually set the extrinsic motion sequence to identity matrices (I assume you are estimating the camera motions from an external project).
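The trade-off can be sketched like this (a toy model of the motion filter, with an illustrative threshold; the real check lives in the project's motion filter and operates on optical-flow magnitude):

```python
def should_add_keyframe(mean_flow_px, filter_thresh):
    # A frame is only handed to the tracker when the mean flow since the
    # last keyframe exceeds the threshold; tiny camera motions therefore
    # add no frames at all, and the backend later sees an empty graph.
    return mean_flow_px > filter_thresh

flows = [0.4, 0.6, 0.5, 0.3]  # a nearly static sequence
print(sum(should_add_keyframe(f, 2.4) for f in flows))  # 0: nothing passes
print(sum(should_add_keyframe(f, 0.2) for f in flows))  # 4: all frames pass
```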
