Multi-GPU Training #1

Open
LcenArthas opened this issue Mar 9, 2019 · 5 comments

Comments

@LcenArthas

Hi,
Have you tried to run training on multiple gpus?

@JKBox
Owner

JKBox commented Mar 10, 2019

Thanks for the reminder. I wrote the code for a single GPU; I will change it to support multiple GPUs later.

@LcenArthas
Author

I tried, but failed TAT... But I found this: ultralytics/yolov3#121. I tried to fix the code based on it, but failed. I hope it can help you :)

@LcenArthas
Author

By the way, I have now fixed the code by following that URL, and it can run on multiple GPUs, but it is sooooo slow. So I think I made a mistake in my changes.

@longxianlei

os.environ["CUDA_VISIBLE_DEVICES"] = "4,5,6,7"
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model, device_ids=[0, 1, 2, 3])
model.to(device).train()

I have 8 GPUs. I made 4 of them visible, then wrapped the model in nn.DataParallel to run on those GPUs. But when I run train.py, I get:

inter_area = torch.min(box1, box2).prod(2)
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.FloatTensor for argument #2 'other'

Does the code not support multi-GPU training yet?
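
A note on the device numbering in the snippet above: once CUDA_VISIBLE_DEVICES is set, the visible GPUs are renumbered from 0, so device_ids=[0, 1, 2, 3] refers to physical GPUs 4 to 7 and is unlikely to be the cause of the error by itself. The sketch below only illustrates that renumbering. The helper name visible_device_map is hypothetical, and it assumes the variable holds plain integer indices rather than GPU UUIDs.

```python
import os


def visible_device_map(visible: str) -> dict:
    """Map logical CUDA indices (what the process sees) to physical GPU indices.

    Hypothetical helper for illustration; assumes a comma-separated list of
    integer indices such as "4,5,6,7".
    """
    physical = [int(tok) for tok in visible.split(",") if tok.strip()]
    return {logical: phys for logical, phys in enumerate(physical)}


# The variable must be set before CUDA is initialized in the process.
os.environ["CUDA_VISIBLE_DEVICES"] = "4,5,6,7"

mapping = visible_device_map(os.environ["CUDA_VISIBLE_DEVICES"])
print(mapping)  # {0: 4, 1: 5, 2: 6, 3: 7}
```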

@JKBox
Owner

JKBox commented Mar 17, 2019

Yes, the code currently supports only single-GPU training. I'll fix it later.
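
For readers who hit the same RuntimeError in the meantime: the message says that argument #2 of torch.min (box2) is a CPU tensor while argument #1 (box1) lives on a GPU. Under nn.DataParallel, each replica runs on its own GPU, so any tensor built on the CPU or on one fixed device will not follow the inputs. A minimal sketch of one possible workaround, not the owner's fix: move the second tensor to the first tensor's device before comparing. The function name intersection_area and the tensor shapes are assumptions, since the thread does not show how box1 and box2 are built.

```python
import torch


def intersection_area(box1: torch.Tensor, box2: torch.Tensor) -> torch.Tensor:
    """Intersection area of width/height pairs.

    Assumed shapes (not confirmed by the thread):
      box1: (N, 1, 2), box2: (1, M, 2)  ->  result: (N, M)
    """
    # Align box2 with box1 so both operands of torch.min share device and dtype.
    box2 = box2.to(device=box1.device, dtype=box1.dtype)
    # Element-wise min gives the overlapping width and height; prod(2) multiplies them.
    return torch.min(box1, box2).prod(2)


# Runs on CPU as written; on a GPU replica, box1.device would be that replica's GPU.
target_wh = torch.tensor([[2.0, 3.0], [4.0, 1.0]]).unsqueeze(1)               # (2, 1, 2)
anchor_wh = torch.tensor([[1.0, 5.0], [3.0, 3.0], [5.0, 5.0]]).unsqueeze(0)   # (1, 3, 2)

areas = intersection_area(target_wh, anchor_wh)
print(areas.shape)  # torch.Size([2, 3])
```

Tensors that belong to the model, such as anchors, can alternatively be registered with register_buffer so that nn.DataParallel copies them to each replica along with the parameters.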
