
Fix CachingHostAllocator for multiple GPUs #212

Merged

Conversation


@makortel makortel commented Dec 4, 2018

The CachingHostAllocator uses an associated CUDA stream and device for the "asynchronous free" (to support the creation and transfer of the "me" pointer in the data formats, as in

  auto view = cs->make_host_unique<DeviceConstView>(stream);
  view->xx_ = xx_d.get();
  view->yy_ = yy_d.get();
  view->adc_ = adc_d.get();
  view->moduleInd_ = moduleInd_d.get();
  view_d = cs->make_device_unique<DeviceConstView>(stream);
  cudaCheck(cudaMemcpyAsync(view_d.get(), view.get(), sizeof(DeviceConstView), cudaMemcpyDefault, stream.id()));
}

). The implementation missed one detail regarding multiple GPUs: when claiming a previously-cached memory block, the current device may differ from the device of the previous allocation, and in that case, the CUDA event must be re-created for the new device.

This PR fixes that behavior, and should fix the crashes reported in #208 (comment).

@fwyzard fwyzard merged commit 4684349 into cms-patatrack:CMSSW_10_4_X_Patatrack Dec 7, 2018
@fwyzard fwyzard added this to the CMSSW_10_4_0_pre3_Patatrack milestone Dec 7, 2018