Fix synchronization mistake in CUDAScopedContext #327

makortel · 2019-04-22T18:03:11Z

PR description:

This PR fixes a synchronization mistake in CUDAScopedContext reported by @VinInn e.g. here #318 (comment). The problem was restricted to cases where the producer

- is not an ExternalWork
- inserts the CUDA product to the event via CUDAScopedContext::emplace()
- queues more asynchronous work in the constructor of the CUDA product

(all of these are fulfilled by the BeamSpot PR #318)

The problem was that CUDAScopedContext checked whether the CUDA stream was idle before calling the CUDA product constructor (for the CUDA event optimization of #292). If the stream is idle, the CUDAProduct is immediately marked as available, which avoids inspecting the state of the CUDA event (and, in fact, avoids creating the event at all). But if the constructor then queues more work on the stream, the recorded state of the CUDAProduct is incorrect.

The proposed fix is to check for the CUDA stream status after calling the constructor.
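The ordering issue can be illustrated with a minimal, CUDA-free sketch. Everything in it is hypothetical and is not the actual CMSSW code: `FakeStream` stands in for a CUDA stream, `AsyncPayload` for a product whose constructor queues asynchronous work, `Wrapped` for the product wrapper, and `emplaceCheckFirst` / `emplaceCheckLast` for the old and new ordering of the idle check.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a CUDA stream: counts queued, unfinished work.
struct FakeStream {
  int pending = 0;
  void enqueue() { ++pending; }                // models queuing an async copy or kernel
  bool idle() const { return pending == 0; }   // models "the stream has no pending work"
};

// Hypothetical product whose constructor queues asynchronous work on the stream.
struct AsyncPayload {
  explicit AsyncPayload(FakeStream& s) { s.enqueue(); }
};

// Hypothetical wrapper: ready == true means consumers may skip synchronization.
template <typename T>
struct Wrapped {
  T data;
  bool ready;
};

// Old ordering (assumed): the stream state is sampled BEFORE the constructor runs.
template <typename T, typename... Args>
Wrapped<T> emplaceCheckFirst(FakeStream& s, Args&&... args) {
  const bool idleBefore = s.idle();
  return Wrapped<T>{T(std::forward<Args>(args)...), idleBefore};
}

// New ordering (assumed): construct first, then sample the stream state.
template <typename T, typename... Args>
Wrapped<T> emplaceCheckLast(FakeStream& s, Args&&... args) {
  T data(std::forward<Args>(args)...);
  return Wrapped<T>{std::move(data), s.idle()};
}

// Small demonstration of the difference between the two orderings.
inline void demo() {
  FakeStream a;
  auto early = emplaceCheckFirst<AsyncPayload>(a, a);
  assert(early.ready && !a.idle());  // flagged ready although work is pending

  FakeStream b;
  auto late = emplaceCheckLast<AsyncPayload>(b, b);
  assert(!late.ready && !b.idle());  // correctly flagged as not ready
}
```

In this sketch the only difference between the two helpers is when `idle()` is evaluated relative to the constructor call, which is the change the fix describes.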

PR validation:

Profiling workflow runs, unit tests run. Verified with printouts that now the product status information in the consumer of BeamSpot is consistent with the stream status information.

makortel added 2 commits April 22, 2019 19:51

Add perfect forwarding overload for CUDAProduct constructor

c6ba8f0

Check the event creation only after the product constructor has been …

70a6d95

…called. Otherwise, if the stream was idle before and the constructor queues work to it, the event is not created and downstream will assume that the product is always there (even if it isn't yet).

makortel mentioned this pull request Apr 22, 2019

Move BeamSpot transfer to GPU to its own producer #318

Merged

VinInn mentioned this pull request Apr 23, 2019

CTD19 developments for review and merging (fixed) #329

Closed

Merge branch 'CMSSW_10_6_X_Patatrack' into fixCUDAProductSynchronize

b905303

fwyzard merged commit b96b789 into cms-patatrack:CMSSW_10_6_X_Patatrack Apr 23, 2019

fwyzard added bug-fix fixed labels Apr 23, 2019
