Currently SwitchProducer decides at the beginning of the job which of the case-producers it runs. Nevertheless, we've talked about enhancing the mechanism to make event-by-event decisions if such flexibility is needed (the switch was designed so that such an extension should not be too hard). This issue is to collect arguments for event-by-event decisions:
Allows dynamic load balancing between CPU and GPU (etc)
Unsure how relevant this argument is, as it essentially requires a situation where the GPU becomes full. Currently the GPU is rather empty (whenever the CPU does anything beyond scheduling GPU work).
If SwitchProducer could switch event by event, we could base the decision on the current GPU memory utilization (e.g. from the caching allocator).
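To make the idea concrete, here is a minimal sketch of such a per-event decision. Everything in it is hypothetical: `GpuMemoryStatus`, `choose_case`, and the 0.9 threshold are illustrative names and values, not part of SwitchProducer or the caching allocator.

```python
from dataclasses import dataclass


@dataclass
class GpuMemoryStatus:
    """Hypothetical snapshot of GPU memory, e.g. as an allocator might report it."""
    used_bytes: int
    total_bytes: int

    @property
    def utilization(self) -> float:
        # Treat a device with no reported memory as full, so we fall back to the CPU.
        if self.total_bytes <= 0:
            return 1.0
        return self.used_bytes / self.total_bytes


def choose_case(status: GpuMemoryStatus, threshold: float = 0.9) -> str:
    """Pick the 'cuda' case while utilization is below the threshold, else 'cpu'."""
    return "cuda" if status.utilization < threshold else "cpu"
```

The decision would be evaluated once per event, before any of the case-producers is scheduled; the threshold is an assumption and would need tuning.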
About the "restart with the CPU version if the GPU fails, e.g. because of not enough memory" idea: doing it at the event level (i.e. if any module fails with a specific exception, restart the event from scratch with a different SwitchProducer choice) might be feasible, but its usefulness would be less clear.
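A rough sketch of that event-level restart, assuming a hypothetical marker exception `GpuOutOfMemory` and modules modelled as plain callables (neither reflects the actual framework interfaces):

```python
class GpuOutOfMemory(Exception):
    """Hypothetical exception a GPU module raises when it cannot allocate memory."""


def process_event(event, modules_by_case, order=("cuda", "cpu")):
    """Run all modules of one case; on GpuOutOfMemory restart the event with the next case.

    modules_by_case maps a case name to a list of callables taking (event, products).
    Returns the case that succeeded and the products it made.
    """
    last_error = None
    for case in order:
        products = {}  # restart from scratch: discard anything a failed attempt produced
        try:
            for module in modules_by_case[case]:
                module(event, products)
            return case, products
        except GpuOutOfMemory as err:
            last_error = err  # only this specific exception triggers a retry
    raise RuntimeError("all SwitchProducer cases failed") from last_error
```

Any other exception propagates unchanged, so only the specific failure mode triggers the fallback.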
Would it be easier or harder to implement by-stream rather than by-event switching?
The decision part would be trivial. On the framework side, some things might be a little simpler if EDM streams always had the same behavior, but without trying to implement it I can't tell whether they would be significantly simpler than the generic solution. It could certainly be a useful stepping stone towards the generic solution.
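As an illustration of why the decision part is the easy bit, the sketch below uses a hypothetical `CaseChooser` (not an existing framework class) in which by-stream switching is just by-event switching with the first answer cached per stream:

```python
class CaseChooser:
    """Hypothetical helper that picks a case per event or, optionally, once per stream."""

    def __init__(self, decide, per_stream=False):
        self._decide = decide          # callable (stream_id, event_id) -> case name
        self._per_stream = per_stream
        self._stream_choice = {}       # stream_id -> case fixed for that stream

    def choose(self, stream_id, event_id):
        if not self._per_stream:
            # By-event: ask the decision function again for every event.
            return self._decide(stream_id, event_id)
        # By-stream: decide on the first event seen by a stream, then reuse the answer.
        if stream_id not in self._stream_choice:
            self._stream_choice[stream_id] = self._decide(stream_id, event_id)
        return self._stream_choice[stream_id]
```

The framework-side work (scheduling the chosen case-producer and its dependencies) is not modelled here, and that is where the real difference between the two approaches would lie.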
CUDAService's numberOfStreamsPerDevice (manual load-balancing attempt)