Try to be more threadsafe with EOSPAC #451

jhp-lanl · 2025-01-02T18:18:27Z

I think we've all been trying to develop around the fact that EOSPAC is not thread-safe, but I think maybe we should try to codify some development principles to try to make our EOSPAC classes approach something more like thread safety.

The main issue (as I see it) is that EOSPAC table options can be applied after the class has been initialized. I would propose forcing all EOSPAC table options to be applied only at initialization.

The biggest consequence of this would be that the Transform struct would not be able to be passed to the member functions. If a different transformation was desired, we'd need to create a new EOSPAC class with those transformations applied.

There might be memory usage implications, but I think the EOSPAC memory usage model might try to not load data twice if it can. This hypothesis is worth testing though.

Otherwise, I think this would help with reasoning about how EOSPAC class copying works. It might also reduce the hoops needed for the transformations to be applied.

@rbberger since you wrote this code initially, do you have thoughts? @jonahm-LANL your thoughts are also valuable here

The text was updated successfully, but these errors were encountered:

Yurlungur · 2025-01-02T18:20:20Z

I think this would prevent us from using modifiers of eospac the way we currently do, full stop. Right now some of the modifiers for eospac go through eospac options. This design would force us to instead use scratch memory for them always.

Yurlungur · 2025-01-02T18:21:46Z

However, as I understand it, the calls that change eospac table options are applied before and after the call. This is a race condition in the case when multiple EOSPAC calls are made on CPU in multiple threads. Is that the main concern here? Or are you concerned about class copying?

jhp-lanl · 2025-01-02T18:32:54Z

Really I was concerned about class copying, but the race condition is also a problem.

And yes, I didn't think about the modifiers at first but I'm thinking about them now. Is there a way to re-initialize the EOSPAC class when modifiers are applied? That would allow us to bake the transform into the modifier class and have that modifier class be tied to a specific EOSPAC table handle.

Yurlungur · 2025-01-02T18:40:57Z

Really I was concerned about class copying, but the race condition is also a problem.

Is there a way to re-initialize the EOSPAC class when modifiers are applied? That would allow us to bake the transform into the modifier class and have that modifier class be tied to a specific EOSPAC table handle.

I don't know... That might be a question for David. Can one create new table handles? One issue with that might be that it would need to be done before any shared memory or device-copying is done, as that is a one-and-done in the EOSPAC model.

Of course, we could also get around this by simply not short-circuiting singularity when doing modifiers. If all modifiers used scratch memory for vector calls, the problem would also go away. Though I don't know if we want to do that.

I think this may merit a longer discussion, as the issue with modifiers may not be a concern if we only care about copying, as the options applied to EOSPAC are applied before and then undone after each call. So it's safe if the calls are made in serial from the host.

On the other hand, if you're imagining us calling EOSPAC objects from multiple POSIX threads (as opposed to GPU threads) at once, then we're in trouble.

jhp-lanl · 2025-01-02T20:23:15Z

On the other hand, if you're imagining us calling EOSPAC objects from multiple POSIX threads (as opposed to GPU threads) at once, then we're in trouble.

Why aren't we already in trouble for GPU threads?

Yurlungur · 2025-01-02T20:42:55Z

Ah, we are. But if we ask EOSPAC to manage the threads rather than be called from inside one (which is they're model) then we're safe.

jhp-lanl · 2025-01-02T21:51:13Z

But yeah, I think a broader conversation is warranted. I just wanted to make an issue to think about it more. This came up when I was thinking about how we manage our classes and what happens with shared memory and such.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Try to be more threadsafe with EOSPAC #451

Try to be more threadsafe with EOSPAC #451

jhp-lanl commented Jan 2, 2025

Yurlungur commented Jan 2, 2025

Yurlungur commented Jan 2, 2025

jhp-lanl commented Jan 2, 2025

Yurlungur commented Jan 2, 2025

jhp-lanl commented Jan 2, 2025 •

edited

Loading

Yurlungur commented Jan 2, 2025

jhp-lanl commented Jan 2, 2025

Try to be more threadsafe with EOSPAC #451

Try to be more threadsafe with EOSPAC #451

Comments

jhp-lanl commented Jan 2, 2025

Yurlungur commented Jan 2, 2025

Yurlungur commented Jan 2, 2025

jhp-lanl commented Jan 2, 2025

Yurlungur commented Jan 2, 2025

jhp-lanl commented Jan 2, 2025 • edited Loading

Yurlungur commented Jan 2, 2025

jhp-lanl commented Jan 2, 2025

jhp-lanl commented Jan 2, 2025 •

edited

Loading