Multi-threaded custom pipelines #1003
As for the struct ecs_pipeline_state_t;
typedef struct ecs_pipeline_state_t ecs_pipeline_state_t;
075b8d9 to 1106b3e
1106b3e to 1a460ab
@codylico Yeah, the first thing I did was add that typedef, which produced this error:
In file included from ./src/addons/pipeline/worker.c:10:
./src/addons/pipeline/pipeline.h:38:3: error: redefinition of typedef 'ecs_pipeline_state_t' is a C11 feature [-Werror,-Wtypedef-redefinition]
} ecs_pipeline_state_t;
^
./src/addons/pipeline/../system/../../private_types.h:484:37: note: previous definition is here
typedef struct ecs_pipeline_state_t ecs_pipeline_state_t;
^
I've now fixed that by removing the typedef from the original definition of the struct. That seems to work fine, but I'm still unsure whether this is the correct approach for the project or whether the code should be organized differently.
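For anyone following along, here is a minimal sketch of the pattern described in this comment: the typedef name is introduced once as a forward declaration, and the full definition uses only the struct tag. All `demo_` names and the `pq` field are hypothetical stand-ins, not the actual flecs declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Shared header (stand-in for private_types.h): introduce the typedef name
 * exactly once, as an opaque forward declaration. */
typedef struct demo_pipeline_state_t demo_pipeline_state_t;

/* Code that only stores a pointer can work with the opaque type. */
typedef struct demo_stage_t {
    demo_pipeline_state_t *pq; /* hypothetical field name */
} demo_stage_t;

/* Defining header (stand-in for pipeline.h): define the struct by tag only.
 * Writing `typedef struct demo_pipeline_state_t { ... } demo_pipeline_state_t;`
 * here would repeat the typedef, which is only valid from C11 onward. */
struct demo_pipeline_state_t {
    int op_count;
};

/* Returns the op count reachable through the stage, or -1 if none is set. */
int demo_stage_op_count(const demo_stage_t *stage) {
    assert(stage != NULL);
    return stage->pq ? stage->pq->op_count : -1;
}
```

With this layout the typedef appears once, so the code builds cleanly under pre-C11 modes with `-Wtypedef-redefinition` enabled.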
Not sure why this test failed. It seems unrelated to these changes; I tested it locally on macOS and didn't have any issues. The test also passed in the
It also exited with signal 4, which is SIGILL, which seems strange.
https://github.com/SanderMertens/flecs/actions/runs/5568489788/jobs/10171153134?pr=1003#step:6:126
Rerunning the failed tests, sometimes CI flukes and a process running a test is killed.
LGTM, I like it! Thanks for the PR :)
Changelog
- Updated ecs_run_pipeline to execute multi-threaded systems on the worker threads.
- run_pipeline_multithreaded and run_pipeline_multithreaded_tasks.
- ecs_set_pipeline
- Fixed a bug that caused ecs_progress to block forever (Report on Discord)
TODO
- How system_time_total is calculated
Notes
- The way system_time_total is measured with multi-threaded systems is not accurate: it only measures the work the main thread does, while a worker thread could have more or less work. With the changes in this branch it would be straightforward to instead measure the system time including thread synchronization. This would make system_time_total the wall time of the system.
Details
Multi-threaded ecs_run_pipeline
- Updated ecs_run_pipeline to call flecs_workers_progress and to more closely mirror ecs_progress, with support for task threads.
- Removed ecs_worker_state_t and instead pass ecs_pipeline_state_t* via the ecs_stage_t, allowing it to be dynamically updated.
- Split flecs_run_pipeline into a new function flecs_run_pipeline_ops.
- Updated flecs_worker to now call flecs_run_pipeline_ops so it only operates on a specific pipeline operation, making the synchronization with the main thread clearer.
- Updated flecs_run_pipeline to now only perform the main-thread operations: updating the pipeline, updating the pipeline operation, waking workers and synchronizing workers. It also calls flecs_run_pipeline_ops to execute the pipeline operation for the main thread.
- Removed flecs_worker_begin, flecs_worker_end, flecs_worker_sync, flecs_is_multithreaded and flecs_is_main_thread as they were no longer needed.
- flecs_signal_workers and flecs_wait_for_sync from worker.c to pipeline.c in pipeline.h.
- Removed ecs_set_threads from ecs_set_pipeline now that the workers can operate on any pipeline dynamically, so there is no need to restart them.
Some info on how this changes worker synchronization.
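As a rough illustration of the worker model described in the Details list (workers stay in their main loop, run one operation per signal, and read the current operation from shared state that the main thread can swap at any time), here is a self-contained pthreads sketch. It does not use flecs: every `demo_` name is hypothetical, and adding `op->value` to a counter stands in for running the systems of one pipeline operation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_WORKERS 3

typedef struct { int value; bool multi_threaded; } demo_op_t;

struct demo_pool_t;
typedef struct { struct demo_pool_t *pool; int index; } demo_worker_arg_t;

typedef struct demo_pool_t {
    pthread_mutex_t lock;
    pthread_cond_t wake;    /* main -> workers: an operation is ready */
    pthread_cond_t done;    /* workers -> main: all workers finished */
    const demo_op_t *op;    /* current operation, replaced on every signal */
    int generation;         /* bumped once per signalled operation */
    int pending;            /* workers still running the current operation */
    bool quit;
    long work[DEMO_WORKERS];
    pthread_t threads[DEMO_WORKERS];
    demo_worker_arg_t args[DEMO_WORKERS];
} demo_pool_t;

/* Worker main loop: sleep until signalled, run exactly one operation,
 * report completion, then go back to waiting. */
static void *demo_worker(void *ptr) {
    demo_worker_arg_t *arg = ptr;
    demo_pool_t *p = arg->pool;
    int seen = 0;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->generation == seen && !p->quit) {
            pthread_cond_wait(&p->wake, &p->lock);
        }
        if (p->quit) {
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        seen = p->generation;
        const demo_op_t *op = p->op;
        pthread_mutex_unlock(&p->lock);

        p->work[arg->index] += op->value; /* stand-in for running systems */

        pthread_mutex_lock(&p->lock);
        if (--p->pending == 0) {
            pthread_cond_signal(&p->done);
        }
        pthread_mutex_unlock(&p->lock);
    }
}

static void demo_signal_workers(demo_pool_t *p, const demo_op_t *op) {
    pthread_mutex_lock(&p->lock);
    p->op = op;
    p->pending = DEMO_WORKERS;
    p->generation++;
    pthread_cond_broadcast(&p->wake);
    pthread_mutex_unlock(&p->lock);
}

static void demo_wait_for_sync(demo_pool_t *p) {
    pthread_mutex_lock(&p->lock);
    while (p->pending > 0) {
        pthread_cond_wait(&p->done, &p->lock);
    }
    pthread_mutex_unlock(&p->lock);
}

/* Main-thread side: only multi-threaded operations wake the workers.
 * Returns the total work done by the main thread and all workers. */
long demo_run_pipeline(const demo_op_t *ops, int count) {
    demo_pool_t p = { .generation = 0 };
    long total = 0;
    assert(count >= 0);
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.wake, NULL);
    pthread_cond_init(&p.done, NULL);
    for (int i = 0; i < DEMO_WORKERS; i++) {
        p.args[i] = (demo_worker_arg_t){ &p, i };
        int rc = pthread_create(&p.threads[i], NULL, demo_worker, &p.args[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < count; i++) {
        if (ops[i].multi_threaded) demo_signal_workers(&p, &ops[i]);
        total += ops[i].value; /* the main thread's share of the operation */
        if (ops[i].multi_threaded) demo_wait_for_sync(&p);
    }
    pthread_mutex_lock(&p.lock);
    p.quit = true;
    pthread_cond_broadcast(&p.wake);
    pthread_mutex_unlock(&p.lock);
    for (int i = 0; i < DEMO_WORKERS; i++) {
        pthread_join(p.threads[i], NULL);
        total += p.work[i];
    }
    pthread_cond_destroy(&p.done);
    pthread_cond_destroy(&p.wake);
    pthread_mutex_destroy(&p.lock);
    return total;
}
```

Because the workers pick up the operation pointer on every signal, the main thread can hand them operations from any pipeline without restarting the threads, which is the property the last list item relies on.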
I ran both master and the branch for this PR with log level 3 enabled in a release build to compare how the workers operate.
Here in the following diff, you can see the functional difference.
In this diff, you can see a few changes.
- For SysB and SysC, and then also lower down for SysF, the worker threads are no longer signalled since the operation is not marked as multi_threaded. This means we avoid synchronizing on the worker threads when it is not needed.
- For SysD and SysE there are new log messages for worker %d: run. This is because the workers now go back to their main loop after executing one operation, rather than synchronizing internally in flecs_run_pipeline.
- At the end of the pipeline (info: | merge), we no longer wake the worker threads an additional time. This is possible since the workers stay in their main loop while waiting for the next operation to execute. This effectively removes an unneeded sync point.
Notes about log collection:
- Updated flecs_log_msg to lock while writing, to make the logging more stable.
- Changed the logging in ecs_run_intern so it's clearer what it is when viewed next to the worker %d: run message.
I've attached the full logs in case someone wants to look at them.
master.txt
multithreaded-custom-piepline.txt
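To put rough numbers on the difference described in the log comparison, here is a small counting model. It is a simplification based only on the description in this PR, not on the actual flecs code: it assumes master woke the workers for every operation plus once more after the last one, while this branch wakes them only for operations marked multi-threaded. The `sketch_` names and the one-system-per-operation grouping are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { const char *name; bool multi_threaded; } sketch_op_t;

/* Assumed model of master: one wake-up per operation, plus one extra
 * wake-up at the end of the pipeline. */
int sketch_wakeups_before(const sketch_op_t *ops, int count) {
    (void)ops;
    assert(count >= 0);
    return count + 1;
}

/* Assumed model of this branch: wake the workers only for operations
 * that are marked multi-threaded, with no trailing wake-up. */
int sketch_wakeups_after(const sketch_op_t *ops, int count) {
    int wakeups = 0;
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        if (ops[i].multi_threaded) {
            wakeups++;
        }
    }
    return wakeups;
}
```

For a pipeline where only SysD and SysE are multi-threaded and SysB, SysC and SysF are not, this model gives six wake-ups before and two after.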
Deadlock when using ecs_run_pipeline with worker threads enabled (Report on Discord)
Fixed a bug in flecs_worker_begin where running an empty pipeline would cause ecs_progress to block forever.
flecs_worker_begin was signalling the workers to start, but flecs_run_pipeline would then return immediately if there were no operations to perform, meaning the worker threads were never synchronized before exiting.
flecs/src/addons/pipeline/pipeline.c
Lines 531 to 533 in d24a9e1
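The ordering problem described here can be modelled without any threads by counting signals and syncs. This is a hedged sketch of the bug pattern only; the `model_` names are hypothetical and the functions are not the flecs implementation.

```c
#include <assert.h>

typedef struct { int signals; int syncs; } model_sync_t;

static void model_signal(model_sync_t *s) { s->signals++; }
static void model_sync(model_sync_t *s) { s->syncs++; }

/* Number of wake-ups that were never matched by a sync. Anything above
 * zero means workers are left running while the main thread moves on. */
int model_outstanding(const model_sync_t *s) {
    return s->signals - s->syncs;
}

/* Buggy ordering: workers are woken before checking for work. */
void model_run_buggy(model_sync_t *s, int op_count) {
    assert(op_count >= 0);
    model_signal(s);
    if (op_count == 0) {
        return; /* early return skips the matching sync */
    }
    for (int i = 0; i < op_count; i++) {
        if (i > 0) model_signal(s);
        /* ... run operation i ... */
        model_sync(s);
    }
}

/* Fixed ordering: every signal is issued next to its matching sync,
 * so an empty pipeline never wakes the workers at all. */
void model_run_fixed(model_sync_t *s, int op_count) {
    assert(op_count >= 0);
    for (int i = 0; i < op_count; i++) {
        model_signal(s);
        /* ... run operation i ... */
        model_sync(s);
    }
}
```

In the buggy variant an empty pipeline leaves one outstanding signal, which corresponds to the state in which a later wait on the workers can block forever.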
Removed unneeded worker signal and sync at end of pipeline
While working on this I noticed that when a pipeline is run across the worker threads (e.g. ecs_progress), at the end of the pipeline when no more operations exist, flecs_worker_sync would call flecs_worker_begin, which would then signal the workers, only for the loop to exit and flecs_worker_end to be called immediately.