Add support for pinning actors to a dedicated scheduler thread (#4547)
* Add support for pinning actors to a dedicated scheduler thread

The overall design goal and approach was to make it possible to
have pinned actors while minimizing impact to pre-existing
non-pinned actor workloads. This meant there could be no impact
on message sends (i.e. can't check to see if the receiving actor
is a pinned actor or not to decide what to do with it if it is
unscheduled). These goals were chosen because it is expected
that `pinned` actors will be a niche/small part of any pony
application's overall workload.

The approach taken has negligible performance impact on existing
scheduler logic. It adds a couple of extra checks to see whether an
actor that is ready to run is pinned; if it is not, there is no
other overhead. The scheduler quiescence logic also checks an
atomic counter of pinned actors, but that cost is negligible if no
pinned actors are ever used.

The overall logic for pinned actors works as follows:

* The `main` thread is dedicated to running pinned actors (and
only pinned actors). This thread previously initialized the
runtime and then sat around waiting for all schedulers to reach
quiescence so now it runs pinned actors in the meantime if there
are any.
* The `pinned actor thread` (`main`) runs a custom run loop for
pinned actors that does not participate in work stealing or any
other normal scheduler messaging except for unmuting messages and
the termination message. It also will only ever run `pinned` actors
and any non-`pinned` actors will get pushed onto the `inject` queue.
* Normal schedulers will only ever run non-`pinned` actors and any
`pinned` actors will get pushed onto the `pinned actor thread`'s
queue.
* From an api perspective, there is now an `actor_pinning` package
in the stdlib. An actor can request to be pinned, check that it
has successfully been pinned (so that it can safely do whatever it
needs to do while pinned), and request to be unpinned.

While the above is not necessarily the most efficient way to run
`pinned` actors, it meets the original design goals of making
pinning possible while minimizing impact on pre-existing
non-pinned actor workloads.
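
The routing rules in the list above can be summarized as a small decision function. This is only an illustrative sketch: `thread_kind_t`, `route_t`, and `route_ready_actor` are hypothetical names invented here, not identifiers from the runtime, and the real scheduler code handles much more (muting, work stealing, termination).

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical stand-ins for illustration; not the runtime's real types.
typedef enum { THREAD_NORMAL_SCHEDULER, THREAD_PINNED_ACTOR } thread_kind_t;
typedef enum { RUN_HERE, PUSH_TO_INJECT_QUEUE, PUSH_TO_PINNED_QUEUE } route_t;

// Decide what a thread does with an actor that is ready to run.
route_t route_ready_actor(thread_kind_t thread, bool actor_is_pinned)
{
  if(thread == THREAD_PINNED_ACTOR)
  {
    // The pinned actor thread only runs pinned actors; anything else
    // goes to the inject queue for the normal schedulers.
    return actor_is_pinned ? RUN_HERE : PUSH_TO_INJECT_QUEUE;
  }

  assert(thread == THREAD_NORMAL_SCHEDULER);

  // Normal schedulers only run non-pinned actors; pinned actors are
  // handed to the pinned actor thread's queue.
  return actor_is_pinned ? PUSH_TO_PINNED_QUEUE : RUN_HERE;
}
```

Note that the decision is made when an actor is about to run, not when a message is sent, which matches the stated goal of leaving message sends untouched.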

* Fix unused variable error

* Fix dtrace undeclared variable error

* Fix windows pinned-actor test linking error

* Add greedy actor caveat to actor pinning package

* Add release notes

* Rename `pin` to `request_pin` and `unpin` to `request_unpin`

and also update the package documentation to clarify that `pinning`
is not an immediate action

* update release notes

* Use correct function names now that they've been renamed

* update caveat

Co-authored-by: Sean T Allen <[email protected]>

* Fatten up the release notes

* make pinned actor thread participate in CNF/ACK for termination

* add option to pin pinned actor thread

---------

Co-authored-by: Sean T Allen <[email protected]>
dipinhora and SeanTAllen authored Dec 5, 2024
1 parent fd928fb commit 9d178ed
Showing 16 changed files with 705 additions and 50 deletions.
50 changes: 50 additions & 0 deletions .release-notes/4547.md
@@ -0,0 +1,50 @@
## Add support for pinning actors to a dedicated scheduler thread

Pony programmers can now pin actors to a dedicated scheduler thread. This can be required/used for interfacing with C libraries that rely on thread local storage. A common example of this is graphics/windowing libraries.

An actor requests that it be pinned, which may or may not happen immediately. It must then wait and check that the pinning was successfully applied before running any workload that requires the actor to be pinned. Once pinned, all subsequent behaviors on that actor run on the same scheduler thread until the actor is destroyed or requests to be unpinned.

### Caveat

Because Pony uses cooperative scheduling of actors and all pinned actors run on a single shared scheduler thread, any "greedy" actors that monopolize the CPU (with long-running behaviors) will negatively impact all other pinned actors by starving them of CPU.

### Example program

```pony
// Here we have the Main actor that upon construction requests a PinUnpinActorAuth
// token from AmbientAuth and then requests that it be pinned. It then recursively
// calls the `check_pinned` behavior until the runtime reports that it has
// successfully been pinned after which it starts `do_stuff` to do whatever
// work it needs to do that requires it to be pinned. Once it has completed all
// of its work, it calls `done` to request that the runtime `unpin` it.
use "actor_pinning"

actor Main
  let _env: Env
  let _auth: PinUnpinActorAuth

  new create(env: Env) =>
    _env = env
    _auth = PinUnpinActorAuth(env.root)
    ActorPinning.request_pin(_auth)
    check_pinned()

  be check_pinned() =>
    if ActorPinning.is_successfully_pinned(_auth) then
      // do stuff that requires this actor to be pinned
      do_stuff(10)
    else
      check_pinned()
    end

  be do_stuff(i: I32) =>
    if i < 0 then
      done()
    else
      do_stuff(i - 1)
    end

  be done() =>
    ActorPinning.request_unpin(_auth)
```
81 changes: 81 additions & 0 deletions packages/actor_pinning/actor_pinning.pony
@@ -0,0 +1,81 @@
"""
# Actor Pinning Package

The Actor Pinning package allows Pony programmers to pin actors to a dedicated
scheduler thread. This can be required/used for interfacing with C libraries
that rely on thread local storage. A common example of this is graphics/windowing
libraries.

The way it works is that an actor can request that it be pinned (which may or
may not happen immediately) and then it must wait and check to confirm that the
pinning was successfully applied (prior to running any workload that required the
actor to be pinned) after which all subsequent behaviors on that actor will run
on the same scheduler thread until the actor is destroyed or the actor requests
to be unpinned.

## Example program

```pony
// Here we have the Main actor that upon construction requests a PinUnpinActorAuth
// token from AmbientAuth and then requests that it be pinned. It then recursively
// calls the `check_pinned` behavior until the runtime reports that it has
// successfully been pinned after which it starts `do_stuff` to do whatever
// work it needs to do that requires it to be pinned. Once it has completed all
// of its work, it calls `done` to request that the runtime `unpin` it.
use "actor_pinning"

actor Main
  let _env: Env
  let _auth: PinUnpinActorAuth

  new create(env: Env) =>
    _env = env
    _auth = PinUnpinActorAuth(env.root)
    ActorPinning.request_pin(_auth)
    check_pinned()

  be check_pinned() =>
    if ActorPinning.is_successfully_pinned(_auth) then
      // do stuff that requires this actor to be pinned
      do_stuff(10)
    else
      check_pinned()
    end

  be do_stuff(i: I32) =>
    if i < 0 then
      done()
    else
      do_stuff(i - 1)
    end

  be done() =>
    ActorPinning.request_unpin(_auth)
```

## Caveat

Due to the fact that Pony uses cooperative scheduling of actors and that all
pinned actors run on a single shared scheduler thread, any "greedy" actors that
monopolize the cpu (with long running behaviors) will negatively impact all
other pinned actors by starving them of cpu.
"""

use @pony_actor_set_pinned[None]()
use @pony_actor_unset_pinned[None]()
use @pony_scheduler_index[I32]()

primitive ActorPinning
  fun request_pin(auth: PinUnpinActorAuth) =>
    @pony_actor_set_pinned()

  fun request_unpin(auth: PinUnpinActorAuth) =>
    @pony_actor_unset_pinned()

  fun is_successfully_pinned(auth: PinUnpinActorAuth): Bool =>
    let sched: I32 = @pony_scheduler_index()

    // the `-999` constant is the same value as `PONY_PINNED_ACTOR_THREAD_INDEX`
    // defined in `scheduler.h` in the runtime
    sched == -999
3 changes: 3 additions & 0 deletions packages/actor_pinning/auth.pony
@@ -0,0 +1,3 @@
primitive PinUnpinActorAuth
  new create(from: AmbientAuth) =>
    None
6 changes: 6 additions & 0 deletions packages/builtin/runtime_options.pony
@@ -104,6 +104,12 @@ struct RuntimeOptions
Requires `--ponypin` to be set to have any effect.
"""

var ponypinpinnedactorthread: Bool = false
"""
Pin the pinned actor thread to a CPU the way scheduler threads are pinned to CPUs.
Requires `--ponypin` to be set to have any effect.
"""

var ponyprintstatsinterval: U32 = -1
"""
Print actor stats before an actor is destroyed and print scheduler stats
20 changes: 19 additions & 1 deletion src/libponyrt/actor/actor.c
@@ -36,6 +36,7 @@ enum
  FLAG_UNSCHEDULED = 1 << 3,
  FLAG_CD_CONTACTED = 1 << 4,
  FLAG_RC_OVER_ZERO_SEEN = 1 << 5,
  FLAG_PINNED = 1 << 6,
};

enum
@@ -1046,7 +1047,7 @@ PONY_API void pony_sendv_single(pony_ctx_t* ctx, pony_actor_t* to,
{
// if the receiving actor is currently not unscheduled AND it's not
// muted, schedule it.
ponyint_sched_add_inject_or_sched(ctx, to);
ponyint_sched_add(ctx, to);
}
}
}
@@ -1219,6 +1220,23 @@ PONY_API void pony_triggergc(pony_ctx_t* ctx)
ctx->current->heap.next_gc = 0;
}

bool ponyint_actor_is_pinned(pony_actor_t* actor)
{
  return has_internal_flag(actor, FLAG_PINNED);
}

PONY_API void pony_actor_set_pinned()
{
  pony_ctx_t* ctx = pony_ctx();
  set_internal_flag(ctx->current, FLAG_PINNED);
}

PONY_API void pony_actor_unset_pinned()
{
  pony_ctx_t* ctx = pony_ctx();
  unset_internal_flag(ctx->current, FLAG_PINNED);
}

void ponyint_become(pony_ctx_t* ctx, pony_actor_t* actor)
{
ctx->current = actor;
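
The three functions added in this hunk reduce to setting, clearing, and testing one bit in an actor's flag field. A minimal standalone sketch of that bit manipulation follows. `toy_actor_t` and the `toy_*` helpers are hypothetical stand-ins rather than the runtime's `set_internal_flag`, `unset_internal_flag`, and `has_internal_flag`, whose definitions are not shown in this diff, and the one-byte width of the flag field is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Same bit position as FLAG_PINNED in the enum shown in this diff.
enum { FLAG_PINNED = 1 << 6 };

// Hypothetical actor with a single byte of flags (assumed width).
typedef struct { uint8_t flags; } toy_actor_t;

// Test whether a flag bit is set.
bool toy_has_flag(const toy_actor_t* actor, uint8_t flag)
{
  assert(actor != NULL);
  return (actor->flags & flag) != 0;
}

// Set a flag bit, leaving all other bits untouched.
void toy_set_flag(toy_actor_t* actor, uint8_t flag)
{
  assert(actor != NULL);
  actor->flags |= flag;
}

// Clear a flag bit, leaving all other bits untouched.
void toy_unset_flag(toy_actor_t* actor, uint8_t flag)
{
  assert(actor != NULL);
  actor->flags &= (uint8_t)~flag;
}
```

Because setting the flag only marks the actor, the pin takes effect the next time a scheduler considers the actor for running, which is consistent with the package documentation's warning that pinning is not immediate.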
6 changes: 6 additions & 0 deletions src/libponyrt/actor/actor.h
@@ -130,6 +130,12 @@ gc_t* ponyint_actor_gc(pony_actor_t* actor);

heap_t* ponyint_actor_heap(pony_actor_t* actor);

bool ponyint_actor_is_pinned(pony_actor_t* actor);

PONY_API void pony_actor_set_pinned();

PONY_API void pony_actor_unset_pinned();

bool ponyint_actor_pendingdestroy(pony_actor_t* actor);

void ponyint_actor_setpendingdestroy(pony_actor_t* actor);
4 changes: 4 additions & 0 deletions src/libponyrt/options/options.h
@@ -43,6 +43,10 @@
" --ponypinasio Pin the ASIO thread to a CPU the way scheduler\n" \
" threads are pinned to CPUs. Requires `--ponypin` to\n" \
" be set to have any effect.\n" \
" --ponypinpinnedactorthread\n" \
" Pin the pinned actor thread to a CPU the way scheduler\n" \
" threads are pinned to CPUs. Requires `--ponypin` to\n" \
" be set to have any effect.\n" \
" --ponyprintstatsinterval\n" \
" Print actor stats before an actor is destroyed and\n" \
" print scheduler stats every X seconds. Defaults to -1 (never).\n" \
18 changes: 15 additions & 3 deletions src/libponyrt/sched/cpu.c
@@ -227,9 +227,10 @@ uint32_t ponyint_cpu_count()
}

uint32_t ponyint_cpu_assign(uint32_t count, scheduler_t* scheduler,
bool pin, bool pinasio)
bool pin, bool pinasio, bool pinpat)
{
uint32_t asio_cpu = -1;
uint32_t pat_cpu = -1;

if(!pin)
{
@@ -255,11 +256,12 @@ uint32_t ponyint_cpu_assign(uint32_t count, scheduler_t* scheduler,
if(pinasio)
asio_cpu = avail_cpu_list[count % avail_cpu_count];

if(pinpat)
pat_cpu = avail_cpu_list[(count + 1) % avail_cpu_count];

ponyint_pool_free_size(avail_cpu_size * sizeof(uint32_t), avail_cpu_list);
avail_cpu_list = NULL;
avail_cpu_count = avail_cpu_size = 0;

return asio_cpu;
#elif defined(PLATFORM_IS_BSD)
// FreeBSD does not currently do thread pinning, as we can't yet determine
// which cores are hyperthreads.
@@ -269,6 +271,9 @@ uint32_t ponyint_cpu_assign(uint32_t count, scheduler_t* scheduler,
if(pinasio)
asio_cpu = count % hw_cpu_count;

if(pinpat)
pat_cpu = (count + 1) % hw_cpu_count;

for(uint32_t i = 0; i < count; i++)
{
scheduler[i].cpu = i % hw_cpu_count;
@@ -286,8 +291,15 @@ uint32_t ponyint_cpu_assign(uint32_t count, scheduler_t* scheduler,
// asio_cpu of -1
if(pinasio)
asio_cpu = count;

if(pinpat)
pat_cpu = (count + 1);
#endif

// set the affinity of the current thread (main thread) which is the pinned
// actor thread
ponyint_cpu_affinity(pat_cpu);

return asio_cpu;
}
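
In the branch that indexes `avail_cpu_list`, the ASIO thread and the pinned actor thread take the next two slots after the scheduler threads, wrapping around the list of available CPUs. A small sketch of that index arithmetic, assuming a hypothetical helper `toy_pick_cpu` that is not part of the runtime:

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: pick a CPU for an auxiliary thread.
// `offset` is 0 for the ASIO thread and 1 for the pinned actor thread,
// mirroring `count % avail_cpu_count` and `(count + 1) % avail_cpu_count`.
uint32_t toy_pick_cpu(const uint32_t* avail, uint32_t avail_count,
  uint32_t scheduler_count, uint32_t offset)
{
  assert(avail_count > 0);
  return avail[(scheduler_count + offset) % avail_count];
}
```

With available CPUs `{0, 2, 4, 6}` and three scheduler threads, offset 0 selects CPU 6 and offset 1 wraps around to CPU 0.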

2 changes: 1 addition & 1 deletion src/libponyrt/sched/cpu.h
@@ -13,7 +13,7 @@ void ponyint_cpu_init();
uint32_t ponyint_cpu_count();

uint32_t ponyint_cpu_assign(uint32_t count, scheduler_t* scheduler,
bool nopin, bool pinasio);
bool nopin, bool pinasio, bool pinpat);

void ponyint_cpu_affinity(uint32_t cpu);
