Process state vulnerable to race conditions #75

Open
eriksoe opened this issue Jan 28, 2014 · 3 comments

@eriksoe
Contributor

eriksoe commented Jan 28, 2014

As it stands, the process state is insufficiently thread-safe.
We need a clear set of invariants, and we need to enforce them.

The set of invariants should apparently cover pstate, the exit
hooks, and the pid-to-process binding.

One example of a race condition: simultaneous updates of EProc.pstate:

  • A: Process P1 is terminating normally
    • A1: P1.pstate = DONE
  • B: Process P2 is performing kill(P1,go_away)
    • B1: P1.process_incoming_exit()
    • B2: Check pstate (EXIT_SIG/SENDING_EXIT/DONE)?
    • B3: Set pstate = EXIT_SIG

Interleaving: ...,B2,A1,B3 => P1 ends up having terminated normally but with pstate==EXIT_SIG.
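One way to close this window is to make B2 and B3 a single atomic step, so that a concurrent A1 either happens entirely before or entirely after it. The following is only a minimal sketch under assumptions: the class `PstateCasSketch` and its method names are hypothetical and are not Erjang's actual code; only the state names come from the description above.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical model of EProc.pstate; not Erjang's actual implementation. */
final class PstateCasSketch {
    enum PState { RUNNING, EXIT_SIG, SENDING_EXIT, DONE }

    private final AtomicReference<PState> pstate =
            new AtomicReference<>(PState.RUNNING);

    /** Step A1: normal termination. Succeeds only if the process is still RUNNING. */
    boolean finishNormally() {
        return pstate.compareAndSet(PState.RUNNING, PState.DONE);
    }

    /** Steps B2 and B3 fused: the check and the update form one atomic transition. */
    boolean deliverExitSignal() {
        while (true) {
            PState cur = pstate.get();
            if (cur == PState.EXIT_SIG || cur == PState.SENDING_EXIT || cur == PState.DONE) {
                return false; // already exiting or finished; nothing to overwrite
            }
            if (pstate.compareAndSet(cur, PState.EXIT_SIG)) {
                return true;
            }
            // Lost a race with another transition; re-read the state and retry.
        }
    }

    PState state() {
        return pstate.get();
    }

    public static void main(String[] args) {
        PstateCasSketch p1 = new PstateCasSketch();
        System.out.println(p1.finishNormally());    // true
        System.out.println(p1.deliverExitSignal()); // false
        System.out.println(p1.state());             // DONE
    }
}
```

With this shape, the interleaving B2,A1,B3 can no longer leave a normally terminated process in EXIT_SIG: the compare-and-set in B3 fails once A1 has moved the state to DONE.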

Another example, again with simultaneous updates of EProc.pstate, this time during startup:

  • A: Process P1 is starting up
    • A1: P1.execute1()
    • A2: P1.check_exit()
    • A3: P1.pstate = RUNNING
  • B: Process P2 is sending (e.g. propagating) an exit signal to P1
    • B1: P1.process_incoming_exit()
    • B2: Check pstate (EXIT_SIG/SENDING_EXIT/DONE)?
    • B3: Set pstate = EXIT_SIG

Interleaving: ...,A2,B3,A3 => P1 ends up having overlooked the exit signal.
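The startup case could be handled the same way, by fusing A2 and A3 into a conditional transition that fails if an exit signal has already arrived. This is a sketch under assumptions: the `INIT` state name, the class `StartupCasSketch`, and its methods are hypothetical and only illustrate the idea.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical model of process startup; not Erjang's actual implementation. */
final class StartupCasSketch {
    // INIT is an assumed name for the state before the process starts running.
    enum PState { INIT, RUNNING, EXIT_SIG }

    private final AtomicReference<PState> pstate =
            new AtomicReference<>(PState.INIT);

    /** Steps A2 and A3 fused: become RUNNING only if no exit signal got in first. */
    boolean tryStart() {
        return pstate.compareAndSet(PState.INIT, PState.RUNNING);
    }

    /** Steps B2 and B3 fused: record the exit signal atomically. */
    boolean deliverExitSignal() {
        PState cur;
        do {
            cur = pstate.get();
            if (cur == PState.EXIT_SIG) {
                return false; // a signal is already pending
            }
        } while (!pstate.compareAndSet(cur, PState.EXIT_SIG));
        return true;
    }

    PState state() {
        return pstate.get();
    }

    public static void main(String[] args) {
        StartupCasSketch p1 = new StartupCasSketch();
        System.out.println(p1.deliverExitSignal()); // true
        System.out.println(p1.tryStart());          // false
        System.out.println(p1.state());             // EXIT_SIG
    }
}
```

When `tryStart()` returns false, the starting process knows a signal is pending and can handle it instead of overwriting it with RUNNING.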

Example involving exit hooks:

  • A: Process P1 is terminating
    • A1: P1.do_proc_termination()
    • A2: Take snapshot of P1.exit_hooks
    • A3: Pid1.done()
    • A4: Pid1.task = null (unsynchronized!)
  • B: Another process is adding an exit hook to P1 (e.g. ETS table ownership transfer)
    • B1: Check that Pid1.task() != null (==P1) (unsynchronized!)
    • B2: P1.add_exit_hook()
    • B3: Add hook to P1.exit_hooks

Interleaving: ...,A2,...,B1,...,A4,...,B3 => Exit hook is never called.
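A possible remedy is to put the liveness check, the hook registration, and the snapshot under one lock, so that a hook is either included in the snapshot or rejected. The sketch below assumes a hypothetical `ExitHookSketch` class; the real EProc and EInternalPID code is structured differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical exit-hook registry; not Erjang's actual implementation. */
final class ExitHookSketch {
    private final Object lock = new Object();
    // Set to null once the process has terminated; guarded by lock.
    private List<Runnable> hooks = new ArrayList<>();

    /** Steps B1 to B3 fused: the liveness check and the registration share one lock. */
    boolean addExitHook(Runnable hook) {
        synchronized (lock) {
            if (hooks == null) {
                return false; // too late; the caller must handle the dead process
            }
            hooks.add(hook);
            return true;
        }
    }

    /** Steps A2 to A4 fused: take the snapshot and close the registry atomically. */
    void terminate() {
        List<Runnable> snapshot;
        synchronized (lock) {
            if (hooks == null) {
                return; // already terminated
            }
            snapshot = hooks;
            hooks = null;
        }
        // Run the hooks outside the lock so that they cannot deadlock with addExitHook.
        for (Runnable hook : snapshot) {
            hook.run();
        }
    }

    public static void main(String[] args) {
        ExitHookSketch p1 = new ExitHookSketch();
        System.out.println(p1.addExitHook(() -> System.out.println("hook ran"))); // true
        p1.terminate();                                                            // prints "hook ran"
        System.out.println(p1.addExitHook(() -> { }));                            // false
    }
}
```

A caller that gets false back, for example an ETS ownership transfer, can then fail with badarg instead of silently handing the table to a dead process.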

@eriksoe
Contributor Author

eriksoe commented Jan 28, 2014

Demo of the exit-hook problem:

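% Returns the ETS tables left behind by N give_away attempts (expected: []).
flood(N) ->
    Before = ets:all(),
    flood_loop(N),
    timer:sleep(1000),   % give the short-lived processes time to exit
    After = ets:all(),
    After -- Before.

flood_loop(0) -> ok;
flood_loop(N) when N>0 ->
    Pid = spawn(fun() -> ok end),   % exits almost immediately
    %Tab = ets:new(foo, [{heir, Pid, here_you_are}]),
    Tab = ets:new(foo, []),
    % Race: Pid may be terminating while ownership is transferred.
    try ets:give_away(Tab, Pid, here_you_are)
    catch _:badarg -> ets:delete(Tab)   % Pid already dead; clean up here
    end,
    flood_loop(N-1).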

flood(1000) returns [] on Erlang, as expected, but (often) a non-empty list on Erjang.

(As a bonus, the code triggers this race bug: a
java.lang.NullPointerException at erjang.EInternalPID.is_alive(EInternalPID.java), in the line
return task != null && task.is_alive();
when another thread sets task to null between the null check and the call.)
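The NullPointerException comes from reading the field twice. A common fix is to read it once into a local variable. This is a minimal sketch: the `PidSketch` class and its `Task` interface are hypothetical stand-ins for EInternalPID and its task, not the real types.

```java
/** Hypothetical stand-in for EInternalPID; not Erjang's actual implementation. */
final class PidSketch {
    /** Assumed minimal task interface, for illustration only. */
    interface Task {
        boolean isAlive();
    }

    // volatile so that a null written by the terminating thread becomes visible.
    private volatile Task task;

    PidSketch(Task task) {
        this.task = task;
    }

    /** Called by the terminating process; may run concurrently with isAlive(). */
    void done() {
        task = null;
    }

    boolean isAlive() {
        Task t = task; // single read: t cannot change between the check and the call
        return t != null && t.isAlive();
    }

    public static void main(String[] args) {
        PidSketch pid = new PidSketch(() -> true);
        System.out.println(pid.isAlive()); // true
        pid.done();
        System.out.println(pid.isAlive()); // false
    }
}
```

This removes the exception only. It does not by itself make the pid-to-process binding consistent with pstate or the exit hooks.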

@eriksoe
Contributor Author

eriksoe commented Jan 30, 2014

The process state rework is now done. For the time being it is in the branch 'process-lifecycle-consistency', pending review.

@krestenkrab
Contributor

Is this the fix in #77?
