Add clock vectors #16

bartoszmodelski · 2023-03-13T21:44:23Z

The core of DSCheck is closely following the design introduced in the DPOR paper, except for clock vectors, which are missing. This PR adds them.

Motivation

We want to be able to track happens before relationship as it is critical to an efficient search. For example, consider the following program:

Thread A 
a1: write x=1
a2: write y=2 

Thread B 
b1: write y=1 
b2: write x=2

If we were to do a full search, we'd explore 6 traces:

a1 a2 b1 b2 
a1 b1 a2 b2 
a1 b1 b2 a2 
b1 b2 a2 a2
b1 a1 b2 a2 
b1 a1 a2 b2

But many of these trace are redundant as far as the terminal state is concerned. For example, consider: a1 b1 a2 b2. Reordering of either the first two or the last two transitions leads to the same result (x=2, y=2). In fact, the only three possible terminal states are (x=1, y=2), (y=1, x=2), (x=2, y=2) and as long as these are covered, so are all possible realizations.

Results

There's a significant improvement on tests with multiple variables:

Artificial test. Number of traces goes down from 31 to 5.
Michael-Scott queue. The runtime on tests with 4 items goes down from 0.2 s to ~0.001 s. Eyeballing it, previously adding a single item increased the runtime 7-fold, while now it doubles it.

There's a 10-15% slowdown on test_list, which has all threads focus on a single variable, due to cost of extra accounting. If we don't find any further optimizations here in the future, we can always add a flag for fast-path here.

test_naive_counter is too small to be affected.

Design

I followed the section 4.2 and figure 5 in DPOR paper. It was not that trivial due to sheer amount of notation but it should be easier to verify than to translate for the first time.

I think the best resource is the paper itself. In part because I'm not an expert in this area, and another pair of eyes looking at this could potentially discover things I've missed (even in terms of building understanding). Having said that, I'm super happy to jump on a call or try to explain more here.

I'd recommend a quick read of sections 1-3, and spending more time on section 4. The paper explains vector clocks but I've supported myself with Martin Kleppmann's excellent short lecture on them.

art-w · 2023-03-14T10:41:51Z

tests/test_disjoint.ml

+          Atomic.set a 2)
+      |> ignore;
+
+      Atomic.final (fun () -> ()));


It should help to insert some debug print here:

Atomic.final (fun () -> Format.printf "a=%i b=%i c=%i d=%i@." (Atomic.get a) (Atomic.get b) (Atomic.get c) (Atomic.get d)));

and investigate why it outputs these terminal states:

a=2 b=2 c=2 d=2 a=2 b=2 c=2 d=2 a=2 b=2 c=2 d=1 a=2 b=2 c=2 d=2 a=2 b=2 c=2 d=1

rather than:

a=2 b=2 c=2 d=2 a=2 b=2 c=2 d=1 a=2 b=2 c=1 d=1 a=2 b=1 c=1 d=1 a=1 b=1 c=1 d=1

bartoszmodelski · 2023-03-16T14:52:31Z

Ok, I think I understand what's going on. The mechanism with vector clocks is built for tracking a much stronger relationship than we have between SC atomics. That is, happens-before occurs due to program order and in synchronization points, e.g. thread spawn, thread join, locks, etc. These are the only places where are allowed to merge clocks of two threads (max them and update the clock of thread that was waiting if I understand correctly). It was not quite obvious from the DPOR paper since they only make this specialization in section 4 but CDSCheck one makes it more explicit.

We should add support for join, locks, etc. at some point to be able to verify a wider range of real programs but for now I'm going to hold off with this change.

Thanks @art-w for help with this example!

bartoszmodelski added 4 commits March 13, 2023 17:15

add test with multiple locations

6097426

clock vector

5979b37

refactor 1

be0bf24

refactor 2

127f70f

bartoszmodelski requested a review from a team March 13, 2023 21:46

art-w reviewed Mar 14, 2023

View reviewed changes

bartoszmodelski removed the request for review from a team March 16, 2023 14:52

bartoszmodelski marked this pull request as draft March 16, 2023 14:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add clock vectors #16

Add clock vectors #16

bartoszmodelski commented Mar 13, 2023 •

edited

Loading

art-w Mar 14, 2023

bartoszmodelski commented Mar 16, 2023

Add clock vectors #16

Are you sure you want to change the base?

Add clock vectors #16

Conversation

bartoszmodelski commented Mar 13, 2023 • edited Loading

Motivation

Results

Design

art-w Mar 14, 2023

Choose a reason for hiding this comment

bartoszmodelski commented Mar 16, 2023

bartoszmodelski commented Mar 13, 2023 •

edited

Loading