
Side-effectful Nontermination [DRAFT] #772

Open · wants to merge 18 commits into base: development

Conversation

@ThomasHaas (Collaborator) commented Nov 7, 2024

This PR adds a new, improved encoding that can detect some cases of side-effectful non-termination.
I added 5 new benchmarks to specifically test this, and updated the verdicts of the existing liveness tests that are now correctly identified as FAIL rather than UNKNOWN.
The encoding is functional and replaces the old encoding for deadlocks/liveness issues.
So feel free to test this branch on some real code.

Important notes:

  • In the presence of side-effectful loops, we generally need to unroll the code at least twice (B=2), and possibly even more times, to find liveness issues (a minimal example is sketched after this list).
  • For side-effectful loops, we can only find liveness issues but no proof of termination! This means that no previous UNKNOWN verdict will become a PASS with these changes, only possibly a FAIL!
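
A minimal sketch of the kind of side-effectful loop this targets (an illustrative example, not one of the new benchmarks): the loop below never terminates and performs a visible write in every iteration, so the encoding has to look at more than one unrolled iteration to detect the violation.

#include <stdatomic.h>

atomic_int flag;     // never set by any other thread in this sketch
atomic_int progress;

void spin(void) {
    // Side-effectful non-termination: every iteration writes to shared memory,
    // so this is not a simple side-effect-free spin loop.
    while (atomic_load(&flag) == 0) {
        atomic_fetch_add(&progress, 1);
    }
}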

The reasons that this PR is marked as DRAFT are:

  • The new encoding might have issues with uninit reads (we have no test for this). Even the old one might have had issues here.
  • The new encoding currently also detects liveness issues that are caused by assertion failures, which the old one did not. This is easy to fix, but we also have no test cases for this yet.
  • The new encoding does not reason about non-termination due to control barriers at all. We have no tests involving non-termination due to CBs.
  • We can detect many liveness issues in some of the currently skipped litmus tests for Vulkan. However, the liveness issue is only present under weak progress, so if I remove the test from the skip list, at least the LitmusVulkanFairLivenessTest will fail. We would need to be able to enable tests only for certain progress models and skip them for others.
  • EDIT: Fairness of spuriously failing events like StoreExclusive is not considered so far. We also do not have tests for this scenario.

Because of points (1) and (4), I have not yet deleted the old code (it is unused though).

Updated expected values in LivenessTest for existing tests that now show liveness violations (from UNKNOWN to FAIL)
@ThomasHaas (Collaborator, Author) commented Nov 7, 2024

The existing liveness tests seem to be correct, and the change from UNKNOWN to FAIL was not caused by side-effectful nontermination but by point (2) I mentioned above. The relaxation of some barriers caused an assertion failure, which then caused some thread to abort and not release its lock, which was then treated as a liveness issue.
I will fix this and revert the changes to the tests.

EDIT: Fixed the issue and reverted expected outcome of some tests.

Fixed Nontermination verdict if nontermination was caused due to assertion failure.
Updated expected outcomes of LivenessTest caused by the above change.
@hernanponcedeleon (Owner)

  • We can detect many liveness issues in some of the currently skipped litmus tests for Vulkan. However, the liveness issue is only present under weak progress, so if I remove the test from the skip list, at least the LitmusVulkanFairLivenessTest will fail. We would need to be able to enable tests only for certain progress models and skip them for others.

Can't we add a configurable option (similar to several of our providers) that allows us to define a more fine-grained skip predicate, and define that predicate based on the progress model?

@ThomasHaas (Collaborator, Author)

We could have skip lists per progress model, but can't we instead just comment out the tests in the expected files? Those files are already defined on a per-progress-model basis.

@hernanponcedeleon (Owner)

Having a skip list per model is not maintainable. If at some point we notice that, e.g., the arch can also influence these things, we need yet another level of splitting.

can't we instead just comment out the tests in the expected files?

As long as we do not lose the information we currently have (e.g., removing the lines is a "no go"), it should be fine.

@ThomasHaas (Collaborator, Author)

Having a skip list per model is not maintainable. If at some point we notice that, e.g., the arch can also influence these things, we need yet another level of splitting.

Using providers to skip the tests would effectively also be a skip list per model, just written in code (unless you find a way to skip tests based on something other than their name).

@ThomasHaas (Collaborator, Author) commented Nov 11, 2024

Little update: this branch can give wrong verdicts on some simple cases where SCCP eliminates loop counter variables and the loop termination condition entirely.

int i = 0;
while (i < 10) { i++; }

The above program is wrongly considered non-terminating if unrolled insufficiently.
The reason is that after unrolling + SCCP, the resulting code is essentially indistinguishable from a while (true) loop.
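
Roughly (an illustrative reconstruction, not Dartagnan's actual intermediate representation), unrolling with B=2 plus constant propagation turns the program above into something like:

// iteration 1: condition (0 < 10) folds to true, i := 1
// iteration 2: condition (1 < 10) folds to true, i := 2
// remaining iterations are cut off at the unrolling bound; with the counter
// and exit condition folded away, what is left looks like a loop that never exits:
while (1) { }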

EDIT: This is fixed now

@hernanponcedeleon (Owner)

I suspect that the problem is not SCCP, but rather DeadAssignmentElimination, which removes side effects and might not really be sound when we want to check liveness.

@ThomasHaas (Collaborator, Author)

It's both, actually. SCCP makes the assignments dead, and even if they were not deleted, I would like my code to not care about variables that are not live inside the loop (I have not implemented this yet, but I will eventually).
The solution is similar to what we already do: instrument the loop properly. Then SCCP/DAE can run as usual because the instrumentation will capture the necessary information for liveness detection.
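
A rough sketch of the instrumentation idea on the example above (hypothetical variables, not the actual transformation Dartagnan performs): the loop is annotated before SCCP/DAE run, so the information needed for non-termination detection survives even if the original assignments are optimized away.

int i = 0;
int __prev_i = -1;          // instrumentation: loop-relevant state of the previous iteration
int __state_repeated = 0;   // instrumentation: set if an iteration re-enters a seen state
while (i < 10) {
    if (i == __prev_i)
        __state_repeated = 1;   // the loop could repeat this iteration forever
    __prev_i = i;
    i++;
}
// Even if SCCP/DAE later simplify the body, __state_repeated records whether
// the loop could revisit a state, which is what the liveness encoding needs.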

…detection: this removes a wrong verdict related to SCCP removing the whole loop body.

Added new test related to above issue.
Minor updates to related code.
@ThomasHaas (Collaborator, Author)

How necessary is it to support non-termination due to blocked ControlBarriers? We don't have any tests for this, nor do we actually know how scheduling in the presence of blocked ControlBarriers works (the blocked thread cannot be rescheduled, I guess).

@hernanponcedeleon (Owner)

The best would be to have support for non-termination for a given encoding of BlockedBarrier (whatever that one is). Correctly handling that is orthogonal to this PR (and actually related to #768).

@ThomasHaas (Collaborator, Author)

Then I will copy over whatever we are doing right now for blocked control barriers into the new non-termination detection.

@hernanponcedeleon (Owner)

@ThomasHaas as you asked me, I added / enabled some more tests (I am still working on a test for fairness of store conditionals).

About the CADP tests: there are quite a few examples that are supposed to pass but for which we report a violation. IIRC, the expected results were automatically generated from the artifact of the forward progress paper based on the strong fairness conditions. Here is a link where one can easily check what the expected results from their side are.

@hernanponcedeleon (Owner)

Since we compile STLXR to weak RMWStoreExclusive + ExecutionStatus, and assuming we still do not consider the fairness of the former, shouldn't we report a liveness violation for this program?

AArch64 casloop
{
0:X0=x;
x=0;
}
 P0               ;
 L1:              ;
 LDAXR W1,[X0]    ;
 CMP W1,#1        ;
 B.NE L2          ;
 MOV W2,#2        ;
 STLXR W3,W2,[X0] ;
 CBNZ W3,L1       ;
 L2:              ;

@ThomasHaas (Collaborator, Author)

Have you tried that program on this branch? On development we have a rough fix for fairness on store conditionals (we just don't have a test validating that).
Also, I'm not sure about the assembly code. Does the branch happen on successful comparison or failing one? Is the code really correct?

@hernanponcedeleon (Owner)

Have you tried that program on this branch? On development we have a rough fix for fairness on store conditionals (we just don't have a test validating that).

I did not remember that we already had this. It seems to work: if I disable the store fairness encoding, the new test fails.

Also, I'm not sure about the assembly code. Does the branch happen on successful comparison or failing one? Is the code really correct?

The first jump condition was wrong; it should have checked that the value is different from zero. Fixed in the last commit.
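
For intuition, a C11-atomics reading of the corrected retry loop (illustrative only; a weak compare-exchange is used here to model the LDAXR/STLXR pair, which may fail spuriously): without assuming fairness of the store-exclusive, the retry can repeat forever even though nothing else touches x.

#include <stdatomic.h>

atomic_int x;   // initially 0

void P0(void) {
    int v;
    do {
        v = atomic_load_explicit(&x, memory_order_acquire);   // LDAXR
        if (v != 0)
            return;                                            // exit once x is non-zero
        // STLXR may fail spuriously; without fairness of store-exclusives,
        // this retry loop need not make progress.
    } while (!atomic_compare_exchange_weak_explicit(
                 &x, &v, 2,
                 memory_order_release, memory_order_relaxed)); // STLXR + CBNZ retry
}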

@ThomasHaas (Collaborator, Author)

The SPIRV barrier benchmark you added should be FAIL rather than PASS, no?
Could you also add a version that does not have the conditional, i.e., that uses the barrier correctly?

@hernanponcedeleon (Owner)

The SPIRV barrier benchmark you added should be FAIL rather than PASS, no?

Yes, my bad.

Could you also add a version that does not have the conditional, i.e., that uses the barrier correctly?

I slightly modified the code to change the expected result based on the configuration.

@ThomasHaas (Collaborator, Author)

I slightly modified the code to change the expected result based on the configuration.

Hmm, while this is a safe version, I think it does not really exhibit waiting behaviour of the control barrier since there is nothing to wait for.
I would rather have a more general test here.

Btw. how do you generate the SPIRV code? Also, do you do the configurations manually or does the compiler also generate them? Maybe I can compile some code myself if I know how to.

@hernanponcedeleon (Owner)

Btw. how do you generate the SPIRV code? Also, do you do the configurations manually or does the compiler also generate them? Maybe I can compile some code myself if I know how to.

You need both the clspv compiler and SPIRV-Tools.

clspv benchmarks/opencl/NonUniformBarrier.cl --cl-std=CL3.0 --inline-entry-points --spv-version=1.6 -o kernel.spv
spirv-dis kernel.spv > kernel.spv.dis
spirv-opt --upgrade-memory-model kernel.spv.dis -o dartagnan/src/test/resources/spirv/basic/UniformBarrier-opt.spv.dis

and then manually add the configuration.

The last step was failing for this test, so I simply changed the memory model manually (OpMemoryModel Logical Vulkan).
However, for more complex tests (e.g., where you need visibility/availability), doing the changes manually is painful.

@hernanponcedeleon (Owner)

Hmm, while this is a safe version, I think it does not really exhibit waiting behaviour of the control barrier since there is nothing to wait for. I would rather have a more general test here.

BTW, even if the test seems trivial, it already shows a bug introduced in this branch. The kernel should deadlock if the WG has more than one thread. In this branch we (wrongly) report PASS while in development we get FAIL.
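
For reference, a minimal OpenCL-style sketch of such a non-uniform barrier (illustrative only, not necessarily the actual NonUniformBarrier.cl benchmark): only some threads of the work-group reach the barrier, so the kernel deadlocks whenever the work-group has more than one thread.

__kernel void non_uniform_barrier(__global int *out) {
    if (get_local_id(0) == 0) {
        // Only thread 0 arrives at the barrier; the others skip it,
        // so thread 0 waits forever once the work-group has more than one thread.
        barrier(CLK_GLOBAL_MEM_FENCE);
    }
    out[get_global_id(0)] = 1;
}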

@ThomasHaas (Collaborator, Author)

Yes, because I didn't implement control barriers at all, i.e., there was not a single line of code handling them.
I fixed this locally already but I have no tests besides this trivial one.

Pure control-barrier nontermination actually requires nothing of the new theory: it is a standard reachability problem, as no thread does anything anymore, and thus is witnessed by standard finite executions.
The difficulty here is just ensuring that CBs are correctly waiting for each other, for which we do not have good tests yet.

The most interesting type of non-termination would involve both barriers and loops: some threads get stuck in barriers while others are looping (possibly even with barriers inside the loop).
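
A hypothetical sketch of that combined case (illustrative only; no such benchmark exists yet in this PR): one thread spins on a flag that would only be set after the barrier, while the remaining threads wait at the barrier for it.

__kernel void loop_vs_barrier(__global volatile int *flag) {
    if (get_local_id(0) == 0) {
        while (*flag == 0) { }        // spins forever: the flag is only set after the barrier
    }
    barrier(CLK_GLOBAL_MEM_FENCE);    // never completes: thread 0 never arrives
    if (get_local_id(0) == 1) {
        *flag = 1;                    // would release thread 0, but is never reached
    }
}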
