Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update array API tests without adding complex number support #37

Merged
merged 33 commits into from
Sep 3, 2024

Conversation

cbourjau
Copy link
Collaborator

@cbourjau cbourjau commented Aug 1, 2024

This PR updates the array API test suite to the latest upstream version plus a minor custom fix that is in the process of being upstreamed (upstream merged the fix). The fix allows us to run the test suite despite currently not having support for complex numbers, yet (see #32 for more information on adding those).

The skips.txt file contains failures that cause segfaults in the onnxruntime. xfails.txt contains test names which fail for other reasons. We should triage both of these files.

Edit: Upated the description to reflect the latest circumstance of this PR

@adityagoel4512
Copy link
Member

Does this PR supersede #33?

@adityagoel4512
Copy link
Member

adityagoel4512 commented Aug 1, 2024

#39 will address the 1 failure. The ONNX standard doesn't support pow on int8 or int16.

@cbourjau cbourjau changed the title Update array api tests without adding complex number suppoert Update array API tests without adding complex number support Aug 1, 2024
@cbourjau
Copy link
Collaborator Author

cbourjau commented Aug 1, 2024

Does this PR supersede #33?

Yes! Sorry for the confusion!

skips.txt Outdated Show resolved Hide resolved
@cbourjau cbourjau marked this pull request as draft August 12, 2024 13:50
@cbourjau cbourjau force-pushed the update-api-tests-without-complex branch from ae388b0 to 455e502 Compare August 12, 2024 22:53
@cbourjau
Copy link
Collaborator Author

cbourjau commented Aug 13, 2024

Brief update:

The latest test suite runs quite fast, allowing us to run on many more hypothesis examples than previously (100 instead of 2 at the time of writing). This did surface some further issues. In particular, I discovered that we can trigger segfaults in onnx from the test_roll (example) and test_broadcast_arrays (example) tests.

@cbourjau cbourjau mentioned this pull request Aug 13, 2024
@adityagoel4512
Copy link
Member

adityagoel4512 commented Aug 13, 2024

Brief update:

The latest test suite runs quite fast, allowing us to run on many more hypothesis examples than previously (100 instead of 2 at the time of writing). This did surface some further issues. In particular, I discovered that we can trigger segfaults in onnx from the test_roll (example) and test_broadcast_arrays (example) tests.

Nice, thanks! If it's faster now it would be great to have skips.txt exclusively contain any tests that segfault upstream rather than assertion fails.

@cbourjau
Copy link
Collaborator Author

The full list of failures as of this writing can be found in the CI of this commit.
As suggested, the failures that may cause segfaults are in skips.txt; all others are in xfails.txt. Nonetheless, I didn't configure the test suite to hard-fail on XPasses (i.e. xfails that pass unexpectedly) since some of them are flakey even with the higher number of examples.

Copy link
Member

@adityagoel4512 adityagoel4512 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few XPASSes.

xfails.txt Outdated Show resolved Hide resolved
xfails.txt Show resolved Hide resolved
xfails.txt Show resolved Hide resolved
xfails.txt Outdated Show resolved Hide resolved
xfails.txt Outdated Show resolved Hide resolved
xfails.txt Outdated Show resolved Hide resolved
xfails.txt Show resolved Hide resolved
xfails.txt Show resolved Hide resolved
xfails.txt Outdated Show resolved Hide resolved
@cbourjau
Copy link
Collaborator Author

There are a few XPASSes.

I'm afraid some of them are just flaky. I generated these files from observed failures.

pixi.toml Outdated Show resolved Hide resolved
@cbourjau
Copy link
Collaborator Author

cbourjau commented Sep 3, 2024

I am now running the array-api-tests using a fixed seed. Thus, we should no longer get flaky XPASSes. I think this is now ready to merge even if we xfail a lot of tests liberally. We should address these in follow up PRs, IMHO.

@cbourjau cbourjau marked this pull request as ready for review September 3, 2024 14:50
@cbourjau
Copy link
Collaborator Author

cbourjau commented Sep 3, 2024

Apparently the tests are still a bit flaky :( The XFAILS you just added were passing previously: https://github.com/Quantco/ndonnx/actions/runs/10677594899/job/29593026155#step:5:1263

@adityagoel4512
Copy link
Member

adityagoel4512 commented Sep 3, 2024

Apparently the tests are still a bit flaky :( The XFAILS you just added were passing previously: https://github.com/Quantco/ndonnx/actions/runs/10677594899/job/29593026155#step:5:1263

I'm not sure how much marginal value there is in 100 iterations of each test compared to around 20. I'd rather make it 20 and have super snappy CI again.

Copy link
Member

@adityagoel4512 adityagoel4512 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reduced the number of repetitions to 16 in the interest of having sub 10 minute CI. I think we can increase it once we've pruned down the list to catch more obscure edge cases but there is a reasonable amount to do as is. The added xfails are a little concerning but we can come back to them in a future PR.

@cbourjau cbourjau merged commit 75de95e into main Sep 3, 2024
16 checks passed
@cbourjau cbourjau deleted the update-api-tests-without-complex branch September 3, 2024 21:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants