
[WIP] Add systematic tests #1072

Open · wants to merge 6 commits into main
Conversation

simsurace (Contributor)

In order to find possible failure modes of Enzyme, the idea of this PR is to iterate over "all" functions in Base and the stdlibs and test whether Enzyme gives the correct derivatives. Currently, the test set is probably noisy and fairly limited (partly to save CI time):

  • It only includes LinearAlgebra.
  • It is limited to 1- and 2-argument functions.
  • It only checks functions that take scalars, vectors, or matrices and return real numbers or arrays of real numbers (excluding Bool).

Still, it may already catch some interesting failures.
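For reference, an illustrative sketch of the kind of loop this describes (not the PR's actual test code; FiniteDifferences serves as the reference here, and the filtering is simplified to real-valued functions of one matrix):

```julia
# Illustrative sketch, not the PR's actual test code: walk the exported
# names of LinearAlgebra, keep real-valued functions of one matrix
# argument, and compare Enzyme's gradient against finite differences.
using Enzyme, LinearAlgebra
using FiniteDifferences: central_fdm, grad

for name in names(LinearAlgebra)
    f = getfield(LinearAlgebra, name)
    f isa Function || continue
    x = rand(3, 3)
    y = try
        f(x)
    catch
        continue  # skip functions that do not accept a single matrix
    end
    (y isa Real && !(y isa Bool)) || continue  # real scalar outputs only
    dx = zero(x)
    try
        # reverse-mode AD accumulates the gradient into dx
        Enzyme.autodiff(Reverse, f, Active, Duplicated(x, dx))
        fd = grad(central_fdm(5, 1), f, x)[1]
        isapprox(dx, fd; rtol = 1e-5) || @warn "Wrong derivative for $name"
    catch err
        @warn "Enzyme failed on $name" err
    end
end
```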

codecov-commenter commented Sep 23, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison: base (8981a34) 74.34% vs. head (d530851) 93.28%.


Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1072       +/-   ##
===========================================
+ Coverage   74.34%   93.28%   +18.93%     
===========================================
  Files          35        8       -27     
  Lines       10307      253    -10054     
===========================================
- Hits         7663      236     -7427     
+ Misses       2644       17     -2627     


wsmoses (Member) commented Sep 24, 2023

@simsurace you mentioned some functionality to automatically submit minimal tests (which would be tracked) for current failures?

Is that possible to do (it would be immensely useful)?

simsurace (Contributor, Author)

Yeah, I will think about how to do that. From a failed test it should be easy to generate code for an MRE. But maybe we first need to clarify whether the current way of detecting failures is the correct one: there seem to be hundreds of failures even in the current test set, and you probably don't want thousands of open issues to sift through by hand (assuming we test other modules as well). So for now the better option may still be to open issues by hand (e.g. the eigmax one would be the first, because it is fatal, i.e. it stops the whole test suite).
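For instance, a recorded failure could be rendered as a copy-pasteable snippet along these lines (a hypothetical sketch; the record's `fn`/`x` fields are made up for illustration):

```julia
# Hypothetical sketch: render a recorded failure as a copy-pasteable MRE.
# The failure record's fields (fn, x) are made up for illustration.
function mre_code(failure)
    """
    using Enzyme, LinearAlgebra
    x = $(repr(failure.x))
    Enzyme.gradient(Reverse, $(failure.fn), x)
    """
end

failure = (fn = "LinearAlgebra.eigmax", x = rand(3, 3))
print(mre_code(failure))
```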

wsmoses (Member) commented Sep 24, 2023

Maybe we could have a single issue with tasks that are auto-populated/updated?

simsurace (Contributor, Author)

Great idea. I need to figure out how to do this.
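One minimal sketch of the mechanism, using the GitHub CLI from Julia (the issue number 9999 is a placeholder, and an authenticated `gh` is assumed on the CI runner):

```julia
# Hypothetical sketch: regenerate the body of a single tracking issue as
# a task list of currently failing functions. Issue number 9999 is a
# placeholder; an authenticated `gh` CLI is assumed on the CI runner.
function update_tracking_issue(failures::Vector{String})
    body = join(("- [ ] `$f`" for f in failures), "\n")
    run(`gh issue edit 9999 --body $body`)
end

update_tracking_issue(["LinearAlgebra.eigmax", "LinearAlgebra.logdet"])
```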

wsmoses (Member) commented Jan 27, 2024

Bump: @simsurace, did you have a chance to work on this?

simsurace (Contributor, Author)

I think the whole automation involving auto-updating of GitHub issues is beyond my current capacity. Would the tests as they are now be useful as a non-blocking CI step? Or could we maybe extract a bunch of MREs by hand?

simsurace (Contributor, Author)

I've thought about it a bit. From my understanding, these are the steps that would need to be solved to make an automatic solution work:

  • Figure out how to extract all relevant information about the error as a string/text file (a minimal sketch of this step follows below).
  • Figure out how to post or update an issue on the GitHub repo from within Julia or bash (including the string or text file from above), perhaps using an id that can be checked to avoid duplicate issues.

Then the loop over modules and functions could be changed so that it only runs until the first error, saves the whole error output to a file, and then updates the issue with a code block corresponding to the call that triggered the error, attaching the file with the stacktrace.

I don't think any of this is very hard, but it's stuff I need time to figure out. So if someone has done this before, they would certainly be in a much better position to get it done quickly. Anyway, I will try to remember this task when I have some time available.
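A minimal sketch of the first step (capturing the error and stacktrace as a string, plus a stable id for de-duplication):

```julia
# Sketch of the first step: run a candidate call and, on failure, capture
# the error message plus stacktrace as a string, together with a stable
# id that an issue-updating script could check to avoid duplicates.
function capture_failure(f, args...)
    try
        f(args...)
        return nothing
    catch err
        report = sprint(showerror, err, catch_backtrace())
        id = string(hash((nameof(f), typeof.(args))))
        return (; id, report)
    end
end
```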

gdalle (Contributor) commented Jun 24, 2024

This might actually be easier with DifferentiationInterfaceTest: we could define scenarios for all Base functions and just use test_differentiation to run them.
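Roughly along these lines (a sketch only; the exact Scenario constructor and test_differentiation keywords depend on the DifferentiationInterfaceTest version):

```julia
# Rough sketch against DifferentiationInterfaceTest; the Scenario
# constructor and test_differentiation keywords may differ by version.
using DifferentiationInterface, DifferentiationInterfaceTest
using Enzyme: Enzyme
using LinearAlgebra

A = rand(3, 3)
scenarios = [
    # gradient of tr(A) w.r.t. A is the identity matrix
    Scenario{:gradient,:out}(tr, A; res1 = Matrix{Float64}(I, 3, 3)),
    # gradient of det(A) w.r.t. A is det(A) * inv(A)'
    Scenario{:gradient,:out}(det, A; res1 = det(A) * transpose(inv(A))),
]

test_differentiation(AutoEnzyme(), scenarios)
```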

See also #1563
