Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rewrite test_select to run in a few seconds. #1665

Merged
merged 9 commits into from
Oct 3, 2023

Conversation

gpx1000
Copy link
Contributor

@gpx1000 gpx1000 commented Mar 7, 2023

Test select getting optimized for runtime.

test_conformance/select/test_select.cpp Outdated Show resolved Hide resolved
test_conformance/select/test_select.cpp Outdated Show resolved Hide resolved
@lakshmih
Copy link
Contributor

Need more time to review this.

@bcalidas
Copy link

There are major concerns with this PR. I would suggest a complete re-evaluation by the author.

@gpx1000
Copy link
Contributor Author

gpx1000 commented Mar 24, 2023

Sorry for that; I will naturally re-evaluate and re-submit when I have a better version.

@bashbaug
Copy link
Contributor

Removing "focused review" until the review issues are addressed - thanks!

# Conflicts:
#	test_conformance/select/test_select.cpp
…le change report on each change's cost savings:

 BEFORE:
        real    47m8.497s
        user    48m8.860s
        sys     0m14.952s
AFTER:
        real    17m53.383s
        user    18m53.342s
        sys     0m13.297s

initSrcBuffer generates the same random noise every iteration through the loop.  There is no change to the arguments, and the host data itself doesn't need to get rewritten.  Profiling realizes a 2 times speed accel from simply relying upon the buffer to remain randomized at the next loop iteration.
        real    17m53.383s
        user    18m53.342s
        sys     0m13.297s
AFTER:
real    12m26.035s
user    13m15.505s
sys     0m15.414s

rearrange a few things in the loops to allow for vectorized / interleaved loop traversal.  NB: not all loops are vectorizable obviously; but this addresses the worst offenders.  Also note, to enable compiler to generate vectorized and interleaved loop traversal build with -o3.
lakshmih
lakshmih previously approved these changes Sep 18, 2023
bashbaug
bashbaug previously approved these changes Sep 25, 2023
Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this doesn't quite reduce execution time to "a few seconds" but I'm definitely seeing a nice improvement and there doesn't seem to be any lost coverage.

I tested one of our devices with the command line test_select select_int_int.

For the old version, before these changes:

$ time ./test_conformance/select/test_select select_int_int
...
real	1m19.207s
user	1m9.578s
sys	0m11.316s

For the new version, with these changes:

real	0m33.572s
user	0m23.800s
sys	0m11.626s

Thanks!

Note, there is a merge conflict that needs to be resolved before merging.

# Conflicts:
#	test_conformance/select/test_select.cpp
@gpx1000 gpx1000 dismissed stale reviews from bashbaug and lakshmih via a75cde5 September 26, 2023 01:04
@gpx1000
Copy link
Contributor Author

gpx1000 commented Sep 26, 2023

I'm reusing the PR from the first attempt at this. The first attempt does complete in a few seconds but at the expense of losing the number of test cases. I merged / resolved the merge conflict.

@bashbaug
Copy link
Contributor

bashbaug commented Oct 3, 2023

Merging as discussed in the October 3rd teleconference.

@bashbaug bashbaug merged commit 43d6886 into KhronosGroup:main Oct 3, 2023
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants