-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce the duration provider pattern #86
Introduce the duration provider pattern #86
Conversation
9324ab6
to
f88f64f
Compare
f88f64f
to
ec406db
Compare
@@ -27,6 +27,8 @@ Naming/FileName: | |||
- lib/flatware-*.rb | |||
Style/Documentation: | |||
Enabled: false | |||
Metrics/AbcSize: | |||
Max: 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change has added a bit of extra AbcSize so RUbocop started failing.
flatware/lib/flatware/rspec/cli.rb
Lines 25 to 26 in ec406db
duration_provider = RSpec::DurationProviders.lookup(options['duration-provider']) | |
jobs = RSpec.extract_jobs_from_args rspec_args, workers: workers, duration_provider: duration_provider |
I looked to see if there is any meaningful refactoring opportunity here, but even a few deliberate changes (e.g. just moving lines around, extracting and putting it in a different file, etc) didn't get the Rubocop check passing. I don't think I can come up with meaningfully refactored code out of this low AbsSize setting, so I just increased the size.
I know people have opinions about this, but this is becoming a bit of a barrier to new contributors...
@yuki24 Thanks for your work on this! Curious, have you measured timing for that giant suite with and without time balanced jobs. I added the feature because other parallel runners had it and it seemed fun to write, but I haven't found a suite in the wild where time balancing has a consistent speedup effect. If so, I'd also be interested in removing the use of Plus there are issues with So time balancing just doesn't work if you have a custom boot. And I have consistently found, so far, that a custom boot saves more time than time balancing. If this was a config option like In short, since you're the first user I've spoken to who finds the time balancing feature useful, I'd rather see it changed to meet your needs then introduce an adapter to support a, in my experience, pretty useless version. Is that something you'd be interesting in building? |
@briandunn Thanks for your thoughtful reply! Time balancing has definitely been useful for not just one but a handful of relatively large projects I have worked on in the past. 80k is probably the largest, but even for 10-20k evenly distributing tests has always cut 1-7min of test execution time, depending on how long larger spec files took. I would think that ~1k specs may not be large enough for time balancing to be useful. The problem is, RSpec can only distribute tests by file, not by example. So if a large spec file that takes e.g. 5min happened to run at the end of the suite in one of the workers, only a single worker (or a CPU) would be busy while the rest of the CPUs would be doing nothing for the duration of the last execution. Even with time balancing, we have consistently observed test imbalance, resulting in 1-2min of extra execution time. In order to solve all this, we have also implemented a centralized queue based off of
By executing from 1 to 3 we have been able to reduce even the last bit of imbalance, which was around 1-2min. Of course, this comes with a few caveats where specs are not guaranteed to be as independent as complete randomization, etc. However, we think this is a reasonable trade-off given that reducing 1min, with 90 distributed workers executing 80k examples, is really, really difficult. In summary, we wanted to do more than just introducing a new pattern. I know this is a lot to digest, so I would like to learn more about what you think about this. Any feedback is appreciated. |
Closing this PR as I think this new pattern wouldn't be useful for both of us regardless. We will experiment more within our app and get back (or start a new conversation) once we feel good about our experiments. Thank you! |
I did not know time balancing offered such gains that's awesome. The original design for flatware held a queue in a "Ventilator". When a worker finished one
I think the responsibility could be moved to the I would be interested in reintroducing this strategy if it can be done reliably. @yuki24 Do you think such an approach would be helpful in your use case? |
@briandunn Apologies for the delay. I did not know that there was a name for what we implemented! We are still experimenting with this implementation, but overall we are more than happy to give back to the community, and we will get back to you as soon as we feel comfortable with it. (1) is definitely an issue, as we had to hack around RSpec core. We have a working PoC in a private repo, but we aren't extremely happy with the hack for forcefully resetting
This is exactly what we did. It would bring in extra complexity for sure. However, personally, this is a much smaller concern to me as the added complexity is small enough and quite manageable. |
First off, thanks so much for crafting such a fantastic gem! I’m a big fan of how
flatware
handles the forking and dRuby communication—it really makes a difference.Background
We’ve started using flatware in one of our projects and hit a snag with the RSpec example status persistence file. It got massive—over 500MB, due to the existence of more than 80,000 examples—and it was slowing down our tests almost more than just running them!
We tried a workaround with
parallel_tests
'ParallelTests::RSpec::RuntimeLogger
, which trimmed the data down to just 500KB (mostly because it only tracks the execution time for each file, rather than for each example.) We also had to monky-patch theJobBuilder
a bit to make it work, but it felt a bit like a hack and might be a pain to maintain.What This PR is about
So, I'm proposing a new "Duration Provider" pattern in this pull request. It would let us and other users customize how duration calculations are done without messing too much with
JobBuilder
. This should make things more flexible and cleaner overall.Here’s a peek at the custom provider we whipped up:
We then use this command to run the tests:
$ # The bin/flatware file is also our own executable that `require`s the provider above: $ bin/flatware rspec -d parallel_tests_runtime ...
I think this could really streamline how we manage custom durations and keep things neat. Let me know what you all think.
/cc @zzak @kei-s