Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use dependency chain depth as a secondary topological sort key #293

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

kjaget
Copy link

@kjaget kjaget commented Aug 8, 2020

This PR adds code to add heuristics to speed up catkin builds.

The background - our code has a several dependency chains, and due to alphabetical sorting, some of the prereqs don't get built until late in the list of packages. This often leads to the case where these prereqs are the only builds running. Unrelated packages are done and the remaining packages depend on the one package building. This hurts parallelism and slows down our builds.

Our case might be unusual since we're building natively on embedded hardware, so we're probably not a normal use case. With this PR we save about 1.5 minutes on a clean build, or ~15%. The PR should help when running on faster hardwae even if the magnitude of savings won't be as impressive.

The PR adds additional sort criteria to the topological sort. The primary is still correctness w.r.t. dependencies. The new criteria are used each iteration of the sort to pick from the list of packages which have their dependencies satisfied.

First off, it calculates the length of the longest dependency chain between a given package and all of the paths which lead to leaf nodes in the tree (the packages which are not prereqs for anything). The package with the longest set of dependent packages between itself and its leaf nodes is picked as the highest priority to run next. The idea here is that it makes sense to build a package at the start of a long chain of packages as soon as possible.

The tiebreaker among packages with equal max dependency chain depth is a count of how many packages depend on the current package. The goal here is that finishing a package which removes dependencies adds more opportunity for parallelism in the build.

Neither of these are guaranteed to be optimal but they seem like reasonable heuristics.

I looked and didn't see any other users of the topological sort functions so I don't think this will change the behavior of anything besides catkin build. Let me know if that's wrong.

Also, apologies in advance for the "translating C++ code into Python in my head" approach to coding and welcome feedback there.

@kjaget
Copy link
Author

kjaget commented Aug 8, 2020

Ugh, will work to get it to pass testcases and clean up comment formatting

@mikepurvis
Copy link
Contributor

mikepurvis commented Aug 9, 2020

Big +1 on this. A discussion about this concept was had in catkin/catkin_tools#372; glad to see it coming up again.

FYI @tspicer01, this may help with achieving greater parallelism in our builds.

@kjaget
Copy link
Author

kjaget commented Aug 9, 2020

Glad to see I'm not the only one who noticed this.

For people who want to try this at home, note that catkin/catkin_tools#626 will be needed to see the full effect.

Code at least passes old testcases and has new ones added for the functions which calculate the secondary / tertiary / N+1 sort criteria.

@dirk-thomas
Copy link
Member

dirk-thomas commented Sep 28, 2020

Since this isn't used by / beneficial to any of the currently supported tools (catkin_make, catkin_make_isolated, colcon) I might not be able to review this in the near future. Maybe on of the catkin_tools maintainers @catkin/catkin-tools-dev (no idea why this doesn't work so enumerating manually: @wjwwood, @mikepurvis, @jbohren) can review this and make sure it suits their needs as well as doesn't introduce any regressions into other tools.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants