Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drt: Contiguous GC data #6016

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

kbieganski
Copy link
Contributor

Store some of the GC data in contiguous arrays. The last commit is to ensure no one pushes to these vectors, as other code references this data via pointers, so we cannot invalidate them.

asap7/aes

Branch Min [s] Max [s] Mean [s] Relative mean Median [s] Relative median
baseline 400.30 401.20 400.77 ± 0.27 1.03 400.83 1.03
drt-contiguous-gc 390.57 390.82 390.66 ± 0.08 1.00 390.63 1.00

asap7/ibex

Branch Min [s] Max [s] Mean [s] Relative mean Median [s] Relative median
baseline 570.74 571.73 571.13 ± 0.31 1.03 571.08 1.03
drt-contiguous-gc 555.77 557.53 556.71 ± 0.50 1.00 556.69 1.00

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/drt/src/db/dynarray.h Outdated Show resolved Hide resolved
src/drt/src/db/dynarray.h Outdated Show resolved Hide resolved
src/drt/src/db/dynarray.h Outdated Show resolved Hide resolved
src/drt/src/db/dynarray.h Outdated Show resolved Hide resolved
@kbieganski
Copy link
Contributor Author

@osamahammad21 Have you had a chance to look at this?

@maliberty
Copy link
Member

Please cleanup the clang-tidy warnings

Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@osamahammad21
Copy link
Member

I am still looking into the PR but I don't understand the need for the dynarray class. Is it just a container for the std::vector? Why is it better than just using std::vector directly?

@osamahammad21
Copy link
Member

I have concerns about the dynarray class. It doesn’t feel dynamic, as it prevents adding objects to the vector without invalidating pointers both in the objects themselves and the region query. While we don’t currently add objects this way, it’s something we might need in the future. Using unique_ptrs was more intuitive and developer-friendly, and the gain here is modest at just 2.5%. Since we rely heavily on polymorphism, this approach likely won’t be generalized in DRT, making the trade-off less compelling. Is there a stronger or longer-term motive for merging this PR that we might be overlooking?

@kbieganski
Copy link
Contributor Author

I should've outline some of this in the description, apologies.

The whole point of dynarray is to prevent adding objects to the vector. This is to prevent invalidating pointers to its contents. I'd rather those be indices, but that would be too big of a change for one PR. I'd also prefer to just have an array instead of a vector there, but again, some places required too many changes for a single PR.

As for a longer term motive, this is a change to the structure of a small subset of the data that DRT operates on. One could apply similar changes to the rest of the data, and stack these speedups.

I get that you use polymorphism and you like it, but I encourage you to reconsider using it everywhere. It's dynamic in some ways, but static and constraining in others. Not to mention it requires pointer chasing and fragmented memory access. Can you point out what problems it solves for you? I also fail to see how arrays of pointers are more intuitive than just arrays.

Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@kbieganski kbieganski force-pushed the drt-continuous-gc branch 2 times, most recently from 7b2f473 to af0a213 Compare January 20, 2025 16:26
Comment on lines 142 to 146
Arena<gcNet> netArena_ = {128};
Arena<gcPin> pinArena_ = {512};
Arena<gcRect> rectArena_ = {512};
Arena<gcSegment> segmentArena_ = {512};
Arena<gcCorner> cornerArena_ = {512};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any comment why these exact block sizes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason. The new version I just pushed I tested empirically. It's around the optimum for the systems I tested on.

@osamahammad21
Copy link
Member

@kbieganski Do you still get the same speedup with the arena data structure?

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/drt/src/db/Arena.h Outdated Show resolved Hide resolved
src/drt/src/db/Arena.h Outdated Show resolved Hide resolved
@kbieganski
Copy link
Contributor Author

Yes, it's about 3.5% for asap7/ibex, asap7/aes, and a secret design on asap7.
Is this a more acceptable design?

Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/drt/src/db/Arena.h Outdated Show resolved Hide resolved
Signed-off-by: Krzysztof Bieganski <[email protected]>
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@kbieganski
Copy link
Contributor Author

@osamahammad21 What do you think about this approach?

@maliberty
Copy link
Member

Arena appears to be not much different than std::deque. It doesn't support deallocation except as a whole. It makes the memory management less clear as we have to ensure nobody tries to delete such an arena returned pointer. Are you seeing more than 3% gain in private testing? If not I don't find the complexity runtime tradeoff compelling.

@kbieganski
Copy link
Contributor Author

Arena appears to be not much different than std::deque.

It allocates bigger chunks (technically depending on std implementation), and I guess the intent is different. Sometimes there's a slot reuse mechanism, but it wouldn't be useful here.

It doesn't support deallocation except as a whole.

That's a big plus if you know the lifetime of your data. And you know it in this situation.

It makes the memory management less clear as we have to ensure nobody tries to delete such an arena returned pointer.

On master, there are raw pointers obtained from unique_ptr::get() all over the place, does that not apply there?

Are you seeing more than 3% gain in private testing?

No, but if you apply similar changes to other places, I believe it'll add up. I'll try a bigger design.

If not I don't find the complexity runtime tradeoff compelling.

I would argue it's a matter of familiarity rather than complexity. And arenas are a well-documented concept, shouldn't be too difficult to bring devs up to speed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants