Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Ptr and PtrList #30

Open
TannerRogalsky opened this issue Nov 8, 2022 · 2 comments
Open

Remove Ptr and PtrList #30

TannerRogalsky opened this issue Nov 8, 2022 · 2 comments

Comments

@TannerRogalsky
Copy link
Contributor

TannerRogalsky commented Nov 8, 2022

Ptr and PtrList are useful constructs but they don't exactly match the usage patterns present in this library. I see two problems both relating to the fallible nature of acquiring locks the data:

  1. It makes the API more "noisy". I'm not sure that many of these functions even can fail to acquire a lock since they're almost exclusively reads. But we still hoist the potential for those errors into the API, requiring the user to handle them. Or there are some places where we simply unwrap a lock result. While I think that those unwraps are unlikely to error I hope it illustrates the point.
  2. It makes implementations more verbose as one has consider the guard lifetimes.
    Neither of these are necessarily damning (well, the first more than the second, maybe) but my feeling that they are important is compounded by my perception that we aren't using the interior mutability aspect of a RwLock. All of the places that we acquire a write lock are behind a mutable reference already. It seems to me that we're exclusively using the reference counting aspect of Ptr and PtrList and not the interior mutability at all. As such, perhaps there is a more applicable abstraction.

I also have a sense that if things like the layer maps were modified from outside of the containing data structures there is significant potential for breakage. Or at least it's not clear to me that allowing changes to interior data apart from the API is desired.

Solutions

Replace std::sync::RwLock with parking_lot::RwLock

Pros: Low effort.
Cons: Doesn't really fix point 2 but would improve point 1.

Use a slotmap/petgraph

We're representing a directed graph. Petgraph is well used and loved for good reason. This would solve both problems and potentially allow us to leverage existing patterns for traversal and analysis.
Pros: The arena allows us to maintain cheap, potentially cyclical "references". Mutating only from behind a mutable reference creates patterns that users expect.
Cons: Access to the actual resource requires access to the arena. Functions like Layout::flatten would require a reference to the containing library, etc.

Replace Ptr with arc_swap::ArcSwap

Pros: We're mostly read-heavy so we get to keep our pointer semantics while getting ride of the locks & guards.
Cons: Our individual cells become copy-on-write by necessity.

There are probably other ways of doing this but I think these represent a good start.

My personal inclination would be towards a slotmap/petgraph but I think I would be easily convinced of ArcSwap as well.

@dan-fritchman
Copy link
Owner

Hey Tanner - thanks for weighing in here!
And I agree, it's debatable whether we need the thread-safety in particular.

The primary use-case for Ptr - the one it was designed for - is the Instance to Cell/ Layout "pointers". Then, layer on a few things so we can hash them etc.

Re: slotmap - that was actually the predecessor to Ptr in all the very same places!
And it was replaced for exactly the "con" you listed: the pain of needing a reference to the slotmap, everywhere one needs it refer to its members.

Re: ArcSwap - the "copy on write" does strike me as a really big con there. One can end up with really big layouts, and even worse, many subtly different copies of them, and a tough time tracking which references refer to which.

Re: parking_lot::RwLock - I know PL is popular, but TBH I don't know what their RwLock does better than the std-lib one. What would be the attraction there?

I would add a few more options - at least for the primary Instance / Cell use-case:

  • The standard library's Rc and RefCell, i.e. the single-threaded version.
  • Just... the regular old reference system. This would add a "cells cannot be modified once (while) Instances exist" rule. Which, we can decide whether to live with.

@TannerRogalsky
Copy link
Contributor Author

Using references would definitely resolve my outlined concerns but will mean that either the structure that owns the underlying data is either pinning that data or is unmovable itself while the references exist. Neither of those prospects are particularly appealing to me but I've never felt entirely comfortable making my own pinned data structures maybe that's something you or another contributor have more experience with.

Switching to Rc<RefCell<T>> is equivalent, in my mind, to switching to parking_lot::RwLock in the scope of what I'd like to improve here. Which is to say that I haven't done or seen benchmarks that point at the atomic reference counting or mutex locking being substantial performance bottlenecks. I would be surprised if that were case but, hey, I've been surprised by benchmarks before. But both of those options would improve the error handling ergonomics of the library as they both offer a panicking borrower function which I think is desirable in the context that we're considering.

While this appeals to me it isn't the main thing I want to improve. Basically, I don't see anywhere in this library that a call to std::sync::RwLock::write is made where the underlying data isn't behind a mutable reference. That tells me that, while we need the references we don't need the interior mutability. We need the Rc but not the RefCell and simplifying this down to the case that is actually used would improve the external and internal implementations.

Cases where we want references but don't need or want interior mutability screams "arena" to me. I have some confidence that we could make using an arena like slotmap or petgraph more performant, ergonomic and robust. But I should probably take a look at where you switched things out previously to better understand the decision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants