Pass an iterator to add_incompatibility_from_dependencies
#226
Previously, `Dependencies::Available` was hardcoded to a hash map. By relaxing this to `impl IntoIterator<_> + Clone`, we avoid converting from uv's ordered `Vec<(_, _)>` to a `HashMap<_, _>` and avoid cloning.

## Design considerations

We implement this using the return type `Dependencies<impl IntoIterator<Item = (DP::P, DP::VS)> + Clone, DP::M>`. This allows using `get_dependencies` without knowing the actual type (which a generic bound would require) or adding another associated type to the dependency provider. The `impl` bound also captures all input lifetimes, keeping `p` and `v` borrowed. We could bind the iterator to only `&self` instead, but this would only move two clones (package and version) to the implementer.

Co-authored-by: Jacob Finkelman <[email protected]>
Co-authored-by: Zanie <[email protected]>
Co-authored-by: Charlie Marsh <[email protected]>
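As a rough illustration of the return type described in this commit message, here is a minimal sketch. The `Dependencies` enum is a simplified stand-in, and `VecProvider`, its `new` constructor, and the `(String, u32)` package/version types are hypothetical, not pubgrub's actual definitions. It uses an inherent method rather than the `DependencyProvider` trait, and binds the iterator to `&self` only.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the `Dependencies` enum, generic over the
/// container `I` instead of being hardcoded to a hash map.
pub enum Dependencies<I, M> {
    Unavailable(M),
    Available(I),
}

/// Hypothetical provider that keeps each dependency list as an ordered Vec.
pub struct VecProvider {
    deps: HashMap<(String, u32), Vec<(String, u32)>>,
}

impl VecProvider {
    pub fn new(entries: Vec<((String, u32), Vec<(String, u32)>)>) -> Self {
        Self { deps: entries.into_iter().collect() }
    }

    /// Returns an opaque, cloneable iterator that borrows from `self`.
    /// The caller never has to name the concrete iterator type.
    pub fn get_dependencies(
        &self,
        package: &str,
        version: u32,
    ) -> Dependencies<impl IntoIterator<Item = (String, u32)> + Clone + '_, String> {
        match self.deps.get(&(package.to_string(), version)) {
            None => Dependencies::Unavailable(
                "its dependencies could not be determined".to_string(),
            ),
            // No intermediate HashMap or Vec is built; entries are cloned lazily.
            Some(list) => Dependencies::Available(list.iter().cloned()),
        }
    }
}
```

The `+ Clone` bound lets a caller iterate the same dependency list more than once without the provider having to materialize it.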
0fba094
to
ab20eb2
Compare
I've been indecisive on this work for a long time.
Pros:
- It can be used to reduce some allocations, but only with odd constructs and a lot of attention. I should rerun benchmarks to see how much difference this makes.
- It allows more accurate modeling of multiple dependency edges on the same package. Which in turn leads to better error messages for this already confusing situation. (I believe this is functionality we NEED to support for UV.)
Cons:
- Is a significantly more complicated signature. This is a large next step toward DependencyProvider becoming an advanced user only interface. Although that may be an inevitable consequence of having production users.
- It is not obvious that this is the "right" abstraction for the situation. Would something else be more ergonomic for users? Would something else be more compatible with future improvements iterator?
The worst that can happen is we decide it's the wrong abstraction after releasing 0.3 and have to change it for 0.4. So I'm leaning toward accepting this change. But I'd like to hear some other people's thoughts and arguments, especially @mpizenberg about the ergonomics.
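The point above about multiple dependency edges on the same package can be sketched as follows. The helper names and the string-encoded version ranges are hypothetical and only serve to show that a map keyed by package keeps one edge per package, while an ordered list keeps them all.

```rust
use std::collections::HashMap;

/// Collecting into a map keyed by package: a later edge on the same
/// package overwrites the earlier one.
pub fn edges_in_map(deps: &[(&str, &str)]) -> usize {
    let map: HashMap<&str, &str> = deps.iter().copied().collect();
    map.len()
}

/// Keeping the ordered list: every edge survives, duplicates included.
pub fn edges_in_vec(deps: &[(&str, &str)]) -> usize {
    deps.to_vec().len()
}
```

With two constraints on `foo` and one on `bar`, the map reports two edges and the list reports three, so only the list can attribute an error to the specific edge that caused it.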
src/solver.rs
Outdated
Ok(match self.dependencies(package, version) {
    None => {
        Dependencies::Unavailable("its dependencies could not be determined".to_string())
    }
-   Some(dependencies) => Dependencies::Available(dependencies),
+   Some(dependencies) => Dependencies::Available(dependencies.clone()),
Nit: The allocation in this clone can be removed by making it `Dependencies::Available(dependencies.iter().map(|(p, vs)| (p.clone(), vs.clone())))`.
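A minimal sketch of the difference this nit is pointing at, using a hypothetical `Deps` alias with `String` packages and `u32` standing in for a version set. Cloning the map allocates a second table up front; the mapped iterator borrows the existing map and clones entries only as they are consumed. The per-entry clones remain in both cases.

```rust
use std::collections::HashMap;

/// Hypothetical dependency map: package name to a stand-in version set.
pub type Deps = HashMap<String, u32>;

/// Eager: clones the whole map, allocating a new table.
pub fn eager(deps: &Deps) -> Deps {
    deps.clone()
}

/// Lazy: borrows the map and clones each entry as it is yielded,
/// so no second table is allocated.
pub fn lazy(deps: &Deps) -> impl Iterator<Item = (String, u32)> + Clone + '_ {
    deps.iter().map(|(p, vs)| (p.clone(), vs.clone()))
}
```

This only works once the receiving side accepts an iterator rather than a concrete map type.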
Thanks, I accidentally worsened this because I missed that this isn't `impl IntoIterator<_>` yet.
return Ok(match dependencies {
    Dependencies::Unavailable(reason) => Dependencies::Unavailable(reason),
    Dependencies::Available(available) => Dependencies::Available(
        available.into_iter().collect::<Vec<(Self::P, Self::VS)>>(),
The allocations in these collect calls make me sad. This cache hit case is exactly where I was hoping to avoid allocations by accepting an iterator. My most recent attempt got a little closer to something elegant, but with its own set of trade-offs.
Fundamentally we're hitting https://smallcultfollowing.com/babysteps/blog/2018/09/02/rust-pattern-iterating-an-over-a-rc-vec-t/ I think the unstable `gen` blocks may be an ergonomic fix for this, but I have not succeeded at trying it in my own projects.
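For readers unfamiliar with the problem, here is a minimal sketch of one common workaround for iterating over a shared `Rc<Vec<T>>` without collecting into a new `Vec`: an iterator that owns the `Rc` and tracks an index. The `RcVecIter` name is hypothetical and this is not code from the PR.

```rust
use std::rc::Rc;

/// Owning iterator over a shared Vec. It holds its own `Rc` handle plus an
/// index, so it does not borrow from the cache it was created from.
#[derive(Clone)]
pub struct RcVecIter<T> {
    data: Rc<Vec<T>>,
    next: usize,
}

impl<T> RcVecIter<T> {
    pub fn new(data: Rc<Vec<T>>) -> Self {
        Self { data, next: 0 }
    }
}

impl<T: Clone> Iterator for RcVecIter<T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Clone one element at a time; the backing Vec is never copied.
        let item = self.data.get(self.next).cloned()?;
        self.next += 1;
        Some(item)
    }
}
```

The trade-off is a hand-written iterator type per container shape, which is part of why the signature becomes harder to use.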
We can also keep the If it's relevant, i can do some benchmarking with this changed and with it rolled back in uv.
I tried to abstract it into a trait and other tricks, but (my) Rust is too limited to figure out something better.
I've reverted the biggest change to the dependency provider interface, we can also merge |
Having asked you to make this PR and then having had second thoughts while reviewing it, with the second thoughts being abstract "does there exist" questions, I owed you my full time and attention today. I went through a lot of harebrained ideas, most of which obviously wouldn't work without even looking at an editor. Eventually I had the idea of a What are people's thoughts? Should I open a PR for us to discuss that proposal?
Isn’t the fact that this struct holds a mut ref to the state going to be super annoying for implementations with async concurrency? Or is this structure always only short lived? (Sorry if my question doesn’t make sense) Waiting on Konstin's feedback on this. Otherwise, I don’t remember exactly why we did not use
The |
It is short lived. So I don't expect trouble. But I don't maintain a async api, so not 100%. If it makes problems for UV, they can use
|
I think the lib is already too complex to be used without looking at docs. If we want people making useful things with pubgrub they should implement quite a lot themselves already so I don’t think this change would add too much barrier to entry.
I'm having a lot of thoughts, but I'm not sure what they are. I'm going to start writing in the hope that I can figure it out by the time I get to the end. If this ends up being incoherent I'm sorry. There are a bunch of things I want the
So let's start by evaluating the existing API:
Next let's look at the original proposal from this PR.
What if we add a new Enum for "future-proof". Basically combining this PR with #124. Something like
Having two different Enum is really stretching our simplicity budget. Perhaps we can combine them? Something like
A note on the Now let's evaluate my harebrained idea from yesterday, add_deps_take_2.
So I've gotten to the end, and I still don't know what I was trying to say. It seems to have turned into a brainstorming about #148. One takeaway is that eventually we will probably grow a
I’d think the same, so as long as it’s possible to rewrite a v0.2-like api from v0.3 I don’t think your point (2) will be an issue.
I remember we discussed a lot about this. My point of view is that we should base this type of change on the theory (incompatibilities). I think this type of change, enabling dependencies to provide finer-grained types of incompatibilities instead of just the shape {a, not b}, should be approached holistically, not iteratively with enum variants appended one after the other. But I’m also not worried about compatibility, because just like it’s possible to make v0.2-like wrappers when v0.3 is released, it can be exactly the same for v0.4 because this is a strict superset of capabilities.
I don't quite follow. Trying to figure out what you had in mind. If we want to provide direct access to the solver, that will probably involve public access to "state" which is probably worth doing but its own headache. Even if we did that, we would still want |
What I mean by "holistically" is that all possibilities should be evaluated and added to the api at the same time. There is a finite number of incompatibility types that can be considered. What I’d like to avoid is saying ok now we enable {a, b} and {a, -b}, let’s see in the future if we need something else. Instead, it would make sense to just verify that with the past two years of optimizations we didn’t break a fundamental property that would prevent pubgrub from working with any incompatibility provided. And if ok, add these apis in a cohesive fashion. Let me know if I’ve expressed my opinion better this time. |
That makes a lot of sense. Let's get a full list of reasonable possibilities and name/document what they might be useful for before adding flexible APIs. In office hours @konstin convinced me that avoiding the hash map construction is only an important performance optimization when the index data has been prefetched. If the data actually needs to be loaded from somewhere (whether that's network or disk), reading the data will be so much more expensive than constructing the map that the optimization becomes irrelevant. Ironically, the only place the optimization is visible is while doing benchmarking. UV also rolls its own resolve loop and does not use the dependency provider, so we agreed that the best next step was to change the API for And while I was writing that comment it looks like the new revision is in.
This is currently a very small internal change, but it helps UV.
55a73e4
to
08549f3
Compare
We had previously changed the signature of `DependencyProvider::get_dependencies` to return an iterator instead of a hashmap to avoid the conversion cost from our dependencies `Vec` to the pubgrub's hashmap. These changes are difficult to make in pubgrub since they complicate the public api. But we don't actually use `DependencyProvider::get_dependencies`, so we rolled those customizations back in pubgrub-rs/pubgrub#226 and instead opted to change only the internal `add_incompatibility_from_dependencies` method that we exposed in our fork. This aligns us closer with upstream, removes the design questions about `DependencyProvider` from our concerns and reduces our diff (not counting the github action) to +36 -12.
To summarize, in the initial version of this PR we were changing Personally, i don't think the current signature of |
Previously, `add_incompatibility_from_dependencies` was hardcoded to take a hash map. By relaxing this to `impl IntoIterator<_>`, we avoid converting from uv's ordered `Vec<(_, _)>` to a `HashMap<_, _>` and avoid cloning.

This intentionally does not change the public api of the `DependencyProvider`, only the internal interface that we are using in uv.
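A minimal sketch of what relaxing the parameter to `impl IntoIterator<_>` allows. The `State` struct, its `incompatibilities` field, the string-encoded ranges, and the returned count are hypothetical simplifications, not pubgrub's internal types or its real signature.

```rust
/// Hypothetical, heavily simplified solver state. Each entry records
/// (package, version, dependency, range-as-string).
#[derive(Default)]
pub struct State {
    pub incompatibilities: Vec<(String, u32, String, String)>,
}

impl State {
    /// Accepts any iterable of (dependency, range) pairs and returns how many
    /// incompatibilities were added. The caller decides the container.
    pub fn add_incompatibility_from_dependencies(
        &mut self,
        package: &str,
        version: u32,
        deps: impl IntoIterator<Item = (String, String)>,
    ) -> usize {
        let before = self.incompatibilities.len();
        for (dep, range) in deps {
            self.incompatibilities
                .push((package.to_string(), version, dep, range));
        }
        self.incompatibilities.len() - before
    }
}
```

A caller holding an ordered `Vec` can pass it directly, and a caller that still has a `HashMap` can pass that unchanged, since both implement `IntoIterator` over owned pairs.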