Add dedup, dedup_by and dedup_by_key to the Iterator trait #83748
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @joshtriplett (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
Finally got the CI to stop complaining 🎉 |
type Item = T;

fn next(&mut self) -> Option<Self::Item> {
    if self.last.is_none() {
Do we have to check this on every iteration?
No, you're right, this is only strictly necessary for the first iteration. I can move this into the constructor.
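A minimal sketch of what moving the check into the constructor could look like. This is not the PR's code: the struct layout and the `new` function are assumptions for illustration, and the adapter is simplified to a single type parameter.

```rust
/// Hypothetical stand-in for the adapter discussed in this thread.
pub struct Dedup<I: Iterator> {
    inner: I,
    last: Option<I::Item>,
}

impl<I: Iterator> Dedup<I> {
    pub fn new(mut inner: I) -> Self {
        // Prime the one-element buffer once, here, instead of on every `next` call.
        let last = inner.next();
        Dedup { inner, last }
    }
}

impl<I> Iterator for Dedup<I>
where
    I: Iterator,
    I::Item: PartialEq,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        // `None` here means the source was empty or is already drained.
        let last = self.last.take()?;
        // Skip every element equal to the buffered one and keep the first
        // element that differs as the new buffer.
        self.last = self.inner.find(|item| *item != last);
        Some(last)
    }
}
```

One trade-off of this shape: the constructor already pulls one element from the source, so the adapter is no longer fully lazy.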
}

let last_item = self.last.as_ref()?;
let mut next = loop {
If the iterator is an iter::repeat(1), then this will loop forever?
Yes, this does not terminate for infinite repeating items
I don't think there is a way for me to fix that without solving the halting problem, so I'll add a warning to the documentation instead
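To illustrate the non-termination and the usual workaround, here is a small sketch built on stable APIs (`Peekable::next_if` and `iter::from_fn`). The helper `dedup_runs` is hypothetical and uses a different technique from the PR's adapter; it only shows where the unbounded loop sits and that bounding the source with `take` keeps the call finite.

```rust
use std::iter;

/// Hypothetical helper, not the PR's adapter: collapses runs of equal items.
fn dedup_runs<I>(iter: I) -> impl Iterator<Item = I::Item>
where
    I: Iterator,
    I::Item: PartialEq,
{
    let mut iter = iter.peekable();
    iter::from_fn(move || {
        let item = iter.next()?;
        // Consume the rest of the run. On a source such as `iter::repeat(1)`
        // this loop never sees a different element and never exits.
        while iter.next_if(|next| *next == item).is_some() {}
        Some(item)
    })
}

/// Bounding the source first keeps the call finite.
fn bounded_example() -> Vec<i32> {
    dedup_runs(iter::repeat(1).take(1_000)).collect()
}
```

Calling `dedup_runs(iter::repeat(1)).next()` directly would spin forever, which is the behaviour the documentation warning is meant to cover.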
Hi, thanks for sending in your first pull request. \o/ Just wondering, if an |
Why do we need all three of the structs |
/// [`Iterator`]: trait.Iterator.html
#[unstable(feature = "iter_dedup", reason = "recently added", issue = "83748")]
#[derive(Debug, Clone, Copy)]
pub struct Dedup<I, T> {
Should we also implement SourceIter and InPlaceIterable for these?
/// This `struct` is created by the [`dedup`] method on [`Iterator`]. See its
/// documentation for more.
- /// This `struct` is created by the [`dedup`] method on [`Iterator`]. See its
- /// documentation for more.
+ /// This `struct` is created by [`Iterator::dedup`]. See its documentation
+ /// for more.
@voidc The problem is that I need a way to express the unique ZST of the closure I'm passing into the struct within the signature of the function, which I wasn't able to do. If the signature is fn dedup<F>(self) -> DedupBy<Self, F, Self::Item>, then I can't create a closure from within the function that matches the user-defined type F.
/// This `struct` is created by the [`dedup_by`] method on [`Iterator`].
/// See its documentation for more.
- /// This `struct` is created by the [`dedup_by`] method on [`Iterator`].
- /// See its documentation for more.
+ /// This `struct` is created by [`Iterator::dedup_by`] or [`Iterator::dedup_by_key`].
+ /// See its documentation for more.
Like @voidc mentioned, the fields are even the same so I think they can be merged together.
    F: FnMut(&Self::Item) -> K,
    K: PartialEq,
{
    DedupByKey::new(self, key)
- DedupByKey::new(self, key)
+ self.dedup_by(|a, b| key(a) == key(b))
https://doc.rust-lang.org/stable/src/alloc/vec/mod.rs.html#1441
self.dedup_by(|a, b| key(a) == key(b))
Not sure if this will work though.
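For reference, the delegation pattern itself can be checked against the stable `Vec` API, which the link above points to. The sketch below assumes nothing about the PR's iterator adapters; the function name is made up for illustration. Note that `Vec::dedup_by` hands the closure `&mut T` arguments, whereas the iterator signature in this PR uses `&Self::Item`.

```rust
/// Hypothetical helper: `dedup_by_key` expressed through `dedup_by`,
/// mirroring the one-line delegation suggested above.
fn dedup_by_key_via_dedup_by<T, K, F>(v: &mut Vec<T>, mut key: F)
where
    F: FnMut(&mut T) -> K,
    K: PartialEq,
{
    // Two neighbours are duplicates when their keys compare equal.
    v.dedup_by(|a, b| key(a) == key(b));
}
```

On a `Vec` the closure type never appears in a return type, so this works without naming it. The difficulty in this PR comes from the adapter struct having to name the closure type in the signature of `dedup_by_key`.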
After that change, what is the signature of the function dedup_by_key?
You can use the C++ way to give your closure type a name:
struct EqByKey<F> {
    f: F,
}
impl<I, K: PartialEq, F: FnMut(&I) -> K> FnOnce<(&I, &I)> for EqByKey<F> {
    type Output = bool;
    extern "rust-call" fn call_once(mut self, (a, b): (&I, &I)) -> bool {
        (self.f)(a) == (self.f)(b)
    }
}
impl<I, K: PartialEq, F: FnMut(&I) -> K> FnMut<(&I, &I)> for EqByKey<F> {
    extern "rust-call" fn call_mut(&mut self, (a, b): (&I, &I)) -> bool {
        (self.f)(a) == (self.f)(b)
    }
}
Edit: just saw #83748 (comment)
I'm not sure I understand what you mean. There isn't really a good way to know in advance whether an Iterator is going to contain duplicates, so we always have to check for duplicates in every iteration. |
Just wondering, how will the performance of this be compared to |
If the iterator came from a By the way, |
Could something like that work?
fn dedup(self) -> DedupBy<Self, impl FnMut(&T, &T) -> bool, Self::Item>
or probably
fn dedup(self) -> DedupBy<Self, impl for<'a> FnMut(&'a T, &'a T) -> bool, Self::Item> |
I tried
fn dedup(self) -> DedupBy<Self, impl FnMut(&Self::Item, &Self::Item) -> bool, Self::Item>
where
    Self::Item: PartialEq,
{
    self.dedup_by(|a, b| a == b)
}
and I got
As I understand there is no way to express this type in Rust today. There might be ways to simplify these structs using macros, but the current version is the best I could come up with. |
}

fn size_hint(&self) -> (usize, Option<usize>) {
    (0, self.inner.size_hint().1)
I think the lower bound should be
self.last.as_ref().map(|_| 1).unwrap_or(0)
This is how you could solve it without having three distinct types: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=7a207cd41e8c9dd691a15a6c721c1678 |
@@ -50,7 +50,9 @@ where
 }

 fn size_hint(&self) -> (usize, Option<usize>) {
-    (0, self.inner.size_hint().1)
+    let min = self.last.as_ref().map(|_| 1).unwrap_or(0);
+    let max = self.inner.size_hint().1;
You should probably add the lower bound to the upper bound in size_hint.
You can now end up with (1, Some(0)).
Oh yes, you're right
- let min = self.last.as_ref().map(|_| 1).unwrap_or(0);
- let max = self.inner.size_hint().1;
- (min, max)
+ if self.last.is_some() { (1, self.inner.size_hint().1) } else { (0, Some(0)) }
Well, it doesn't fix my last comment.
You can have self.last.is_some() being true and self.inner.size_hint().1 returning Some(0), which results in (1, Some(0)).
I think, you should use something like your previous code:
let from_stored = self.last.as_ref().map(|_| 1).unwrap_or(0);
let inner_upper = self.inner.size_hint().1;
(from_stored, inner_upper.map(move |x| x + from_stored))
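The bound arithmetic above can be sketched as a standalone function so the edge cases are easy to check. The function name and parameters are hypothetical: `has_stored` stands for `self.last.is_some()` and `inner` for `self.inner.size_hint()`. Using `checked_add` instead of a plain `+` is an addition of this sketch, to avoid overflow when the inner upper bound is `usize::MAX`.

```rust
/// Hypothetical free function mirroring the `size_hint` body suggested above.
fn dedup_size_hint(has_stored: bool, inner: (usize, Option<usize>)) -> (usize, Option<usize>) {
    // 1 if an element is buffered in `last`, 0 otherwise.
    let from_stored = usize::from(has_stored);
    // Lower bound: the buffered element is always yielded, while every
    // remaining inner element may turn out to be a duplicate of it.
    let lower = from_stored;
    // Upper bound: every inner element could be distinct, plus the buffered one.
    // An overflow is reported as "unknown" (`None`) instead of panicking.
    let upper = inner.1.and_then(|n| n.checked_add(from_stored));
    (lower, upper)
}
```

With a buffered element and an exhausted inner iterator this gives `(1, Some(1))` rather than the inconsistent `(1, Some(0))`.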
This does deduplicate a lot of the code, but I'm not sure we want to add a new trait to the std library for this feature |
I could imagine the trait to be useful outside this specific feature. It could live in Here is an implementation without an additional trait: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=79f50cced999414ea1dae9c5f4e0364d |
…ing `ByPartialEq`
Co-authored-by: Cameron Steffen
Co-authored-by: Anders Kaseorg
e6edf5c to a822aad
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
Didn't have time to test locally, I'll fix things tomorrow if the CI fails |
I wouldn't call the "named closures" approach a hack. On the contrary, this pattern is also used in the implementation of |
A separate trait (sealed and/or private) seems like a reasonable alternative; the specific problem is the use of direct impls of the Fn traits. |
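A rough sketch of how the private-trait alternative could look on stable Rust, with one adapter struct shared by all three methods. Everything here is an assumption for illustration: the trait name `SameBucket`, the wrapper `ByFn`, and the struct layout are made up, and only `ByPartialEq` is a name that appears in this PR. `ByKey` is likewise hypothetical.

```rust
/// Hypothetical private trait: decides whether two neighbours are duplicates.
trait SameBucket<T> {
    fn same_bucket(&mut self, a: &T, b: &T) -> bool;
}

/// Compares with `PartialEq` (what `dedup` would use).
struct ByPartialEq;
impl<T: PartialEq> SameBucket<T> for ByPartialEq {
    fn same_bucket(&mut self, a: &T, b: &T) -> bool {
        a == b
    }
}

/// Compares the keys produced by `F` (what `dedup_by_key` would use).
struct ByKey<F>(F);
impl<T, K: PartialEq, F: FnMut(&T) -> K> SameBucket<T> for ByKey<F> {
    fn same_bucket(&mut self, a: &T, b: &T) -> bool {
        (self.0)(a) == (self.0)(b)
    }
}

/// Wraps a user-supplied comparison (what `dedup_by` would use).
struct ByFn<F>(F);
impl<T, F: FnMut(&T, &T) -> bool> SameBucket<T> for ByFn<F> {
    fn same_bucket(&mut self, a: &T, b: &T) -> bool {
        (self.0)(a, b)
    }
}

/// One adapter for all three methods; the comparison strategy is a nameable type.
struct Dedup<I: Iterator, S> {
    inner: I,
    cmp: S,
    last: Option<I::Item>,
}

impl<I: Iterator, S: SameBucket<I::Item>> Dedup<I, S> {
    fn new(mut inner: I, cmp: S) -> Self {
        let last = inner.next();
        Dedup { inner, cmp, last }
    }
}

impl<I: Iterator, S: SameBucket<I::Item>> Iterator for Dedup<I, S> {
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        let last = self.last.take()?;
        // Borrow the field separately so the closure does not capture all of `self`.
        let cmp = &mut self.cmp;
        self.last = self.inner.find(|item| !cmp.same_bucket(item, &last));
        Some(last)
    }
}
```

With this shape, `dedup` could return `Dedup<Self, ByPartialEq>` and `dedup_by_key` could return `Dedup<Self, ByKey<F>>`, so no unnameable closure type shows up in a signature and no direct impls of the Fn traits are needed.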
Wait how the hell did this ICE? |
The job Click to see the possible cause of the failure (guessed by this bot)
@slerpyyy any updates on resolving the ICE/CI failure? |
@Dylan-DPC The ICE seemed to be caused by interaction with the The CI is currently failing due to an |
@slerpyyy , does this patch solve this problem with partition_dedup_by_key: #54279 (comment) ? |
Ping from triage: I'm closing this due to inactivity (last touched in October 2022). Please reopen when you are ready to continue with this. @rustbot label: +S-inactive |
Tracking issue: #83747