-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Take measurement inaccuracy intervals into account #4078
Take measurement inaccuracy intervals into account #4078
Conversation
Stating one version being faster while the intervals have a considerable overlap is probably not accurate.
I agree. Saying that the performance of the 'iter' version, in this specific case, is superior (even if slightly) reveals more bias than information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion. It’s true that it was “within the noise” but the point—as the rest of the paragraph itself says!—is not to prove something about the two versions, but to get a general sense of how they compare performance-wise. As measured, the iterator version was slightly faster—and that’s all the text was saying!
If you want to dig into the details further:
- The lower bound for the iter version is
18,577,700
and the lower bound for the for loop version is18,704,600
. - The upper bound for the iter version is
19,892,100
and the upper bound for the for loop version is20,536,000
.
You’d see this clearly on a violin plot or similar: we may not be 100% positive that the iterator is always guaranteed to be faster, but it’s totally fair to say that it was faster in this case, even when digging into those details!
I am sorry, but I strongly disagree. From a scientific point of view stating one of them is faster if the confidence intervals overlap that strongly is simply wrong (and assuming that it is only a one-sigma interval makes it even worse). The change I made tried to emphasize that they are of equal performance, which does not contradict the complete story the section wants to tell -> use the higher level construct, you don't pay for it performance-wise. I don't want to say my formulation is perfect and I am open for suggestions to further improve it. But keeping it as is, is not supporting the intention: everyone with scientific background should immediately recognize that the author tries to make an argument based on data, which the data do not support. I also don't know about the usual manners in this repository, as I am contributing for the first time. But I have to admit I am slightly surprised that this gets closed without proper discussion, especially as already someone independent of the author (@odinplusplus) agreed that the general idea of the change seems ok. |
First, as regards “manners”: if every time two commenters happened to agree on something we were obliged to make a change, we would be all over the place, including changing things back and forth on the same text over time. The same for “proper discussion”—there is a basic imbalance between maintainer time and the time of folks submitting suggestions. We regularly just have to make judgment calls and move on, or else we’d spend all our time just responding to issues, PRs, etc.! I hope that helps make some sense of the relatively quick and brief response. However, it’s important to recognize that the text isn’t really “making an argument” here or aiming to be precise; it’s just reporting in a fairly casual way what the benchmark indicated. (I'm very well aware of how confidence intervals work, and yes, I agree that it’s entirely possible they’re actually “the same speed” given those error bars!) The point in the text—as in my comment above—is just to say that if we’re eyeballing it/taking the benchmark at face value, one appears to be a bit faster than the other, and it might surprise folks who assume that iterators are always slower than a hand-written loop. Net, I don’t object to the change you suggested to the text, and I’ll consider reopening it (I want to mull on it a bit); I am just not persuaded it’s absolutely necessary, given what the text is actually trying to do here. Hope that makes sense! |
I came by to open an issue about this sentence. The effect of this line was to make me trust the performance assertions made by the text less. If the text is making this subtly wrong assertion about performance, what other wrong assertions is it making? I see the point that this isn't really a page which is intended as a thorough exploration of Rust performance - explaining how to write fast programs isn't a goal of the book at all. The point it's trying to make is that you shouldn't worry about the performance of these two constructs when deciding which to use. I think that saying that they're the same is just as effective for that purpose. I think the small wording change in this PR is a good one. You could rework things more, but this small change is a pure improvement. |
After a bit of mulling, I have reopened this and am merging it. Thanks! |
Stating one version being faster while the intervals have a considerable overlap is probably not accurate.