Take measurement inaccuracy intervals into account #4078

marlamb · 2024-10-21T05:01:56Z

Stating that one version is faster while the intervals overlap considerably is probably not accurate.

ghost · 2024-10-25T13:28:00Z

I agree. Saying that the performance of the 'iter' version, in this specific case, is superior (even if slightly) reveals more bias than information.
If, by chance, the benchmark result indicated that the 'for' version was faster (in this specific case), would the author try again until 'iter' was the fastest?

chriskrycho

Thanks for the suggestion. It’s true that it was “within the noise” but the point—as the rest of the paragraph itself says!—is not to prove something about the two versions, but to get a general sense of how they compare performance-wise. As measured, the iterator version was slightly faster—and that’s all the text was saying!

If you want to dig into the details further:

* The lower bound for the iter version is `18,577,700` and the lower bound for the for loop version is `18,704,600`.
* The upper bound for the iter version is `19,892,100` and the upper bound for the for loop version is `20,536,000`.

You’d see this clearly on a violin plot or similar: we may not be 100% positive that the iterator is always guaranteed to be faster, but it’s totally fair to say that it was faster in this case, even when digging into those details!
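For readers who want to check the overlap being discussed, here is a minimal sketch in Rust. The `Interval` type and its methods are hypothetical helpers written for this illustration, not part of the book or of any benchmark harness; the numbers are the bounds quoted in this comment, in whatever units the benchmark reported.

```rust
/// A closed measurement interval [lo, hi]. Hypothetical helper type for this sketch.
#[derive(Debug, Clone, Copy)]
struct Interval {
    lo: u64,
    hi: u64,
}

impl Interval {
    /// Two closed intervals overlap when each one starts before the other ends.
    fn overlaps(&self, other: &Interval) -> bool {
        self.lo <= other.hi && other.lo <= self.hi
    }

    /// Length of the shared range, or 0 if the intervals are disjoint.
    fn overlap_len(&self, other: &Interval) -> u64 {
        let lo = self.lo.max(other.lo);
        let hi = self.hi.min(other.hi);
        hi.saturating_sub(lo)
    }
}

fn main() {
    // Bounds as quoted in the comment above.
    let iter = Interval { lo: 18_577_700, hi: 19_892_100 };
    let for_loop = Interval { lo: 18_704_600, hi: 20_536_000 };

    println!("intervals overlap: {}", iter.overlaps(&for_loop));
    println!("shared range: {}", iter.overlap_len(&for_loop));
}
```

With these bounds the shared range is 1,187,500, which covers about 90% of the iter interval's width of 1,314,400. An overlap check like this says nothing about statistical significance on its own; it only makes the size of the overlap concrete.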

marlamb · 2024-10-26T09:32:30Z

> Thanks for the suggestion. It’s true that it was “within the noise” but the point—as the rest of the paragraph itself says!—is not to prove something about the two versions, but to get a general sense of how they compare performance-wise. As measured, the iterator version was slightly faster—and that’s all the text was saying!
>
> If you want to dig into the details further:
>
> * The lower bound for the iter version is `18,577,700` and the lower bound for the for loop version is `18,704,600`.
> * The upper bound for the iter version is `19,892,100` and the upper bound for the for loop version is `20,536,000`.
>
> You’d see this clearly on a violin plot or similar: we may not be 100% positive that the iterator is always guaranteed to be faster, but it’s totally fair to say that it was faster in this case, even when digging into those details!

I am sorry, but I strongly disagree. From a scientific point of view, stating that one of them is faster when the confidence intervals overlap that strongly is simply wrong (and assuming that it is only a one-sigma interval makes it even worse).

The change I made tried to emphasize that they are of equal performance, which does not contradict the overall story the section wants to tell: use the higher-level construct, you don't pay for it performance-wise. I don't want to say my formulation is perfect, and I am open to suggestions to improve it further. But keeping the text as is does not support the intention: anyone with a scientific background should immediately recognize that the author is trying to make an argument based on data that the data do not support.

I also don't know the usual manners in this repository, as I am contributing for the first time. But I have to admit I am slightly surprised that this was closed without proper discussion, especially as someone independent of the author (@odinplusplus) had already agreed that the general idea of the change seems OK.
@chriskrycho, it would be nice if you could reply to these questions, and I would appreciate reopening the pull request to perhaps get some feedback from others as well.

chriskrycho · 2024-10-30T17:03:13Z

First, as regards “manners”: if every time two commenters happened to agree on something we were obliged to make a change, we would be all over the place, including changing things back and forth on the same text over time. The same for “proper discussion”—there is a basic imbalance between maintainer time and the time of folks submitting suggestions. We regularly just have to make judgment calls and move on, or else we’d spend all our time just responding to issues, PRs, etc.! I hope that helps make some sense of the relatively quick and brief response.

However, it’s important to recognize that the text isn’t really “making an argument” here or aiming to be precise; it’s just reporting in a fairly casual way what the benchmark indicated. (I'm very well aware of how confidence intervals work, and yes, I agree that it’s entirely possible they’re actually “the same speed” given those error bars!) The point in the text—as in my comment above—is just to say that if we’re eyeballing it/taking the benchmark at face value, one appears to be a bit faster than the other, and it might surprise folks who assume that iterators are always slower than a hand-written loop.

Net, I don’t object to the change you suggested to the text, and I’ll consider reopening it (I want to mull on it a bit); I am just not persuaded it’s absolutely necessary, given what the text is actually trying to do here. Hope that makes sense!

dyfrgi · 2025-01-02T03:03:50Z

I came by to open an issue about this sentence. The effect of this line was to make me trust the performance assertions made by the text less. If the text is making this subtly wrong assertion about performance, what other wrong assertions is it making?

I see the point that this isn't really a page which is intended as a thorough exploration of Rust performance - explaining how to write fast programs isn't a goal of the book at all. The point it's trying to make is that you shouldn't worry about the performance of these two constructs when deciding which to use. I think that saying that they're the same is just as effective for that purpose.

I think the small wording change in this PR is a good one. You could rework things more, but this small change is a pure improvement.
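To make the comparison under discussion concrete, here is a rough sketch of how the two styles could be timed side by side. It is not the benchmark from the book: the function names `search_for` and `search_iter`, the `time_range` helper, and the synthetic input are all assumptions made for this illustration, and the fastest/slowest range it prints is a crude spread, not the statistic a real benchmark harness reports.

```rust
use std::time::{Duration, Instant};

/// Explicit `for` loop version (hypothetical stand-in for the benchmarked code).
fn search_for<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    let mut results = Vec::new();
    for line in contents.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }
    results
}

/// Iterator-adapter version of the same search.
fn search_iter<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    contents.lines().filter(|line| line.contains(query)).collect()
}

/// Runs `f` `runs` times and returns the (fastest, slowest) wall-clock durations.
fn time_range<F: FnMut() -> usize>(runs: u32, mut f: F) -> (Duration, Duration) {
    let mut fastest = Duration::MAX;
    let mut slowest = Duration::ZERO;
    for _ in 0..runs {
        let start = Instant::now();
        let n = f();
        let elapsed = start.elapsed();
        std::hint::black_box(n); // keep the optimizer from discarding the work
        fastest = fastest.min(elapsed);
        slowest = slowest.max(elapsed);
    }
    (fastest, slowest)
}

fn main() {
    // Synthetic input; the text used in the original benchmark is not shown in this thread.
    let contents = "the quick brown fox\njumps over\nthe lazy dog\n".repeat(10_000);

    let (for_lo, for_hi) = time_range(20, || search_for("the", &contents).len());
    let (iter_lo, iter_hi) = time_range(20, || search_iter("the", &contents).len());

    println!("for:  {:?} ..= {:?}", for_lo, for_hi);
    println!("iter: {:?} ..= {:?}", iter_lo, iter_hi);
}
```

If the two printed ranges overlap heavily from run to run, that is consistent with describing the two versions as performing about the same, which is the wording this PR proposes. Build with optimizations (for example `cargo run --release`) before reading anything into the numbers.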

chriskrycho · 2025-01-06T17:35:36Z

After a bit of mulling, I have reopened this and am merging it. Thanks!

Take measurement inaccuracy intervals into account

74976a1

chriskrycho reviewed Oct 26, 2024

chriskrycho closed this Oct 26, 2024

chriskrycho reopened this Jan 6, 2025

chriskrycho merged commit 3d058ca into rust-lang:main Jan 6, 2025
6 checks passed
