Rework how blame is passed to parents #1751
Merged
Before this PR, blame would have been passed to the first parent in some of the cases where there was more than one.
This PR changes that by only removing a particular suspect from further consideration once it has been compared to all of its parents. For hunks where blame cannot be passed to any parent, we can safely assume that they were introduced by a particular suspect, so we remove those hunks from `hunks_to_blame` and create a `BlameEntry` out of them.

We can illustrate the change using the following small example history:
Let’s now say that we’re blaming a file that has the following content:
The resulting blame should look like this:
The previous version of the algorithm would have passed blame to just (2) or (3), depending on which one came first in the list of parents.
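To make the idea concrete, here is a minimal sketch of "compare a suspect to all of its parents before blaming it". It is not the implementation in this PR: `Commit`, `blame`, and `example_history` are hypothetical names, the history is a made-up diamond (not necessarily the one above), and a "hunk" is simplified to a single line that is assumed to be unique within the file and matched by content only.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical toy commit: its parents and the file's lines at that commit.
struct Commit {
    parents: Vec<u32>,
    lines: Vec<&'static str>,
}

/// Blames every line of the file at `head` and returns `line -> commit id`.
/// Assumption: commit ids increase from parents to children, so visiting the
/// highest id first means a commit is only visited after all of its children.
fn blame(history: &BTreeMap<u32, Commit>, head: u32) -> BTreeMap<&'static str, u32> {
    let mut result = BTreeMap::new();
    // Pending work: suspect -> lines that it (or one of its ancestors) may have introduced.
    let mut queue: BTreeMap<u32, BTreeSet<&'static str>> = BTreeMap::new();
    queue.insert(head, history[&head].lines.iter().copied().collect());

    while let Some((suspect, mut lines_to_blame)) = queue.pop_last() {
        // Compare the suspect to *every* parent, not just the first one.
        for parent in &history[&suspect].parents {
            let parent_lines = &history[parent].lines;
            // Lines that already exist in this parent are passed on to it;
            // the rest stay with the suspect for comparison with the next parent.
            let (passed, kept): (BTreeSet<_>, BTreeSet<_>) = lines_to_blame
                .into_iter()
                .partition(|line| parent_lines.contains(line));
            if !passed.is_empty() {
                queue.entry(*parent).or_default().extend(passed);
            }
            lines_to_blame = kept;
        }
        // No parent could take these lines, so the suspect introduced them.
        for line in lines_to_blame {
            result.insert(line, suspect);
        }
    }
    result
}

/// Hypothetical diamond history: 1 is the root, 2 and 3 each add a line,
/// and 4 merges 2 and 3 without adding anything itself.
fn example_history() -> BTreeMap<u32, Commit> {
    BTreeMap::from([
        (1, Commit { parents: vec![], lines: vec!["a"] }),
        (2, Commit { parents: vec![1], lines: vec!["a", "b"] }),
        (3, Commit { parents: vec![1], lines: vec!["a", "c"] }),
        (4, Commit { parents: vec![2, 3], lines: vec!["a", "b", "c"] }),
    ])
}
```

Calling `blame(&example_history(), 4)` in this toy model attributes `a` to commit 1, `b` to commit 2, and `c` to commit 3. Nothing is attributed to the merge commit 4, because every line could be passed to one of its parents.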
I discovered this discrepancy while working on creating a test for #1743.
Performance-wise there doesn’t seem to be a significant difference to `main`. Where there is, I think it is due to the implementation now being more correct. I haven’t dug really deep into benchmarking the changes, though.

It is possible that there are still cases that this version of the algorithm does not get right, but I think it is a step in the right direction. Let me know what you think!
Open question
Do we prefer `new_unblamed_hunk(…)` or `UnblamedHunk { … }` in tests?

Related to that: there are a couple of comments in tests that mention `range_in_destination` although, at this point, this is just the name of a local variable in `new_unblamed_hunk` (it used to be the name of one of `UnblamedHunk`’s fields).

I can certainly make this more consistent, but am slightly undecided as to whether it is worth the effort.