-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clone returned byte slice in MarshalValue #1913
base: master
Are you sure you want to change the base?
Conversation
API Change ReportNo changes found! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ragnaroek thank you for the PR! That is not expected behavior and is indeed a bug. I've confirmed this is the root cause behind another intermittent test failure we've been troubleshooting (GODRIVER-3307).
bson/marshal.go
Outdated
@@ -133,7 +134,7 @@ func MarshalValue(val interface{}) (Type, []byte, error) { | |||
return 0, nil, err | |||
} | |||
typ := sw.Next(2) | |||
return Type(typ[0]), sw.Bytes(), nil | |||
return Type(typ[0]), slices.Clone(sw.Bytes()), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Go Driver currently supports a min Go version of 1.18, which does not have a "slices"
package (it was added in Go 1.21). Instead, use the same clone logic from Marshal.
return Type(typ[0]), slices.Clone(sw.Bytes()), nil | |
buf := append([]byte(nil), sw.Bytes()...) | |
return Type(typ[0]), buf, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, already using Go 1.23. I updated the PR to copy in a pre Go 1.21 way.
func TestSharedUseOfMarshalledBytes(t *testing.T) { | ||
type testValue struct { | ||
value string | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unable to reproduce the buffer reuse bug with this test. The value
field is not exported, so the test cases seem to always marshal an empty document. How is this test intended to work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also failed to write a reproducible test for this. The test also does not fail if the Value field is made public.
To reproduced this a shared used of the byte buffer has to be provoked. As this is highly dependent on the concurrent behaviour of the byte pool, I'm not sure if this can be achieved in a stable way.
In our production code the test always failed. It was not shaky or something. Unfortunately the production code is proprietary and I cannot share it. The test in the PR tried to re-produce a minimal example of the failing test, but I failed doing so. The gist of it was that the reference to the (shared) byte buffer was stored in the 'sharedTestCase' struct and was modified by a second test. If the test were run in isolation they did not fail, only with two tests together the shared issue did surface.
If you also have no smart idea on how to reproduce this "shared buffer issue" I would remove the test and leave the PR with just the copy of the byte buffer?
No issue number yet, as for now unsure if this is intended behaviour or a bug.
Summary
This PR is not ready for merge. It's more to start a discussion around a behaviour I noticed while upgrading to v2 of the driver.
After the v2 driver upgrade unit-tests failed in our project. I took me a long time to find the issue, but I think the root cause is that 'MarshalValue' hands out byte slices that are potentially shared with the Buffers from the pool. This can lead to nasty race-conditions where the byte slices are modified in our test-setup.
I tried to reproduce the behaviour in a unit-test. But this somehow did not work. However the unit-test is a stripped down version of our unit-test in the project and should give you an idea how this leads to a problem with our tests. I can detail this further of course, if it helps to understand the issue.
The fix is easy in the end. If the byte slice is copied, the problem stops. I can do this on the usage side, but I think this shared behaviour of the returned bytes is very unintuitives and error prone in the edge-cases?
The PR proposes a copy of the bytes before returning them. Not sure that is something that is aligned with goals of the library function.
Background & Motivation
This PR fixes a very unintuitive and hard bug to find while "only" upgrading to v2 of the MongoDB driver. I assume other users of the library could also experience this and a proposal for a fix is up for a discussion.