TestUpdateDevices fails on i386 #4594
Comments
OK, thanks to @rata (who reproduced it locally) I did some more testing and was also able to reproduce this. Turns out it's quite reproducible, and the issue is, after some time of running
This can happen in any JSON, not just the one sent when passing initConfig to a child. To catch it earlier, I added this check to utils.WriteJSON:
diff --git a/libcontainer/utils/utils.go b/libcontainer/utils/utils.go
index db420ea6..58d00d38 100644
--- a/libcontainer/utils/utils.go
+++ b/libcontainer/utils/utils.go
@@ -1,7 +1,9 @@
package utils
import (
+ "bytes"
"encoding/json"
+ "fmt"
"io"
"os"
"path/filepath"
@@ -32,6 +34,10 @@ func WriteJSON(w io.Writer, v interface{}) error {
if err != nil {
return err
}
+ if bad := bytes.IndexByte(data, 0xff); bad != -1 {
+ excerpt := data[bad-16:bad+20]
+ panic(fmt.Errorf("WriteJSON: bad data at pos %d; excerpt: %s %v; original data: %+v", bad, excerpt, excerpt, v))
+ }
_, err = w.Write(data)
return err
}
Then you only need to run TestUpdateDevicesSystemd:
vagrant@vagrant:~/git/runc$ cd libcontainer/integration/
vagrant@vagrant:~/git/runc/libcontainer/integration$ GOARCH=386 CGO_ENABLED=1 go test -c .
vagrant@vagrant:~/git/runc/libcontainer/integration$ time sudo -E PATH=$PATH ./integration.test -test.run UpdateDevicesSystemd -test.count 20
and now it fails like this:
(and so on -- in different places).
Also, runc v1.2.0 is fine, while the git main HEAD is buggy. git bisect points to commit e809db8, which is PR #4563. So, apparently, something in cilium/ebpf v0.17.0 messes with Go reflection.
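One caveat about the debugging check in the diff above: the slice expression data[bad-16:bad+20] would itself panic with an out-of-range error if the 0xff byte sits within 16 bytes of the start or 20 bytes of the end of the buffer. Below is a minimal sketch of a bounds-clamped variant. The helper names excerptAround and findBadByte are hypothetical and not part of runc; the sample payload is made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// excerptAround returns up to `before` bytes preceding pos and up to
// `after` bytes starting at pos, clamped to the bounds of data.
// Hypothetical helper for illustration; it assumes 0 <= pos <= len(data).
func excerptAround(data []byte, pos, before, after int) []byte {
	start := pos - before
	if start < 0 {
		start = 0
	}
	end := pos + after
	if end > len(data) {
		end = len(data)
	}
	return data[start:end]
}

// findBadByte reports the index of the first 0xff byte in data (or -1)
// together with a clamped excerpt around it, using the same 16/20 window
// as the check in the diff.
func findBadByte(data []byte) (int, []byte) {
	bad := bytes.IndexByte(data, 0xff)
	if bad == -1 {
		return -1, nil
	}
	return bad, excerptAround(data, bad, 16, 20)
}

func main() {
	// Made-up payload: a short JSON-like buffer with four 0xff bytes in it.
	data := []byte("{\"a\":1,\xff\xff\xff\xffkey\":2}")
	pos, ex := findBadByte(data)
	fmt.Printf("bad byte at %d; excerpt: %q\n", pos, ex)
}
```

With a short buffer like the one in main, the unclamped slice would panic before the intended diagnostic is printed, which is why the clamping matters for a check meant to catch corruption at arbitrary positions.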
Latest cilium/ebpf release (v0.17.2) is still buggy. Guess I have to bisect it now.
Commit 78074c5 ("info: expose more prog jited info"), which made its way into v0.17.0, resulted in random runc CI failures on i386 (see [1]). In some cases it manifested in a panic or SIGSEGV, and in others we saw slightly broken JSON, in which the first 4 bytes of a key were replaced with 0xff bytes.

Changing uintptr (which is 32 bits wide on i386) back to uint64 fixes the issue for runc. It changes the public API but I see no way around it (and the uintptr cast of uint64 which was there before does not look correct either).

Alas, I don't have a good reproducer, nor a unit test. For a rather complicated one, see [1].

[1]: opencontainers/runc#4594

Signed-off-by: Kir Kolyshkin <[email protected]>
It's cilium/ebpf#1598 (specifically, cilium/ebpf@a34ec46) which changed ksym from
Proposed fix: cilium/ebpf#1660
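To make the width mismatch mentioned in the commit message concrete, here is a small sketch. It does not reproduce the corruption and says nothing about how cilium/ebpf uses these values internally; it only shows that on a 32-bit target uintptr holds 4 bytes, so a 64-bit value does not fit. The function truncateTo32 and the sample address are hypothetical; uint32 stands in for uintptr on GOARCH=386 so the example behaves the same on any platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// truncateTo32 mimics what converting a uint64 to uintptr does on a
// 32-bit target such as GOARCH=386: only the low 32 bits survive.
// Hypothetical helper for illustration.
func truncateTo32(v uint64) uint32 {
	return uint32(v)
}

func main() {
	// 4 on GOARCH=386, 8 on amd64/arm64.
	fmt.Println("uintptr size on this platform:", unsafe.Sizeof(uintptr(0)))
	// uint64 is always 8 bytes, regardless of platform.
	fmt.Println("uint64 size:", unsafe.Sizeof(uint64(0)))

	addr := uint64(0xffffffff81000000) // made-up 64-bit value
	fmt.Printf("%#x -> %#x\n", addr, truncateTo32(addr))
}
```

Building the same program with GOARCH=386 shows the uintptr size drop to 4, which is the size difference the proposed uint64 change avoids.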
Description
We have a CI job to compile runc and run its unit tests to make sure the code is 32-bit clean.
Recently (about a month ago), TestUpdateDevices and/or TestUpdateDevicesSystemd started to fail randomly.
The failure happens in one of two ways:
invalid character 'ÿ' looking for beginning of object key string
Log
strings.Contains
Log
Both cases look like some kind of memory corruption.
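The first failure mode can be illustrated in isolation. The sketch below feeds encoding/json a payload in which the bytes where an object key should begin have been overwritten with 0xff; the decoder then rejects the input because it expects the opening quote of a key at that position. The payload and the decode helper are hypothetical, and which exact bytes were overwritten in the real failures is an assumption here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decode attempts to parse data as a JSON object and returns any error.
// Hypothetical helper for illustration.
func decode(data []byte) error {
	var v map[string]interface{}
	return json.Unmarshal(data, &v)
}

func main() {
	good := []byte(`{"devices":[]}`)
	// Same payload with the key's opening quote and its next three bytes
	// overwritten by 0xff (an assumed corruption pattern).
	bad := []byte("{\xff\xff\xff\xffices\":[]}")

	fmt.Println("good:", decode(good))
	fmt.Println("bad: ", decode(bad))
}
```

Because the payload is valid up to the corrupted bytes, an error of this kind points at the data having been modified in memory rather than at the code that produced the JSON.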
Steps to reproduce the issue
Describe the results you received and expected
no failures
What version of runc are you using?
git HEAD
Host OS information
No response
Host kernel information
No response