Relax /sys/dev/block restrictions for volumes and devices #13708
A friendly reminder that this PR had no activity for 30 days.
Please rebase
LGTM after rebase.
It still leaks anon volumes. I've got a tmpfs based rework almost ready. Will push in the next few days.
A friendly reminder that this PR had no activity for 30 days.
Needs a rebase, or to be closed?
Friendly ping @grooverdan
I got a lot closer today https://github.com/grooverdan/podman/tree/expose-sys-dev-block-12746_v2 but progressing faster with a working dlv. I've got it generating:
Correctly, just need to complete the bind mounts in the startup.
141ce9a to 0c71250
Basic working:
libpod/runtime_ctr.go
Outdated
ctr.config.Spec.Mounts = ctr.config.Spec.Mounts[:len(ctr.config.Spec.Mounts)-1]
}
}
ctr.state.BindMounts[sysDevBlock] = dir // ??????????????
In addition to removing comments here and formatting: is moving from a Spec to a BindMount here the right way?
@mheon PTAL
This is not safe. It may work for the first time the container is run, but it will be cleared for subsequent runs. Bind mounts are regenerated each time the container is run - they're only meant to handle files Podman creates and manages itself. Why do you want to use them here?
This bind mount is for the symlinks that Podman creates itself. This seemed like a good location for a few symlinks derived from the device and volume mounts passed from the user in a place which I assume is auto-cleaned up.
$ ls -la /run/user/1000/containers/overlay-containers/cc8ca97825f48662209987fc11c34ebbea6a7c7a5ce39dde9a17cff036794b5b/userdata/sysdevblock
total 0
drwx------. 2 dan dan 100 Jul 14 17:09 .
drwx------. 4 dan dan 220 Jul 14 17:09 ..
lrwxrwxrwx. 1 dan dan 32 Jul 14 17:09 253:3 -> ../../devices/virtual/block/dm-3
lrwxrwxrwx. 1 dan dan 33 Jul 14 17:09 7:1 -> ../../devices/virtual/block/loop1
lrwxrwxrwx. 1 dan dan 33 Jul 14 17:09 7:2 -> ../../devices/virtual/block/loop2
I'm effectively using the blkMnt created in SpecGenToOCI (where visibility of Mask/unmask options exists) as a communication mechanism (was there another better var available?) to here, setupContainer, where ctr.state.RunDir exists so I can create the above dir, and makeBindMounts to mount it like resolv.conf/hosts etc (maybe c.bindMountRootFile should have been used).
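As a rough illustration of the symlink directory shown above, here is a minimal Go sketch that copies selected entries from /sys/dev/block into a scratch directory, keeping each link target unchanged. It is not the PR's code: the helper names devKey and mirrorLink, the temporary directory, and the 7:1 device number are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// devKey formats a device number the way /sys/dev/block names its entries,
// for example "253:3". (Hypothetical helper, not taken from Podman.)
func devKey(major, minor uint32) string {
	return fmt.Sprintf("%d:%d", major, minor)
}

// mirrorLink recreates the symlink srcDir/key as dstDir/key with the same
// (relative) target, so it resolves once dstDir is mounted over srcDir.
func mirrorLink(srcDir, dstDir, key string) error {
	target, err := os.Readlink(filepath.Join(srcDir, key))
	if err != nil {
		return err
	}
	return os.Symlink(target, filepath.Join(dstDir, key))
}

func main() {
	// A scratch directory stands in for the per-container run directory.
	dst, err := os.MkdirTemp("", "sysdevblock")
	if err != nil {
		fmt.Println("mkdir failed:", err)
		return
	}
	defer os.RemoveAll(dst)

	// 7:1 is an assumed example device; it may not exist on this host.
	for _, key := range []string{devKey(7, 1)} {
		if err := mirrorLink("/sys/dev/block", dst, key); err != nil {
			fmt.Println("skipping", key+":", err)
			continue
		}
		fmt.Println("created", filepath.Join(dst, key))
	}
}
```

Because the targets are relative (../../devices/...), the copied links only resolve when the directory is mounted at /sys/dev/block inside the container.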
Alright, so this is a Podman-managed mount - this does make sense, then. You should move to makeBindMounts, though. We do not guarantee stability of ctr.state.RunDir (practically speaking it should not change, but that's not a guarantee) so we force regeneration of all bind mounts after a fresh reboot - doing this in makeBindMounts accounts for that possibility, and ensures that your new symlinks aren't wiped from the DB on fresh boot.
7e415ea to cc20c2c
Basic golang question: Is adding to the runtime spec the way to preserve the state?
The OCI spec is saved to the DB, so that's certainly a way to do things. Generally, anything in
I don't suppose there's a way to get the CI here to run without pulling over the generated file. While I prepare an OCI spec update PR (if that's the right way to go still), can I get help with: how to resolve mips having a uint32 type for
And golangci-lint objecting to casting it to uint64.
Is there a way to construct this small bit of code to keep these two compilers happy?
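One pattern often used for this kind of mismatch is to keep the explicit conversion and silence the linter on that line only. The sketch below makes several assumptions: the question above is truncated and does not name the field, so syscall.Stat_t.Rdev is used as a stand-in (it is uint32 on mips and uint64 on most other Linux targets), and the complaint is assumed to come from the unconvert linter. The helper names rdev, major, and minor are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// major extracts the major number from a Linux dev_t (glibc bit layout).
func major(dev uint64) uint32 {
	return uint32((dev&0x00000000000fff00)>>8) | uint32((dev&0xfffff00000000000)>>32)
}

// minor extracts the minor number from a Linux dev_t (glibc bit layout).
func minor(dev uint64) uint32 {
	return uint32(dev&0x00000000000000ff) | uint32((dev&0x00000ffffff00000)>>12)
}

// rdev returns the device number of path widened to uint64.
func rdev(path string) (uint64, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, fmt.Errorf("no Stat_t available for %s", path)
	}
	// The conversion is needed where Rdev is uint32 and redundant where it
	// is already uint64, so the lint warning is suppressed for this line.
	return uint64(st.Rdev), nil //nolint:unconvert
}

func main() {
	dev, err := rdev("/dev/null")
	if err != nil {
		fmt.Println("stat failed:", err)
		return
	}
	fmt.Printf("%d:%d\n", major(dev), minor(dev))
}
```

The alternative is per-architecture files with build constraints, which avoids the directive at the cost of duplicating a one-line function.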
The implementation seems to be in Podman. Why do we need a change to the OCI runtime specs?
cc20c2c to 5b29214
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: grooverdan The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I'm trying to follow @mheon's advice to store the state in
User space programs want to access information about the block devices they are operating on. E.g. the block size is an important aspect if doing O_DIRECT filesystem calls.

On the other hand, rhbz#1772993 wants to keep the host information as hidden from the container running processes as possible.

We expose only the volumes and devices that are mounted into the container by re-generating the symlinks in /sys/dev/block for the block devices that have host based symlinks. These are generated on ctr.state.RunDir/sysdevblock as a mountpoint and mounted ro into the container.

The default visibility can be changed by the user with --security-opt={u,}mask=/sys/dev/block

Consolidate the libpod.mountBind implementation.

Closes #12746

Signed-off-by: Daniel Black <[email protected]>
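To illustrate the motivation in the commit message, the following sketch shows how a program inside the container might look up a device's logical block size through /sys/dev/block before doing O_DIRECT I/O. It assumes the device's queue/logical_block_size attribute is reachable through the exposed symlink (typically true for whole devices, not necessarily for partitions); the helper names and the 7:1 device number are made up for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// blockSizePath builds the sysfs path holding a device's logical block size.
// sysRoot is normally "/sys/dev/block" and key is "major:minor".
func blockSizePath(sysRoot, key string) string {
	return filepath.Join(sysRoot, key, "queue", "logical_block_size")
}

// parseBlockSize converts the sysfs file contents (such as "512\n") to an int.
func parseBlockSize(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

func main() {
	// 7:1 is an assumed device number used only for illustration.
	path := blockSizePath("/sys/dev/block", "7:1")
	raw, err := os.ReadFile(path)
	if err != nil {
		// This is the failure a program sees when the path is masked.
		fmt.Println("block size not visible:", err)
		return
	}
	size, err := parseBlockSize(string(raw))
	if err != nil {
		fmt.Println("unexpected sysfs contents:", err)
		return
	}
	fmt.Println("align O_DIRECT buffers and offsets to", size, "bytes")
}
```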
5b29214 to 190b6e2
Signed-off-by: Daniel Black <[email protected]>
190b6e2 to f8f8276
@grooverdan: PR needs rebase.
@grooverdan are you still working on this?
I got really stuck on the checkpoint/restore and how it was meant to be implemented. If you feel like taking over you have my heartfelt gratitude, otherwise I do hope to return to it within 6 months.
Could containers/common#2278 replace this one?
User space programs want to access information about the block
devices they are operating on. E.g. the block size is an important
aspect if doing O_DIRECT filesystem calls.
On the other hand, rhbz#1772993 wants to keep the host information
as hidden from the container running processes as possible.
We expose only the volumes and devices that are mounted into the
container by re-generating the symlinks in /sys/dev/block using
a temporary volume.
Closes containers/common#2277
First POC for review. Excuse any poor golang style, it's been years since I touched it.
TODO: