[podman] conmon should restart dead child processes #277
Comments
I think you'd lose active connections, but you'd lose those on the container going down too. I don't really think you can do the restart straight from Conmon, though. We really need access to the full container definition from the Podman DB to proceed.
Sure, but without a heartbeat or another check, a user wouldn't get informed about this. Can't we get a log entry at least?
Log entry is definitely viable. Container being killed is also viable. We could probably do slirp restart, but it'd require a fair bit of hacking - we'd need to be able to pass in a command for Conmon to run on slirp exit that is different and distinct from the exit command.
With a normal systemd setup, a gracefully killed container would restart, and so would slirp. Sounds good :-)
I don't think conmon should know about slirp4netns. IMO, slirp4netns should be seen as infrastructure for the container. Killing slirp4netns is equivalent to dropping the iptables rules for root containers or killing fuse-overlayfs when it is used for rootless.
Oops, I forgot to mention fuse-overlayfs in my post. Killing was an edge case, of course. I just wanted to simulate what happens if slirp4netns or fuse-overlayfs crashes by itself: will the container heal itself, will there be logs, etc. So it's even fine if the container gets stopped (or restarted). But an entry in the logs would be good, so the admin could react.
We could move slirp4netns to a separate cgroup (or at least make it configurable) so that systemd could report the failure. I wouldn't worry about fuse-overlayfs, since we are moving to native overlay support for rootless as well.
I would love to see conmon kill the container if slirp4netns and/or fuse-overlayfs exited, and then exit with an error state. Then it would be up to podman or systemd to decide whether the pod/container should restart. Could we potentially do this by passing pidfds to conmon and having conmon wait on those PIDs? If one of them exits, conmon throws an error.
I like the pidfd idea a lot.
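For anyone following along, here is a rough sketch of the pidfd idea. It is written in Python rather than conmon's C purely to show the mechanics, and it is not how conmon does or would implement this. It assumes Linux 5.3+ and Python 3.9+ (for `os.pidfd_open`). The function names `wait_for_first_exit` and `supervise` are made up for illustration, and a plain SIGTERM stands in for whatever conmon would really do to stop the container.

```python
import os
import select
import signal
import sys


def wait_for_first_exit(pids, timeout=None):
    """Block until one of `pids` exits; return that PID, or None on timeout.

    A pidfd becomes readable once the process it refers to terminates,
    so several helper processes can be watched with a single poll() call.
    """
    poller = select.poll()
    fd_to_pid = {}
    try:
        for pid in pids:
            fd = os.pidfd_open(pid)  # raises ProcessLookupError if pid is gone
            fd_to_pid[fd] = pid
            poller.register(fd, select.POLLIN)
        timeout_ms = None if timeout is None else int(timeout * 1000)
        events = poller.poll(timeout_ms)
        return fd_to_pid[events[0][0]] if events else None
    finally:
        for fd in fd_to_pid:
            os.close(fd)


def supervise(container_pid, helper_pids, timeout=None):
    """Stop the container if any helper process (hypothetically slirp4netns
    or fuse-overlayfs) exits first. Returns True if the container was signalled.
    """
    dead = wait_for_first_exit(helper_pids, timeout)
    if dead is None:
        return False  # timed out; all helpers are still running
    # Leave a log entry so the admin can react, then stop the container.
    print(f"helper process {dead} exited; stopping container {container_pid}",
          file=sys.stderr)
    os.kill(container_pid, signal.SIGTERM)
    return True
```

Called as `supervise(container_pid, [slirp_pid, fuse_pid])`, this blocks until one helper dies, logs it, and signals the container. Whether to restart would still be left to podman or systemd, as suggested above.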
What's the issue?
When killing slirp4netns, a pod or a container keeps on running without warnings, but without networking.
How to reproduce?
podman pod create --name systemd-pod
podman create --pod systemd-pod alpine top
podman create --pod systemd-pod alpine top
podman pod start systemd-pod
pkill -U tobwen 'slirp4netns'
What's expected?
What's the environment?
podman version 3.3.0-dev
conmon version 2.0.30-dev