Using CRIU with nested LXC containers #2426

Open
alexfrolov opened this issue Jun 28, 2024 · 5 comments

Comments

@alexfrolov

Hi!

I am considering using CRIU for checkpoint/restore of nested LXC containers. Do I understand correctly that in this case CRIU should be called inside the parent container?

For example, as far as I understand, CRIU uses the mount namespace of the CRIU process (/proc/self/ns/mnt) to resolve external mountpoints for the target process. This means that CRIU won't be able to work from the host level if the target process is running in a nested container. Is that correct?

Thanks,
Alex
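
For readers who want to check the situation described in the question, here is a minimal sketch that compares the mount namespace of the calling process with that of a target PID by reading the /proc/<pid>/ns/mnt symlinks. The helper names and the `proc_root` parameter are assumptions made for illustration; this is not how CRIU itself performs the lookup.

```python
import os


def mnt_ns_id(pid="self", proc_root="/proc"):
    """Return the mount-namespace identifier of a process, e.g. 'mnt:[4026531840]'."""
    # The ns/mnt entry is a symlink whose target names the namespace.
    return os.readlink(os.path.join(proc_root, str(pid), "ns", "mnt"))


def shares_mnt_ns(pid, proc_root="/proc"):
    """True if `pid` is in the same mount namespace as the calling process."""
    return mnt_ns_id("self", proc_root) == mnt_ns_id(pid, proc_root)
```

Reading the namespace link of another process usually requires root or equivalent privileges. If `shares_mnt_ns(APP_PID)` is false, paths seen by the caller are not the paths seen by the target.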

@alexfrolov alexfrolov changed the title Using CRIU with nesting LXC containers Using CRIU with nested LXC containers Jun 28, 2024
@avagin
Member

avagin commented Jun 28, 2024

For LXC, the auto mode for external mounts will not work; you need to enumerate them manually. But you still need to call CRIU from the parent container to dump the target pid namespace properly. Otherwise, it will look like two nested pid namespaces.

I don't recommend using CRIU directly to dump/restore LXC containers. It should be easier to use the LXC tools for that.
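
To see the pid namespace nesting mentioned above, one option is the NSpid line in /proc/<pid>/status (available since Linux 4.1), which lists the PID of a process in every pid namespace it belongs to, outermost first. The sketch below only parses that line; the function names are hypothetical and it says nothing about how CRIU itself inspects namespaces.

```python
def nspid_chain(status_text):
    """Return the PIDs from the NSpid line of /proc/<pid>/status, outermost first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(p) for p in line.split()[1:]]
    return []  # no NSpid line (older kernels do not expose it)


def read_nspid_chain(pid):
    """Read and parse the NSpid line for a live process."""
    with open(f"/proc/{pid}/status") as f:
        return nspid_chain(f.read())
```

Seen from the host, the init of a nested container would have three entries (host, parent container, nested container). Seen from the parent container, it would have two.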

@alexfrolov
Author

Just to clarify about auto-detection of external mounts.

When I run a non-nested LXC container, it uses external mountpoints at least for the rootfs (besides the /proc entries, which extensively use fuse.lxcfs). For example, for a Xenial-based container, /proc/1/mountinfo looks like this:

root@u3:/home/ubuntu# cat /proc/1/mountinfo | grep master
690 619 253:1 /u3/rootfs / rw,relatime master:322 - ext4 /dev/mapper/dummy--vg-dummy--lv rw
696 697 0:34 / /sys/fs/fuse/connections rw,nosuid,nodev,noexec,relatime master:18 - fusectl fusectl rw
700 692 0:42 /proc/cpuinfo /proc/cpuinfo rw,nosuid,nodev,relatime master:154 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
701 692 0:42 /proc/diskstats /proc/diskstats rw,nosuid,nodev,relatime master:154 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
702 692 0:42 /proc/loadavg /proc/loadavg rw,nosuid,nodev,relatime master:154 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
703 692 0:42 /proc/meminfo /proc/meminfo rw,nosuid,nodev,relatime master:154 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
761 692 0:42 /proc/slabinfo /proc/slabinfo rw,nosuid,nodev,relatime master:154 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
762 692 0:42 /proc/stat /proc/stat rw,nosuid,nodev,relatime master:154 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
763 692 0:42 /proc/swaps /proc/swaps rw,nosuid,nodev,relatime master:154 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
909 692 0:42 /proc/uptime /proc/uptime rw,nosuid,nodev,relatime master:154 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
910 697 0:42 /sys/devices/system/cpu /sys/devices/system/cpu rw,nosuid,nodev,relatime master:154 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
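
The listing above can also be read programmatically. Here is a minimal sketch that parses mountinfo lines according to the field layout in proc(5) and keeps the mounts that carry a `master:N` tag, which is what the `grep master` selects. The `Mount` type and function names are assumptions for illustration, and octal escapes such as `\040` in paths are not decoded.

```python
from typing import NamedTuple, Optional


class Mount(NamedTuple):
    mount_id: int
    parent_id: int
    root: str          # path inside the source filesystem that was mounted
    mount_point: str   # where it appears in this mount namespace
    fstype: str
    source: str
    master: Optional[int]  # peer group this mount is a slave of, if any


def parse_mountinfo_line(line):
    fields = line.split()
    # Optional fields start at index 6 and end at a lone "-" separator.
    sep = fields.index("-", 6)
    master = None
    for opt in fields[6:sep]:
        if opt.startswith("master:"):
            master = int(opt.split(":", 1)[1])
    return Mount(int(fields[0]), int(fields[1]), fields[3], fields[4],
                 fields[sep + 1], fields[sep + 2], master)


def slave_mounts(mountinfo_text):
    """Return only the mounts that have a master:N optional field."""
    lines = [l for l in mountinfo_text.splitlines() if l.strip()]
    return [m for m in map(parse_mountinfo_line, lines) if m.master is not None]
```

A `root` other than `/` (for example `/u3/rootfs` or `/proc/cpuinfo`) indicates a bind mount of part of a filesystem, which is typical of mounts set up from outside the container.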

So, this container can be easily dumped by the following command (APP_PID is the PID of /sbin/init in the host's pid ns):

sudo /home/ubuntu/criu/criu/criu dump --tcp-established --file-locks --link-remap --manage-cgroups=full --ext-mount-map auto --enable-external-sharing --enable-external-masters --enable-fs hugetlbfs --enable-fs tracefs -D /tmp/checkpobint-u3 -o /tmp/checkpobint-u3/dump.log --cgroup-root cpuset,cpu,io,memory,hugetlb,pids,rdma,misc:lxc.payload.u3 -v4 --ext-mount-map /sys/fs/fuse/connections:sys/fs/fuse/connections -t $APP_PID --skip-in-flight --freeze-cgroup /sys/fs/cgroup/lxc.payload.u3 --force-irmap
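
Since the command is long, here is a sketch that assembles the same flags as an argument list so that the PID, container name, and image directory can be varied. It uses only the options that appear in the command above; the function name, its parameters, and the example values in the `print` call are assumptions.

```python
import shlex


def criu_dump_argv(pid, container, images_dir, criu="criu"):
    """Build the dump command from the post as an argv list."""
    cgroup = f"lxc.payload.{container}"
    return [
        criu, "dump", "-t", str(pid),
        "-D", images_dir, "-o", f"{images_dir}/dump.log", "-v4",
        "--tcp-established", "--file-locks", "--link-remap",
        "--skip-in-flight", "--force-irmap",
        "--manage-cgroups=full",
        "--cgroup-root", f"cpuset,cpu,io,memory,hugetlb,pids,rdma,misc:{cgroup}",
        "--freeze-cgroup", f"/sys/fs/cgroup/{cgroup}",
        # External mounts: auto-detection plus one explicit mapping.
        "--ext-mount-map", "auto",
        "--enable-external-sharing", "--enable-external-masters",
        "--ext-mount-map", "/sys/fs/fuse/connections:sys/fs/fuse/connections",
        "--enable-fs", "hugetlbfs", "--enable-fs", "tracefs",
    ]


# Print a shell-quoted command line for inspection (hypothetical values).
print(shlex.join(criu_dump_argv(1234, "u3", "/tmp/images-u3")))
```

The list can be passed to `subprocess.run` as is, which avoids shell quoting problems.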

So AFAIU, the option --ext-mount-map auto looks pretty usable for LXC containers with shared parent mountpoints, or am I missing something? BTW, what is manual enumeration of the mountpoints?

I agree that calling criu directly for c/r of LXC containers is not the best thing to do, but I want to understand how things work here...

Thank you!

@avagin
Member

avagin commented Jul 2, 2024

> So AFAIU, the option --ext-mount-map auto looks pretty usable for LXC containers with shared parent mountpoints, or am I missing something? BTW, what is manual enumeration of the mountpoints?

It works well until you need to restore. I don't remember the details; it has been a long time since I last used it.

> I agree that calling criu directly for c/r of LXC containers is not the best thing to do, but I want to understand how things work here...

When LXC starts a container, it creates a mount namespace for it and mounts the rootfs and a few other mounts that depend on the container configuration (fusefs proc, external volumes, etc.). For CRIU, all these mounts will be external. Only LXC knows how to mount them properly on restore. I don't know where this code is in LXC, but you can look at runc, it should be similar:
https://github.com/opencontainers/runc/blob/main/libcontainer/criu_linux.go#L108
https://github.com/opencontainers/runc/blob/main/libcontainer/criu_linux.go#L490
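
As a rough illustration of what manual enumeration means, here is a sketch that generates matching `--ext-mount-map` arguments for dump and restore. Per the criu man page, on dump the KEY is the mountpoint inside the container and the VAL is a string stored in the image; on restore the KEY is that stored string and the VAL is the host path to bind-mount. The function name and the generated `extmntN` keys are hypothetical, and the caller (LXC or whoever launches CRIU) has to supply the host paths.

```python
def ext_mount_args(mounts):
    """Build --ext-mount-map arguments for dump and restore.

    mounts: list of (mountpoint_in_container, host_source) pairs.
    Returns (dump_args, restore_args).
    """
    dump_args, restore_args = [], []
    for i, (mountpoint, host_source) in enumerate(mounts):
        key = f"extmnt{i}"  # arbitrary label stored in the image (hypothetical scheme)
        dump_args += ["--ext-mount-map", f"{mountpoint}:{key}"]
        restore_args += ["--ext-mount-map", f"{key}:{host_source}"]
    return dump_args, restore_args
```

Because the restore side names host paths explicitly, the launcher can point each key at whatever is correct on the destination host, which auto mode cannot know.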

@rst0git
Member

rst0git commented Jul 4, 2024

> > So AFAIU, the option --ext-mount-map auto looks pretty usable for LXC containers with shared parent mountpoints, or am I missing something? BTW, what is manual enumeration of the mountpoints?
>
> It works well until you need to restore. I don't remember the details; it has been a long time since I last used it.

If I recall correctly, this option may have "undefined" behaviour when migrating a container to another host (opencontainers/runc#1968 (comment)).

github-actions bot commented Aug 4, 2024

A friendly reminder that this issue had no activity for 30 days.
