Build new notebook container images by snapshotting an interactively created prototype notebook #35

Open
jiridanek opened this issue Aug 16, 2024 · 7 comments
@jiridanek
Member

jiridanek commented Aug 16, 2024

We're exploring several options for user-friendly notebook container building. One approach, proposed at the 2024-08-15 meeting, is to let users snapshot the filesystem of their running notebook pod and upload the result to a container registry as a new image, from which many new notebook instances can then be created.

My preferred implementation idea follows what Kaniko does (https://github.com/GoogleContainerTools/kaniko/blob/main/docs/designdoc.md). The main difference is that we can (should) use multiple containers in the pod, as we don't need to run any additional commands in the container that's being snapshotted.

  1. Create an ephemeral debug container from our snapshotter's image next to the user's container, in the same pod.
  2. Create another ephemeral debug container from the same base image that the user's container uses, overriding the entrypoint so that nothing actually runs.
  3. In the snapshotter, access the filesystems of the other containers through /proc/$PID/root, compute the diff between the base image's and the user's filesystems using the naive approach described in the kaniko design doc, build a tar layer from the changed files, and either upload that layer directly or combine it with the original base image to form a new ready-to-deploy image.
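
Steps 1 and 2 could look roughly like the following. This is only a sketch of the request body that would go to the pod's `ephemeralcontainers` subresource. The container names (`snapshotter`, `base-rootfs`, `notebook`), the helper functions, and the `sleep infinity` no-op command are all made up for illustration, and the base image may not even ship `sleep`.

```python
def ephemeral_container(name, image, target=None, command=None):
    """Build one entry for a pod's spec.ephemeralContainers list."""
    spec = {"name": name, "image": image}
    if target is not None:
        # targetContainerName runs the ephemeral container in the
        # namespaces (PID, IPC, ...) of the named container.
        spec["targetContainerName"] = target
    if command is not None:
        # Overrides the image entrypoint.
        spec["command"] = list(command)
    return spec


def snapshot_patch(base_image, snapshotter_image, user_container="notebook"):
    """Body adding both helper containers (all names are hypothetical)."""
    return {
        "spec": {
            "ephemeralContainers": [
                # Step 1: the snapshotter, next to the user's container.
                ephemeral_container(
                    "snapshotter", snapshotter_image, target=user_container
                ),
                # Step 2: the pristine base image, idling so its root
                # filesystem stays reachable. Assumes `sleep` exists there.
                ephemeral_container(
                    "base-rootfs", base_image, command=["sleep", "infinity"]
                ),
            ]
        }
    }
```

Not verified here: for the snapshotter to see the processes of both other containers under /proc, the pod may additionally need `shareProcessNamespace: true`, since `targetContainerName` only names a single container.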

Skip the same directories that kaniko skips (/proc, ..., also skip any volume mounts).
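
To make the diff step concrete, here is a minimal sketch of the naive comparison, assuming both root filesystems are already reachable as plain directories (for example under /proc/$PID/root). The skip list is illustrative and is not kaniko's actual list. The function names are made up. Deleted files (whiteouts), ownership and permission changes, and symlinked directories are not handled.

```python
import filecmp
import os
import tarfile

# Assumption: illustrative top-level directories to skip; volume mounts
# would have to be added to this list as well.
SKIP_DIRS = ("proc", "sys", "dev")


def _same(base_path, user_path):
    """True if both paths hold the same content (symlinks compared by target)."""
    if os.path.islink(base_path) or os.path.islink(user_path):
        return (
            os.path.islink(base_path)
            and os.path.islink(user_path)
            and os.readlink(base_path) == os.readlink(user_path)
        )
    # shallow=False forces a byte-by-byte comparison of regular files.
    return filecmp.cmp(base_path, user_path, shallow=False)


def changed_files(base_root, user_root, skip=SKIP_DIRS):
    """Yield paths relative to user_root that are new or differ from base_root."""
    for dirpath, dirnames, filenames in os.walk(user_root):
        rel_dir = os.path.relpath(dirpath, user_root)
        if rel_dir == ".":
            rel_dir = ""
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if os.path.join(rel_dir, d) not in skip]
        for name in filenames:
            rel = os.path.join(rel_dir, name)
            base_path = os.path.join(base_root, rel)
            user_path = os.path.join(user_root, rel)
            if not os.path.lexists(base_path) or not _same(base_path, user_path):
                yield rel


def write_layer(user_root, rel_paths, out_path):
    """Write the given files into an uncompressed tar layer."""
    with tarfile.open(out_path, "w") as tar:
        for rel in sorted(rel_paths):
            tar.add(os.path.join(user_root, rel), arcname=rel, recursive=False)
```

Comparing full file contents is slow on large images. The kaniko design doc describes its own approach to detecting changes, so treat the byte-by-byte comparison here as a stand-in.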

Kubernetes itself is capable of "checkpointing" running containers (https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/). This is also an avenue to explore, but it seems to require a privileged user to perform the operation.

@jiridanek
Member Author

/label kind/enhancement project/notebooks-v2
/assign jiridanek


@jiridanek: The label(s) /label kind/enhancement cannot be applied. These labels are supported: tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, lifecycle/needs-triage. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

In response to this:

/label kind/enhancement


@thesuperzapper
Member

There are a couple of problems with your proposed approach (but they might be acceptable):

  1. It requires the credentials to push to the registry to be accessible to the user
  2. Ephemeral containers are permanent, and will continue running for the lifetime of the notebook pod (I know, I know, it's ironic given the name)
  3. Kaniko has some image compatibility problems, so I wonder if buildah mount is better.

@jiridanek
Member Author

> Ephemeral containers are permanent, and will continue running for the lifetime of the notebook pod (I know, I know, it's ironic given the name)

They will stick around, but they need not continue running. They can be in the Terminated state if a way is found to terminate them after the work is done, at which point they only take up screen space in kubectl describe and should not have any other impact, at least I hope so.
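
One way to check this would be to read the pod's `status.ephemeralContainerStatuses`. The sketch below assumes the pod object is the parsed JSON from `kubectl get pod <name> -o json`. The function names are made up.

```python
def ephemeral_container_states(pod):
    """Map each ephemeral container name to its current state name."""
    statuses = pod.get("status", {}).get("ephemeralContainerStatuses") or []
    result = {}
    for status in statuses:
        state = status.get("state") or {}
        # A container state carries one of these keys; anything else
        # is reported as "unknown".
        result[status["name"]] = next(
            (key for key in ("running", "terminated", "waiting") if key in state),
            "unknown",
        )
    return result


def all_terminated(pod):
    """True if the pod has ephemeral containers and none is still active."""
    states = ephemeral_container_states(pod)
    return bool(states) and all(s == "terminated" for s in states.values())
```

This only reports the state. It does not answer how to get the containers to terminate in the first place.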

@thesuperzapper
Member

Also interestingly, Kubernetes 1.31 added a new volume type for OCI images, which might help us do this more efficiently:

https://kubernetes.io/docs/tasks/configure-pod-container/image-volumes/
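
One possible reading of how this could help is to mount the base image read-only as a volume, instead of running a second idle container just to expose its filesystem. Below is a sketch of the pod spec fragments for that, built as plain dictionaries. The volume name, mount path, and helper function are invented for the example. It also assumes the volume is declared when the pod is created and that the cluster has the `ImageVolume` feature gate enabled (the feature is alpha in 1.31).

```python
def base_image_volume(base_image, name="base-image", mount_path="/base-rootfs"):
    """Return (volume, volume_mount) fragments for an OCI image volume."""
    volume = {
        "name": name,
        # The `image` volume source takes an OCI reference and a pull policy.
        "image": {"reference": base_image, "pullPolicy": "IfNotPresent"},
    }
    volume_mount = {
        "name": name,
        "mountPath": mount_path,
        "readOnly": True,
    }
    return volume, volume_mount
```

The volume goes into the pod's `spec.volumes`, and the mount goes into the `volumeMounts` of whichever container does the comparison. Which container that should be is left open here.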
