Add support for Checkpoints-only feature and multiple checkpoints per file #8

Open
wants to merge 9 commits into
base: master

amangarg96

This PR adds two features:

  • Checkpoints-only: only the checkpoints of files/notebooks are stored on HDFS; the files and notebooks themselves stay on the local file system.
  • Multiple checkpoints per notebook: the current ContentsManager allows only one checkpoint per notebook; this PR lifts that restriction.

Both features are inspired by PGContents, a PostgreSQL-backed ContentsManager that supports them.

For checkpoints-only, the attributes of the HDFSContentsManager class (hdfs_namenode_host, hdfs_namenode_port, root_dir, hdfs_user) were moved to the base class HDFSManagerMixin, so that both HDFSCheckpoints and HDFSContentsManager can access them.
Checkpoints are stored on HDFS under the same relative path that the Notebook has on the local file system. For example, if the Notebook server is started in <local_root> and a checkpoint is saved for <local_root>/Folder1/Notebook1.ipynb, the checkpoint is stored in <hdfs_root>/Folder1/.ipynb_checkpoints
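
The path mapping described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function name `hdfs_checkpoint_dir` and the example roots are hypothetical, and the real implementation may resolve paths differently.

```python
from pathlib import PurePosixPath


def hdfs_checkpoint_dir(local_root, notebook_path, hdfs_root,
                        checkpoint_dir=".ipynb_checkpoints"):
    """Map a local notebook path to its checkpoint directory on HDFS.

    Hypothetical helper: keeps the notebook's path relative to the local
    root and re-anchors it under the HDFS root.
    """
    # Raises ValueError if the notebook is not located under local_root.
    relative = PurePosixPath(notebook_path).relative_to(local_root)
    return str(PurePosixPath(hdfs_root) / relative.parent / checkpoint_dir)


# Example roots below are placeholders, not values from the PR.
print(hdfs_checkpoint_dir(
    "/home/user/notebooks",
    "/home/user/notebooks/Folder1/Notebook1.ipynb",
    "/user/jupyter",
))
```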

For multiple checkpoints, the default value of checkpoint_id was changed from "checkpoint" to a number that starts at 1 and increments with every new checkpoint. There is no limit on the number of checkpoints per file.
When a Notebook is deleted, all the associated checkpoints are also deleted.
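
As a rough illustration of the numbering and clean-up behaviour described above, here is a minimal in-memory sketch. The class `InMemoryCheckpoints` and its method names are hypothetical stand-ins, not the HDFSCheckpoints class from this PR, and deriving the next id from the highest existing one is an assumption.

```python
class InMemoryCheckpoints:
    """Toy checkpoint store: numeric ids per notebook, starting at 1."""

    def __init__(self):
        # Maps notebook path -> {checkpoint number: saved content}.
        self._store = {}

    def create_checkpoint(self, path, content):
        existing = self._store.setdefault(path, {})
        # Assumption: next id is one more than the highest existing id.
        next_id = max(existing, default=0) + 1
        existing[next_id] = content
        return str(next_id)

    def list_checkpoints(self, path):
        return [str(i) for i in sorted(self._store.get(path, {}))]

    def delete_all_checkpoints(self, path):
        # Called when the notebook itself is deleted.
        self._store.pop(path, None)


store = InMemoryCheckpoints()
store.create_checkpoint("Folder1/Notebook1.ipynb", "v1")
store.create_checkpoint("Folder1/Notebook1.ipynb", "v2")
print(store.list_checkpoints("Folder1/Notebook1.ipynb"))
store.delete_all_checkpoints("Folder1/Notebook1.ipynb")
print(store.list_checkpoints("Folder1/Notebook1.ipynb"))
```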

Example configuration files for using Checkpoints-only and the whole ContentsManager have been included in the examples folder.
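
For orientation, a checkpoints-only setup might look roughly like the sketch below in a Jupyter config file. This is an assumption-laden illustration and not a copy of the files in the examples folder: the module path `hdfscontents.hdfscheckpoints.HDFSCheckpoints` is hypothetical, and the host, port, user, and root values are placeholders. Only `checkpoints_class` and the four attribute names come from Jupyter and the description above.

```python
# jupyter_notebook_config.py (sketch; see the examples folder for the real files)
c = get_config()  # noqa: F821  (injected by Jupyter when it loads this file)

# Keep the default local ContentsManager, but store checkpoints on HDFS.
# The dotted path below is a hypothetical location for HDFSCheckpoints.
c.ContentsManager.checkpoints_class = "hdfscontents.hdfscheckpoints.HDFSCheckpoints"

# Attributes shared through HDFSManagerMixin; all values are placeholders.
c.HDFSCheckpoints.hdfs_namenode_host = "localhost"
c.HDFSCheckpoints.hdfs_namenode_port = 9000
c.HDFSCheckpoints.hdfs_user = "jupyter"
c.HDFSCheckpoints.root_dir = "/user/jupyter/notebooks"
```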

Note: The JupyterLab UI does not support multiple checkpoints. It shows only the most recent checkpoint.
