Set create=False for CheckpointManager if used in an eval job.
PiperOrigin-RevId: 698826017
Conchylicultor authored and The kauldron Authors committed Nov 21, 2024
1 parent adf5d3c commit 69e940d
Showing 1 changed file with 1 addition and 8 deletions.
9 changes: 1 addition & 8 deletions kauldron/main.py
@@ -54,17 +54,10 @@ def main(_):
   with _wu_error_handling(_POST_MORTEM.value):
     eval_names = _EVAL_NAMES.value
     cfg = _CONFIG.value
+    trainer: kd.train.Trainer = kd.konfig.resolve(cfg)
     if eval_names is None:
-      trainer: kd.train.Trainer = kd.konfig.resolve(cfg)
       trainer.train()
     else:
-      # Orbax does not support CheckpointManagers creating the same root
-      # directory. By setting `create=False`, we ensure that the checkpoint
-      # manager does not create a new root directory.
-      if hasattr(cfg, "checkpointer"):
-        if hasattr(cfg.checkpointer, "create"):
-          cfg.checkpointer.create = False
-      trainer: kd.train.Trainer = kd.konfig.resolve(cfg)
       trainer.continuous_eval(eval_names)
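
The `hasattr`-guarded assignment in the `else` branch of the diff above can be exercised in isolation. The following is only a minimal sketch: `disable_root_dir_creation` is a hypothetical helper name, and `types.SimpleNamespace` objects stand in for the real kauldron config, which is an assumption made purely for illustration.

```python
from types import SimpleNamespace


def disable_root_dir_creation(cfg):
  """Sets `cfg.checkpointer.create = False` if both attributes exist.

  Hypothetical helper mirroring the guarded assignment in the diff; it
  leaves configs without a `checkpointer` or `create` attribute untouched.
  """
  if hasattr(cfg, "checkpointer"):
    if hasattr(cfg.checkpointer, "create"):
      cfg.checkpointer.create = False
  return cfg


# Stand-in configs (assumed shapes, not real kauldron objects).
cfg_with_create = SimpleNamespace(checkpointer=SimpleNamespace(create=True))
cfg_without_create = SimpleNamespace(checkpointer=SimpleNamespace())
cfg_without_checkpointer = SimpleNamespace()

disable_root_dir_creation(cfg_with_create)
disable_root_dir_creation(cfg_without_create)
disable_root_dir_creation(cfg_without_checkpointer)
```

Because both checks use `hasattr`, the helper never adds a `create` attribute to a checkpointer that does not already define one, and it is a no-op when the config has no checkpointer at all.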

