fix ema checkpoint loading #156

Merged 6 commits on Jan 25, 2025

@JSabadin JSabadin commented Jan 22, 2025

The --resume flag of the train command already restores training state from a checkpoint: lr_schedulers, epoch, global_step, optimizer_steps, and state_dict (including EMA weights, if used). This PR:

  1. Fixes a bug in the EMA implementation.
  2. Ensures proper handling of the on_load_checkpoint hook, which is called before the on_fit_start hook.
  3. Introduces a new parameter trainer.resume_training to resume training from checkpoints specified in the config file (model.weights).
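
To illustrate point 2, here is a minimal, framework-free sketch of why the hook order matters for EMA. It is not the code from this PR: `ToyEMACallback`, the `ema_weights` checkpoint key, and the simplified hook signatures are all hypothetical stand-ins. The assumption is that the EMA shadow weights are only created in on_fit_start, so a state restored earlier in on_load_checkpoint has to be stashed and applied later, otherwise on_fit_start would overwrite it with a fresh copy of the model weights.

```python
from __future__ import annotations


class ToyEMACallback:
    """Hypothetical stand-in for an EMA callback (not luxonis-train's class)."""

    def __init__(self, decay: float = 0.9) -> None:
        self.decay = decay
        self.ema_weights: dict[str, float] | None = None
        self._pending_state: dict[str, float] | None = None

    def on_load_checkpoint(self, checkpoint: dict) -> None:
        # Called BEFORE on_fit_start when resuming, so the shadow weights
        # do not exist yet. Stash the saved state instead of applying it.
        if "ema_weights" in checkpoint:
            self._pending_state = dict(checkpoint["ema_weights"])

    def on_fit_start(self, model_weights: dict[str, float]) -> None:
        if self._pending_state is not None:
            # Resuming: use the restored EMA state.
            self.ema_weights = self._pending_state
            self._pending_state = None
        else:
            # Fresh run: start the EMA from the current model weights.
            self.ema_weights = dict(model_weights)

    def on_train_batch_end(self, model_weights: dict[str, float]) -> None:
        assert self.ema_weights is not None, "on_fit_start must run first"
        for name, value in model_weights.items():
            old = self.ema_weights[name]
            self.ema_weights[name] = self.decay * old + (1 - self.decay) * value

    def on_save_checkpoint(self, checkpoint: dict) -> None:
        assert self.ema_weights is not None, "on_fit_start must run first"
        checkpoint["ema_weights"] = dict(self.ema_weights)


def resume_demo() -> dict[str, float]:
    """Train one step, save, then resume in a new callback instance."""
    first = ToyEMACallback(decay=0.5)
    first.on_fit_start({"w": 0.0})
    first.on_train_batch_end({"w": 2.0})  # EMA: 0.5 * 0.0 + 0.5 * 2.0
    checkpoint: dict = {}
    first.on_save_checkpoint(checkpoint)

    resumed = ToyEMACallback(decay=0.5)
    resumed.on_load_checkpoint(checkpoint)  # fires first
    resumed.on_fit_start({"w": 2.0})        # must not clobber restored state
    assert resumed.ema_weights is not None
    return resumed.ema_weights
```

In this toy setup, `resume_demo()` returns the EMA value saved by the first run rather than the raw model weight passed to on_fit_start, which is the behavior a resumed run would want.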

@JSabadin JSabadin requested a review from a team as a code owner January 22, 2025 10:24
@JSabadin JSabadin requested review from kozlov721, klemen1999, tersekmatija and conorsim and removed request for a team January 22, 2025 10:24
@github-actions github-actions bot added the enhancement New feature or request label Jan 22, 2025
@JSabadin JSabadin changed the title from "fix ema checkpoint loading order" to "fix ema checkpoint loading" Jan 22, 2025

codecov bot commented Jan 22, 2025

Codecov Report

Attention: Patch coverage is 86.66667% with 2 lines in your changes missing coverage. Please review.

Project coverage is 95.76%. Comparing base (631b905) to head (fd09c8c).
Report is 35 commits behind head on main.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| luxonis_train/core/core.py | 33.33% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #156      +/-   ##
==========================================
- Coverage   96.31%   95.76%   -0.56%     
==========================================
  Files         147      170      +23     
  Lines        6304     7552    +1248     
==========================================
+ Hits         6072     7232    +1160     
- Misses        232      320      +88     

@github-actions github-actions bot added documentation Improvements or additions to documentation CLI Changes affecting the CLI labels Jan 22, 2025

@klemen1999 klemen1999 left a comment

LGTM

luxonis_train/__main__.py (outdated review thread, resolved)
@JSabadin JSabadin merged commit 7e4d39d into main Jan 25, 2025
7 of 9 checks passed
@JSabadin JSabadin deleted the feat/read-hyperparameters-from-checkpoint branch January 25, 2025 06:36
3 participants