fix ema checkpoint loading #156

JSabadin · 2025-01-22T10:24:03Z

The --resume flag of the train command already restores training state from a checkpoint: lr_schedulers, epoch, global_step, optimizer_steps, and state_dict (including EMA weights, if used). This PR:

- Fixes a bug in the EMA implementation.
- Handles the on_load_checkpoint hook correctly, given that it is called before the on_fit_start hook.
- Introduces a new parameter, trainer.resume_training, to resume training from the checkpoint specified in the config file (model.weights).
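
To illustrate why the hook order matters, here is a minimal, framework-free sketch. It is not the code from this PR: the class names, the `ema_state_dict` checkpoint key, and the flat dict of float parameters are all assumptions made for illustration. The point is that on_load_checkpoint runs before the EMA object exists, so the restored weights have to be stashed and applied later in on_fit_start.

```python
from typing import Dict, Optional


class EMAState:
    """Toy EMA tracker over a flat dict of float parameters (hypothetical)."""

    def __init__(self, params: Dict[str, float], decay: float = 0.9) -> None:
        self.decay = decay
        self.shadow = dict(params)  # EMA copy of the weights

    def update(self, params: Dict[str, float]) -> None:
        # Standard exponential moving average step.
        for name, value in params.items():
            self.shadow[name] = (
                self.decay * self.shadow[name] + (1.0 - self.decay) * value
            )


class EMACallbackSketch:
    """Hypothetical callback; only the hook names come from the PR text."""

    def __init__(self, decay: float = 0.9) -> None:
        self.decay = decay
        self.ema: Optional[EMAState] = None
        self._pending: Optional[Dict[str, float]] = None

    def on_load_checkpoint(self, checkpoint: Dict[str, Dict[str, float]]) -> None:
        # Called first: the EMA object does not exist yet, so only stash
        # the saved weights ("ema_state_dict" is an assumed key name).
        self._pending = checkpoint.get("ema_state_dict")

    def on_fit_start(self, params: Dict[str, float]) -> None:
        # Called second: build the EMA object, then restore stashed weights
        # so they are not overwritten by the freshly initialised copy.
        self.ema = EMAState(params, self.decay)
        if self._pending is not None:
            self.ema.shadow.update(self._pending)
            self._pending = None
```

Calling the hooks in the order described above (`on_load_checkpoint` and then `on_fit_start`) leaves `ema.shadow` holding the checkpointed values rather than a fresh copy of the model weights.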

codecov · 2025-01-22T11:29:08Z

Codecov Report

Attention: Patch coverage is 86.66667% with 2 lines in your changes missing coverage. Please review.

Project coverage is 95.76%. Comparing base (631b905) to head (fd09c8c).
Report is 35 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
luxonis_train/core/core.py	33.33%	2 Missing ⚠️
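
As a rough sanity check on the figures above, the number of changed lines can be recovered from a patch percentage and its missing-line count. This sketch assumes every changed line is either fully covered or fully missing (no partial hits), which the report does not state; the function name is made up for illustration.

```python
def changed_lines(patch_pct: float, missing: int) -> int:
    """Estimate total changed lines from patch coverage and missing lines.

    Assumes coverage = (total - missing) / total, i.e. no partial lines.
    """
    return round(missing / (1.0 - patch_pct / 100.0))


# Overall patch: 86.66667% coverage with 2 lines missing.
total_patch = changed_lines(86.66667, 2)

# luxonis_train/core/core.py: 33.33% coverage with 2 lines missing.
total_core = changed_lines(33.33, 2)
```

Under that assumption, the overall patch comes out at 15 changed lines (13 covered) and core.py at 3 changed lines (1 covered).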

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #156      +/-   ##
==========================================
- Coverage   96.31%   95.76%   -0.56%     
==========================================
  Files         147      170      +23     
  Lines        6304     7552    +1248     
==========================================
+ Hits         6072     7232    +1160     
- Misses        232      320      +88

klemen1999

LGTM

luxonis_train/__main__.py

fix ema checkpoint loading order

859b548

JSabadin requested a review from a team as a code owner January 22, 2025 10:24

JSabadin requested review from kozlov721, klemen1999, tersekmatija and conorsim and removed request for a team January 22, 2025 10:24

github-actions bot assigned JSabadin Jan 22, 2025

github-actions bot added the enhancement (New feature or request) label Jan 22, 2025

fix ema test

181635f

JSabadin changed the title from "fix ema checkpoint loading order" to "fix ema checkpoint loading" Jan 22, 2025

new resume_training param

7ffc7ef

github-actions bot added the documentation (Improvements or additions to documentation) and CLI (Changes affecting the CLI) labels Jan 22, 2025

fix failing type-check

b152a6a

kozlov721 approved these changes Jan 23, 2025

klemen1999 approved these changes Jan 23, 2025

luxonis_train/__main__.py (outdated review thread, resolved)

JSabadin added 2 commits January 23, 2025 19:10

rename to resume_weights

7fa3450

rename to resume_weights

fd09c8c

JSabadin merged commit 7e4d39d into main Jan 25, 2025
7 of 9 checks passed

JSabadin deleted the feat/read-hyperparameters-from-checkpoint branch January 25, 2025 06:36
