OCR recognition model #158

sokovninn · 2025-01-23T21:36:47Z

New OCR recognition model, loss, metric and visualizer

The most important changes are summarized below:

Losses:

Introduced CTCLoss with optional focal loss weighting in luxonis_train/attached_modules/losses/ctc_loss.py and updated __init__.py to include CTCLoss. [1] [2] [3]
Updated luxonis_train/attached_modules/losses/README.md to document CTCLoss.
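
For readers unfamiliar with focal weighting on top of CTC, here is a minimal framework-free sketch. It assumes the form often used in OCR training code, where each sample's CTC loss L is scaled by alpha * (1 - exp(-L)) ** gamma. Whether ctc_loss.py uses exactly this form or these defaults is an assumption, and the helper name focal_weighted_ctc is hypothetical.

```python
import math


def focal_weighted_ctc(per_sample_losses, alpha=1.0, gamma=2.0):
    """Scale per-sample CTC losses by a focal factor and average them.

    Hypothetical helper, not the API added in this PR. exp(-loss) is the
    probability the model assigns to the target string, so the factor
    (1 - p) ** gamma shrinks the contribution of easy samples.
    """
    weighted = []
    for loss in per_sample_losses:
        p = math.exp(-loss)  # likelihood of the target under the model
        weighted.append(alpha * (1.0 - p) ** gamma * loss)
    return sum(weighted) / len(weighted)


# An easy sample (small loss) is damped far more than a hard one.
easy, hard = 0.05, 3.0
print(focal_weighted_ctc([easy]), focal_weighted_ctc([hard]))
```

With gamma set to 0 the factor is 1 and the function reduces to a plain mean of the CTC losses, which is one way to read "optional" in the description above.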

Metrics:

Added OCRAccuracy metric for OCR tasks in luxonis_train/attached_modules/metrics/ocr_accuracy.py and updated __init__.py to include OCRAccuracy. [1] [2] [3]
Updated luxonis_train/attached_modules/metrics/README.md to document OCRAccuracy.
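
As a rough illustration of what an OCR accuracy metric measures, the sketch below pairs CTC best-path decoding (collapse repeats, then drop blanks) with exact-match accuracy. It assumes blank index 0 and an exact-match definition; the actual OCRAccuracy implementation may differ, and both function names and the alphabet are hypothetical.

```python
def ctc_greedy_decode(frame_ids, int_to_char, blank=0):
    """Collapse repeated ids, then drop blanks (CTC best-path decoding)."""
    chars = []
    prev = None
    for idx in frame_ids:
        if idx != prev and idx != blank:
            chars.append(int_to_char[idx])
        prev = idx
    return "".join(chars)


def exact_match_accuracy(predictions, targets):
    """Fraction of predicted strings that equal their target exactly."""
    if not targets:
        return 0.0
    hits = sum(p == t for p, t in zip(predictions, targets))
    return hits / len(targets)


# Hypothetical alphabet; index 0 is reserved for the CTC blank.
int_to_char = {1: "c", 2: "a", 3: "t"}
frames = [[1, 1, 0, 2, 2, 3], [1, 0, 2, 0, 0, 0]]
decoded = [ctc_greedy_decode(f, int_to_char) for f in frames]
print(decoded, exact_match_accuracy(decoded, ["cat", "cat"]))
```

A blank between two identical ids keeps both characters, so a frame sequence such as a, blank, a decodes to "aa" rather than "a".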

Visualizers:

Introduced OCRVisualizer for visualizing OCR tasks in luxonis_train/attached_modules/visualizers/ocr_visualizer.py and updated __init__.py to include OCRVisualizer. [1] [2] [3]
Updated luxonis_train/attached_modules/visualizers/README.md to document OCRVisualizer.

Predefined Models:

Added OCRRecognitionModel to luxonis_train/config/predefined_models/__init__.py and updated README.md to document its components and parameters. [1] [2] [3]

Toy dataset creation example

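import glob
import os

from tqdm import tqdm


def toy_ocr_generator():
    # Every PNG in the working directory is one sample; the file name
    # without its extension is used as the text label.
    im_paths = glob.glob("*.png")
    labels = [os.path.splitext(os.path.basename(path))[0] for path in im_paths]
    for path, label in tqdm(zip(im_paths, labels), total=len(im_paths)):
        if label:  # skip samples with an empty label
            yield {
                "file": path,
                "annotation": {
                    "metadata": {"text": label},
                },
            }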

Examples from the overfitted model on the toy dataset


sokovninn · 2025-01-23T22:38:23Z

Possible improvements include:

Adding more advanced OCR metrics
Adding a temporal NRTR head together with NRTRLoss
Improving visualization
Adding a large variant
Improving encoder to handle more edge cases
Adding a beam search decoder
Adding OCR-specific augmentations
Adding OCR detection model (same backbone)
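
To make the beam search item concrete, here is a minimal sketch of CTC prefix beam search over per-frame probabilities. It is an illustration of the general technique, not code from this PR; the function name, the alphabet mapping, and the choice of blank index 0 are assumptions.

```python
from collections import defaultdict


def ctc_beam_search(probs, alphabet, beam_width=3, blank=0):
    """CTC prefix beam search.

    probs: list of frames, each a list of probabilities indexed by class id.
    alphabet: dict mapping non-blank class ids to characters (hypothetical).
    """
    # Each prefix tracks two scores: the probability of all paths that end
    # in a blank and of all paths that end in a non-blank symbol.
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for idx, p in enumerate(frame):
                if idx == blank:
                    next_beams[prefix][0] += (p_b + p_nb) * p
                elif prefix and prefix[-1] == idx:
                    # A repeat with no blank in between collapses ...
                    next_beams[prefix][1] += p_nb * p
                    # ... while a repeat after a blank extends the prefix.
                    next_beams[prefix + (idx,)][1] += p_b * p
                else:
                    next_beams[prefix + (idx,)][1] += (p_b + p_nb) * p
        ranked = sorted(next_beams.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(scores) for prefix, scores in ranked[:beam_width]}
    best = max(beams.items(), key=lambda kv: sum(kv[1]))[0]
    return "".join(alphabet[i] for i in best)


# Blank is the most likely class in both frames, so greedy decoding would
# return an empty string, yet the summed probability of "a" is higher.
frames = [[0.4, 0.35, 0.25], [0.4, 0.35, 0.25]]
print(ctc_beam_search(frames, {1: "a", 2: "b"}))
```

The difference from greedy decoding is that probabilities of all alignments leading to the same string are summed, which matters when the per-frame argmax is frequently the blank.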

klemen1999

Generally LGTM, I left some comments. We should also verify the integration with HubAI and depthai-nodes: the archived model needs correct archive data so that the parser can parse it.

luxonis_train/attached_modules/visualizers/ocr_visualizer.py

luxonis_train/config/predefined_models/README.md

luxonis_train/config/predefined_models/ocr_recognition_model.py

luxonis_train/nodes/backbones/pplcnet_v3/blocks.py

luxonis_train/nodes/backbones/pplcnet_v3/pplcnet_v3.py

luxonis_train/nodes/heads/ocr_ctc_head.py

luxonis_train/nodes/necks/svtr_neck/blocks.py

codecov · 2025-01-27T18:19:24Z

Codecov Report

Attention: Patch coverage is 88.88889% with 78 lines in your changes missing coverage. Please review.

Project coverage is 94.23%. Comparing base (631b905) to head (32f4bfc).
Report is 40 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
luxonis_train/nodes/necks/svtr_neck/blocks.py	75.00%	37 Missing ⚠️
luxonis_train/nodes/heads/ocr_ctc_head.py	79.03%	13 Missing ⚠️
...nis_train/nodes/backbones/pplcnet_v3/pplcnet_v3.py	84.48%	9 Missing ⚠️
luxonis_train/utils/ocr.py	89.33%	8 Missing ⚠️
luxonis_train/loaders/utils.py	22.22%	7 Missing ⚠️
luxonis_train/nodes/backbones/pplcnet_v3/blocks.py	97.69%	3 Missing ⚠️
.../config/predefined_models/ocr_recognition_model.py	97.61%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #158      +/-   ##
==========================================
- Coverage   96.31%   94.23%   -2.09%     
==========================================
  Files         147      202      +55     
  Lines        6304     9349    +3045     
==========================================
+ Hits         6072     8810    +2738     
- Misses        232      539     +307


kozlov721

In general LGTM, left some small comments

configs/ocr_recognition_light_model.yaml

luxonis_train/attached_modules/metrics/ocr_accuracy.py

luxonis_train/loaders/utils.py

klemen1999

Generally LGTM, just a couple of small comments

luxonis_train/config/predefined_models/ocr_recognition_model.py

luxonis_train/nodes/README.md


sokovninn added 4 commits January 23, 2025 20:52

fix: replace slashes in inference image paths

85d1fbc

feat: add OCR recognition model

81e176e

test: simple ocr test

73c412b

fix: collate_fn handling of string lists

4ac2a30

sokovninn requested a review from a team as a code owner January 23, 2025 21:36

sokovninn requested review from kozlov721, klemen1999, tersekmatija and conorsim and removed request for a team January 23, 2025 21:36

sokovninn added 2 commits January 23, 2025 21:40

Merge branch 'main' of https://github.com/luxonis/luxonis-train into feat/ocr-recognition

38091a9

style: formatting

e78a79a

github-actions bot assigned sokovninn Jan 23, 2025

github-actions bot added the documentation and enhancement labels Jan 23, 2025

klemen1999 reviewed Jan 24, 2025


sokovninn added 13 commits January 24, 2025 15:03

fix: custom head config params

f89908c

fix: suppress misleading warning

3e38b62

fix: remove bias in LearnableRepLayer for correct reparametrization

95018e3

fix: ocr encoder char_to_int dict

865eae2

fix: ocr ctc head

81fbe23

feat: add light variant

b66d85f

refactor: refactoring

c54aa88

docs: improve READMEs

457a70f

Merge branch 'main' into feat/ocr-recognition

581a652

fix: rename model to light

2a58dfb

fix: typing

52b9602

style: formatting

6318b9b

fix: typing

d8a569b

sokovninn added 3 commits January 27, 2025 15:55

fix: different device error

0c2473d

test: exclude PPLCNetV3 from the segmentation test

76cddf5

test: add mnist dataset for ocr recognition testing

92502d8

sokovninn requested a review from klemen1999 January 27, 2025 18:29

kozlov721 approved these changes Jan 27, 2025

configs/ocr_recognition_light_model.yaml (outdated, resolved)

kozlov721 reviewed Jan 27, 2025

luxonis_train/attached_modules/metrics/ocr_accuracy.py (outdated, resolved)

kozlov721 reviewed Jan 27, 2025

luxonis_train/loaders/utils.py (outdated, resolved)

sokovninn added 4 commits January 28, 2025 16:00

feat: simplify ocr dataset and batch creation

8a397bf

fix: remove redundant masks array

d435960

fix: typing

73ceb0c

feat: add predefined alphabets

d6d50e9

sokovninn requested a review from kozlov721 January 28, 2025 17:06

klemen1999 approved these changes Jan 28, 2025

luxonis_train/config/predefined_models/ocr_recognition_model.py (outdated, resolved)

luxonis_train/nodes/README.md (outdated, resolved)

sokovninn added 2 commits January 28, 2025 21:03

docs: fix default value in README

64bd32d

docs: predefined alphabets info

823229b

klemen1999 approved these changes Jan 29, 2025


sokovninn added 6 commits January 29, 2025 13:06

test: remove text_length from test dataset generator

2f3f93d

Merge branch 'main' of https://github.com/luxonis/luxonis-train into feat/ocr-recognition

35b68ec

fix: collate_fn metadata task processing

8565723

docs: update OCR light model speed

052d61e

fix: seaborn version

44f40ba

fix: smooth bce loss generics

32f4bfc

sokovninn merged commit 7b1fccd into main Jan 29, 2025
7 of 10 checks passed

sokovninn deleted the feat/ocr-recognition branch January 29, 2025 17:10
