Releases: felixdittrich92/OnnxTR
v0.5.1
What's Changed
- Improved `result.synthesize()`
- Updated Hugging Face demo
Full Changelog: v0.5.0...v0.5.1
v0.5.0
What's Changed
New version specifiers
To further establish OnnxTR as the choice for production scenarios, two new installation options were added:
pip install "onnxtr[cpu-headless]" # same as "onnxtr[cpu]" but with opencv-headless
pip install "onnxtr[gpu-headless]" # same as "onnxtr[gpu]" but with opencv-headless
Disable page orientation classification
- If you deal with documents that contain only small rotations (~ -45 to 45 degrees), you can disable page orientation classification to speed up inference.
- This only has an effect with `assume_straight_pages=False` and/or `straighten_pages=True` and/or `detect_orientation=True`.
from onnxtr.models import ocr_predictor
model = ocr_predictor(assume_straight_pages=False, disable_page_orientation=True)
Disable crop orientation classification
- If you deal with documents that contain only horizontal text, you can disable crop orientation classification to speed up inference.
- This only has an effect with `assume_straight_pages=False` and/or `straighten_pages=True`.
from onnxtr.models import ocr_predictor
model = ocr_predictor(assume_straight_pages=False, disable_crop_orientation=True)
Loading custom exported orientation classification models
Synchronized with docTR:
from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor, mobilenet_v3_small_page_orientation, mobilenet_v3_small_crop_orientation
from onnxtr.models.classification.zoo import crop_orientation_predictor, page_orientation_predictor
custom_page_orientation_model = mobilenet_v3_small_page_orientation("<PATH_TO_CUSTOM_EXPORTED_ONNX_MODEL>")
custom_crop_orientation_model = mobilenet_v3_small_crop_orientation("<PATH_TO_CUSTOM_EXPORTED_ONNX_MODEL>")
predictor = ocr_predictor(assume_straight_pages=False, detect_orientation=True)
# Overwrite the default orientation models
predictor.crop_orientation_predictor = crop_orientation_predictor(custom_crop_orientation_model)
predictor.page_orientation_predictor = page_orientation_predictor(custom_page_orientation_model)
FP16 Support
- GPU-only feature (OnnxTR must run on GPU)
- Added a script to convert the default FP32 models to FP16 (input/output remain FP32); this further speeds up inference on GPU and lowers the required VRAM
- Script is available at: https://github.com/felixdittrich92/OnnxTR/blob/main/scripts/convert_to_float16.py
Full Changelog: v0.4.1...v0.5.0
v0.4.1
What's Changed
- Fix: `straighten_pages=True` is now also displayed correctly with `.show()`
- Added numpy 2.0 support
New Contributors
- @dependabot made their first contribution in #17
Full Changelog: v0.4.0...v0.4.1
v0.4.0
What's Changed
- Sync with current docTR state
- Hugging Face Hub integration
HuggingFace Hub integration
Now you can load and/or push models to the hub directly.
Loading
from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor, from_hub
img = DocumentFile.from_images(['<image_path>'])
# Load your model from the hub
model = from_hub('onnxtr/my-model')
# Pass it to the predictor
# If your model is a recognition model:
predictor = ocr_predictor(
det_arch='db_mobilenet_v3_large',
reco_arch=model
)
# If your model is a detection model:
predictor = ocr_predictor(
det_arch=model,
reco_arch='crnn_mobilenet_v3_small'
)
# Get your predictions
res = predictor(img)
Push
from onnxtr.models import parseq, linknet_resnet18, push_to_hf_hub, login_to_hub
from onnxtr.utils.vocabs import VOCABS
# Login to the hub
login_to_hub()
# Recognition model
model = parseq("~/onnxtr-parseq-multilingual-v1.onnx", vocab=VOCABS["multilingual"])
push_to_hf_hub(
model,
model_name="onnxtr-parseq-multilingual-v1",
task="recognition", # The task for which the model is intended [detection, recognition, classification]
arch="parseq", # The name of the model architecture
override=False # Set to `True` if you want to override an existing model / repository
)
# Detection model
model = linknet_resnet18("~/onnxtr-linknet-resnet18.onnx")
push_to_hf_hub(
model,
model_name="onnxtr-linknet-resnet18",
task="detection",
arch="linknet_resnet18",
override=True
)
HF Hub search: here.
Collection: here
Full Changelog: v0.3.2...v0.4.0
v0.3.2
What's Changed
Full Changelog: v0.3.1...v0.3.2
v0.3.1
What's Changed
- Minor configuration fix for CUDAExecutionProvider
- Adjusted default batch sizes
- Avoid initializing EngineConfig multiple times
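The EngineConfig fix follows a common caching pattern: build the configuration once and reuse it. A generic sketch of that pattern (hypothetical names, not OnnxTR's actual code):

```python
from functools import lru_cache


class EngineConfig:
    """Hypothetical stand-in for an expensive-to-build configuration object."""

    def __init__(self) -> None:
        # Imagine session options, provider lists, etc. being set up here
        self.providers = ["CPUExecutionProvider"]


@lru_cache(maxsize=1)
def get_default_engine_config() -> EngineConfig:
    # Repeated calls return the same cached instance instead of re-initializing
    return EngineConfig()
```

Every caller then shares one instance, so the construction cost is paid exactly once per process.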
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- Sync with current docTR state
- Added advanced options to configure the underlying execution engine
- Added new `db_mobilenet_v3_large` converted models (fp32 & 8-bit)
Advanced engine configuration
from onnxruntime import SessionOptions
from onnxtr.models import ocr_predictor, EngineConfig
general_options = SessionOptions() # For configuration options see: https://onnxruntime.ai/docs/api/python/api_summary.html#sessionoptions
general_options.enable_cpu_mem_arena = False
# NOTE: The following forces execution on the GPU only; if no GPU is available, it will raise an error
# List of strings e.g. ["CUDAExecutionProvider", "CPUExecutionProvider"] or a list of tuples with the provider and its options e.g.
# [("CUDAExecutionProvider", {"device_id": 0}), ("CPUExecutionProvider", {"arena_extend_strategy": "kSameAsRequested"})]
providers = [("CUDAExecutionProvider", {"device_id": 0})] # For available providers see: https://onnxruntime.ai/docs/execution-providers/
engine_config = EngineConfig(
session_options=general_options,
providers=providers
)
# We use the default predictor with the custom engine configuration
# NOTE: You can define different engine configurations for detection, recognition and classification depending on your needs
predictor = ocr_predictor(
det_engine_cfg=engine_config,
reco_engine_cfg=engine_config,
clf_engine_cfg=engine_config
)
Full Changelog: v0.2.0...v0.3.0
v0.2.0
What's Changed
- Added 8-Bit quantized models
- Added Dockerfile and CI for CPU/GPU Usage
8-Bit quantized models
8-bit quantized variants of all models were added (except the FAST models, which are already reparameterized)
from onnxtr.models import ocr_predictor, detection_predictor, recognition_predictor
predictor = ocr_predictor(det_arch="db_resnet50", reco_arch="crnn_vgg16_bn", load_in_8_bit=True)
det_predictor = detection_predictor("db_resnet50", load_in_8_bit=True)
reco_predictor = recognition_predictor("parseq", load_in_8_bit=True)
- CPU benchmarks:
| Library | FUNSD (199 pages) | CORD (900 pages) |
|---|---|---|
| docTR (CPU) - v0.8.1 | ~1.29s / Page | ~0.60s / Page |
| OnnxTR (CPU) - v0.1.2 | ~0.57s / Page | ~0.25s / Page |
| OnnxTR (CPU) 8-bit - v0.1.2 | ~0.38s / Page | ~0.14s / Page |
| EasyOCR (CPU) - v1.7.1 | ~1.96s / Page | ~1.75s / Page |
| PyTesseract (CPU) - v0.3.10 | ~0.50s / Page | ~0.52s / Page |
| Surya (line) (CPU) - v0.4.4 | ~48.76s / Page | ~35.49s / Page |
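Per-page timings like those above can be collected with a simple wall-clock loop. A minimal sketch, using a dummy stand-in for the predictor call (hypothetical, not the harness used for these benchmarks):

```python
import time


def run_predictor(page):
    # Dummy stand-in for a real OCR predictor call, e.g. predictor([page])
    return sum(range(1000))


pages = list(range(10))  # stand-in for a list of document pages
start = time.perf_counter()
for page in pages:
    run_predictor(page)
per_page = (time.perf_counter() - start) / len(pages)
print(f"~{per_page:.4f}s / Page")
```

Averaging over the whole dataset, as done for FUNSD and CORD above, smooths out per-page variance.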
v0.1.2
This release:
- Fixed some typos
- Updated the README and added a first minimal benchmark
- Cleaned up build dependencies
v0.1.1
This release:
- Split dependencies into CPU and GPU variants