Releases: felixdittrich92/OnnxTR
v0.5.1
What's Changed
- Improved `result.synthesize()`
- Updated Hugging Face demo
Full Changelog: v0.5.0...v0.5.1
v0.5.0
What's Changed
New version specifiers
To further establish OnnxTR as the choice for production scenarios, two new installation options were added:
pip install "onnxtr[cpu-headless]" # same as "onnxtr[cpu]" but with opencv-headless
pip install "onnxtr[gpu-headless]" # same as "onnxtr[gpu]" but with opencv-headless
Disable page orientation classification
- If you deal with documents that contain only small rotations (~ -45 to 45 degrees), you can disable page orientation classification to speed up inference.
- This only has an effect with `assume_straight_pages=False` and/or `straighten_pages=True` and/or `detect_orientation=True`.
from onnxtr.models import ocr_predictor
model = ocr_predictor(assume_straight_pages=False, disable_page_orientation=True)
Disable crop orientation classification
- If you deal with documents that contain only horizontal text, you can disable crop orientation classification to speed up inference.
- This only has an effect with `assume_straight_pages=False` and/or `straighten_pages=True`.
from onnxtr.models import ocr_predictor
model = ocr_predictor(assume_straight_pages=False, disable_crop_orientation=True)
Loading custom exported orientation classification models
Synchronized with docTR:
from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor, mobilenet_v3_small_page_orientation, mobilenet_v3_small_crop_orientation
from onnxtr.models.classification.zoo import crop_orientation_predictor, page_orientation_predictor
custom_page_orientation_model = mobilenet_v3_small_page_orientation("<PATH_TO_CUSTOM_EXPORTED_ONNX_MODEL>")
custom_crop_orientation_model = mobilenet_v3_small_crop_orientation("<PATH_TO_CUSTOM_EXPORTED_ONNX_MODEL>")
predictor = ocr_predictor(assume_straight_pages=False, detect_orientation=True)
# Overwrite the default orientation models
predictor.crop_orientation_predictor = crop_orientation_predictor(custom_crop_orientation_model)
predictor.page_orientation_predictor = page_orientation_predictor(custom_page_orientation_model)
FP16 Support
- GPU-only feature (OnnxTR must run on GPU)
- Added a script to convert the default FP32 models to FP16 (input/output remain FP32); this further speeds up inference on GPU and lowers the required VRAM
- Script is available at: https://github.com/felixdittrich92/OnnxTR/blob/main/scripts/convert_to_float16.py
Full Changelog: v0.4.1...v0.5.0
v0.4.1
What's Changed
- Fix: `straighten_pages=True` is now also displayed correctly with `.show()`
- Added numpy 2.0 support
New Contributors
- @dependabot made their first contribution in #17
Full Changelog: v0.4.0...v0.4.1
v0.4.0
What's Changed
- Sync with current docTR state
- Hugging Face Hub integration
HuggingFace Hub integration
Now you can load and/or push models to the hub directly.
Loading
from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor, from_hub
img = DocumentFile.from_images(['<image_path>'])
# Load your model from the hub
model = from_hub('onnxtr/my-model')
# Pass it to the predictor
# If your model is a recognition model:
predictor = ocr_predictor(
det_arch='db_mobilenet_v3_large',
reco_arch=model
)
# If your model is a detection model:
predictor = ocr_predictor(
det_arch=model,
reco_arch='crnn_mobilenet_v3_small'
)
# Get your predictions
res = predictor(img)
Push
from onnxtr.models import parseq, linknet_resnet18, push_to_hf_hub, login_to_hub
from onnxtr.utils.vocabs import VOCABS
# Login to the hub
login_to_hub()
# Recognition model
model = parseq("~/onnxtr-parseq-multilingual-v1.onnx", vocab=VOCABS["multilingual"])
push_to_hf_hub(
model,
model_name="onnxtr-parseq-multilingual-v1",
task="recognition", # The task for which the model is intended [detection, recognition, classification]
arch="parseq", # The name of the model architecture
override=False # Set to `True` if you want to override an existing model / repository
)
# Detection model
model = linknet_resnet18("~/onnxtr-linknet-resnet18.onnx")
push_to_hf_hub(
model,
model_name="onnxtr-linknet-resnet18",
task="detection",
arch="linknet_resnet18",
override=True
)
HF Hub search: here.
Collection: here
Full Changelog: v0.3.2...v0.4.0
v0.3.2
What's Changed
Full Changelog: v0.3.1...v0.3.2
v0.3.1
What's Changed
- Minor configuration fix for CUDAExecutionProvider
- Adjusted default batch sizes
- Avoid initializing EngineConfig multiple times
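The EngineConfig fix follows a common caching pattern: build the configuration once and reuse it. A generic sketch of that pattern (hypothetical names, not OnnxTR's actual code):

```python
from functools import lru_cache


class EngineConfig:
    """Hypothetical stand-in for an expensive-to-build configuration object."""

    def __init__(self) -> None:
        # Imagine session options, provider lists, etc. being set up here
        self.providers = ["CPUExecutionProvider"]


@lru_cache(maxsize=1)
def get_default_engine_config() -> EngineConfig:
    # Repeated calls return the same cached instance instead of re-initializing
    return EngineConfig()
```

Every caller then shares one instance, so the construction cost is paid exactly once per process.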
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- Sync with current docTR state
- Added advanced options to configure the underlying execution engine
- Added new `db_mobilenet_v3_large` converted models (fp32 & 8-bit)
Advanced engine configuration
from onnxruntime import SessionOptions
from onnxtr.models import ocr_predictor, EngineConfig
general_options = SessionOptions() # For configuration options see: https://onnxruntime.ai/docs/api/python/api_summary.html#sessionoptions
general_options.enable_cpu_mem_arena = False
# NOTE: The following forces execution on the GPU only; if no GPU is available, it will raise an error
# List of strings e.g. ["CUDAExecutionProvider", "CPUExecutionProvider"] or a list of tuples with the provider and its options e.g.
# [("CUDAExecutionProvider", {"device_id": 0}), ("CPUExecutionProvider", {"arena_extend_strategy": "kSameAsRequested"})]
providers = [("CUDAExecutionProvider", {"device_id": 0})] # For available providers see: https://onnxruntime.ai/docs/execution-providers/
engine_config = EngineConfig(
session_options=general_options,
providers=providers
)
# We use the default predictor with the custom engine configuration
# NOTE: You can define different engine configurations for detection, recognition and classification depending on your needs
predictor = ocr_predictor(
det_engine_cfg=engine_config,
reco_engine_cfg=engine_config,
clf_engine_cfg=engine_config
)
Full Changelog: v0.2.0...v0.3.0
v0.2.0
What's Changed
- Added 8-Bit quantized models
- Added Dockerfile and CI for CPU/GPU Usage
8-Bit quantized models
8-bit quantized variants of all models were added (except the FAST models, which are already reparameterized)
from onnxtr.models import ocr_predictor, detection_predictor, recognition_predictor
predictor = ocr_predictor(det_arch="db_resnet50", reco_arch="crnn_vgg16_bn", load_in_8_bit=True)
det_predictor = detection_predictor("db_resnet50", load_in_8_bit=True)
reco_predictor = recognition_predictor("parseq", load_in_8_bit=True)
- CPU benchmarks:
| Library | FUNSD (199 pages) | CORD (900 pages) |
|---|---|---|
| docTR (CPU) - v0.8.1 | ~1.29s / Page | ~0.60s / Page |
| OnnxTR (CPU) - v0.1.2 | ~0.57s / Page | ~0.25s / Page |
| OnnxTR (CPU) 8-bit - v0.1.2 | ~0.38s / Page | ~0.14s / Page |
| EasyOCR (CPU) - v1.7.1 | ~1.96s / Page | ~1.75s / Page |
| PyTesseract (CPU) - v0.3.10 | ~0.50s / Page | ~0.52s / Page |
| Surya (line) (CPU) - v0.4.4 | ~48.76s / Page | ~35.49s / Page |
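Per-page timings like those above can be collected with a simple wall-clock loop. A minimal sketch, using a dummy stand-in for the predictor call (hypothetical, not the harness used for these benchmarks):

```python
import time


def run_predictor(page):
    # Dummy stand-in for a real OCR predictor call, e.g. predictor([page])
    return sum(range(1000))


pages = list(range(10))  # stand-in for a list of document pages
start = time.perf_counter()
for page in pages:
    run_predictor(page)
per_page = (time.perf_counter() - start) / len(pages)
print(f"~{per_page:.4f}s / Page")
```

Averaging over the whole dataset, as done for FUNSD and CORD above, smooths out per-page variance.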
v0.1.2
This release:
- Fixed some typos
- Updated the README and added a first minimal benchmark
- Cleaned up build dependencies
v0.1.1
This release:
- Split dependencies into CPU and GPU variants