Note: This tutorial mainly introduces the usage of the PP-OCR series models. For quick use of the document analysis functions, please refer to PP-Structure Quick Start.
If you do not have a Python environment, please refer to Environment Preparation.
- If you have CUDA 9 or CUDA 10 installed on your machine, run the following command to install the GPU version:
python3 -m pip install paddlepaddle-gpu
- If you have no available GPU on your machine, run the following command to install the CPU version:
python3 -m pip install paddlepaddle
For other software version requirements, please refer to the Installation Document.
pip install "paddleocr>=2.0.1" # version 2.0.1+ is recommended
- For Windows users: if you get the following error when installing shapely on Windows,
OSError: [WinError 126] The specified module could not be found
try downloading the Shapely whl file here. Reference: Solve shapely installation on windows
- For layout analysis users, run the following command to install Layout-Parser:
pip3 install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
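After running the install commands above, it can help to confirm that the packages are visible to the interpreter you will actually use. The following is a minimal sketch, not part of PaddleOCR: `installed_packages` is a hypothetical helper, and the import names (`paddle`, `paddleocr`, `shapely`, `layoutparser`) are assumed to correspond to the pip packages installed above. It only looks the modules up and does not import them, so a broken shapely build would not raise here.

```python
from importlib.util import find_spec

# Import names assumed for the packages installed above:
# paddlepaddle -> paddle, paddleocr -> paddleocr,
# Shapely -> shapely, Layout-Parser -> layoutparser.
DEFAULT_MODULES = ("paddle", "paddleocr", "shapely", "layoutparser")


def installed_packages(modules=DEFAULT_MODULES):
    """Return {module_name: bool} by locating each module without importing it."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in installed_packages().items():
        print(f"{name:<14} {'found' if found else 'MISSING'}")
```

A module reported as missing usually means pip installed into a different Python environment than the one running the script.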
PaddleOCR provides a series of test images. Click here to download them, then switch to the corresponding directory in the terminal:
cd /path/to/ppocr_img
If you do not use the provided test images, replace the --image_dir parameter in the following commands with the path to your own test image.
- Detection, direction classification and recognition: set the parameter --use_gpu false to disable the GPU device
paddleocr --image_dir ./imgs_en/img_12.jpg --use_angle_cls true --lang en --use_gpu false
The output is a list; each item contains a bounding box, the text and the recognition confidence:
[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]
[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]
[[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]
......
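If the command-line output is captured to a file or pipe, each result item can be turned back into Python objects. The sketch below assumes every captured line is a bare Python literal in exactly the shape printed above; any logging prefix that a given paddleocr version may add would need to be stripped first. `parse_result_line` is a hypothetical helper name, not a PaddleOCR function.

```python
import ast


def parse_result_line(line):
    """Parse one printed item '[box, (text, score)]' into (box, text, score).

    ast.literal_eval only evaluates literals, so it is safe on untrusted text.
    """
    box, (text, score) = ast.literal_eval(line.strip())
    return box, text, score


# Sample line copied from the output shown above.
sample = (
    "[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], "
    "('ACKNOWLEDGEMENTS', 0.9971134662628174)]"
)

box, text, score = parse_result_line(sample)
print(text, score)
print("top-left corner:", box[0])
```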
- Only detection: set --rec to false
paddleocr --image_dir ./imgs_en/img_12.jpg --rec false
The output is a list; each item contains only a bounding box:
[[397.0, 802.0], [1092.0, 802.0], [1092.0, 841.0], [397.0, 841.0]]
[[397.0, 750.0], [1211.0, 750.0], [1211.0, 789.0], [397.0, 789.0]]
[[397.0, 702.0], [1209.0, 698.0], [1209.0, 734.0], [397.0, 738.0]]
......
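Each detected box is a list of four corner points, and the corners need not form an axis-aligned rectangle (the third box above is slightly tilted). For cropping or sorting, an axis-aligned rectangle is often more convenient. The following sketch shows one way to derive it; `quad_to_rect` is a hypothetical helper, and the coordinates are the three boxes from the sample output.

```python
def quad_to_rect(quad):
    """Convert four [x, y] corner points to (x_min, y_min, x_max, y_max)."""
    xs = [point[0] for point in quad]
    ys = [point[1] for point in quad]
    return min(xs), min(ys), max(xs), max(ys)


# Boxes copied from the detection-only output shown above.
boxes = [
    [[397.0, 802.0], [1092.0, 802.0], [1092.0, 841.0], [397.0, 841.0]],
    [[397.0, 750.0], [1211.0, 750.0], [1211.0, 789.0], [397.0, 789.0]],
    [[397.0, 702.0], [1209.0, 698.0], [1209.0, 734.0], [397.0, 738.0]],
]

# Sort top-to-bottom, then left-to-right, using the rectangle's top-left corner.
rects = sorted((quad_to_rect(q) for q in boxes), key=lambda r: (r[1], r[0]))
for rect in rects:
    print(rect)
```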
- Only recognition: set --det to false
paddleocr --image_dir ./imgs_words_en/word_10.png --det false --lang en
The output is a list; each item contains the text and the recognition confidence:
['PAIN', 0.9934559464454651]
If you need to use the 2.0 model, specify the parameter --ocr_version PP-OCR. paddleocr uses the PP-OCRv3 model by default (--ocr_version PP-OCRv3). More whl package usage can be found in whl package.
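When the command-line tool is driven from a script, the flag combinations above can be assembled programmatically. The following is a rough sketch under the assumption that only the flags shown in this section are needed; `build_paddleocr_command` is a hypothetical helper, not part of the paddleocr package, and booleans are rendered as the `true`/`false` strings used in the commands above.

```python
import shlex


def build_paddleocr_command(image_dir, use_angle_cls=None, lang=None,
                            use_gpu=None, det=None, rec=None, ocr_version=None):
    """Build an argv list for the paddleocr CLI; options left as None are omitted."""
    cmd = ["paddleocr", "--image_dir", image_dir]
    options = {
        "--use_angle_cls": use_angle_cls,
        "--lang": lang,
        "--use_gpu": use_gpu,
        "--det": det,
        "--rec": rec,
        "--ocr_version": ocr_version,
    }
    for flag, value in options.items():
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd += [flag, str(value)]
    return cmd


cmd = build_paddleocr_command(
    "./imgs_en/img_12.jpg", use_angle_cls=True, lang="en", use_gpu=False
)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd)`, which avoids shell quoting problems with image paths that contain spaces.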
PaddleOCR currently supports 80 languages, which can be switched by modifying the --lang parameter.
paddleocr --image_dir ./doc/imgs_en/254.jpg --lang=en
[[[67.0, 51.0], [327.0, 46.0], [327.0, 74.0], [68.0, 80.0]], ('PHOCAPITAL', 0.9944712519645691)]
[[[72.0, 92.0], [453.0, 84.0], [454.0, 114.0], [73.0, 122.0]], ('107 State Street', 0.9744491577148438)]
[[[69.0, 135.0], [501.0, 125.0], [501.0, 156.0], [70.0, 165.0]], ('Montpelier Vermont', 0.9357033967971802)]
......
Commonly used multilingual abbreviations include:
Language | Abbreviation | Language | Abbreviation | Language | Abbreviation
---|---|---|---|---|---
Chinese & English | ch | French | fr | Japanese | japan
English | en | German | german | Korean | korean
Chinese Traditional | chinese_cht | Italian | it | Russian | ru
A list of all languages and their corresponding abbreviations can be found in Multi-Language Model Tutorial
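Because the abbreviations are not uniform (`fr` and `it` are two-letter codes, while `german` and `japan` are not), a small lookup table can prevent typos when the language is chosen at runtime. The sketch below covers only the nine entries in the table above; `LANG_CODES` and `lang_code` are hypothetical names, not part of PaddleOCR.

```python
# Language name -> value for the --lang / lang parameter, from the table above.
LANG_CODES = {
    "chinese & english": "ch",
    "english": "en",
    "chinese traditional": "chinese_cht",
    "french": "fr",
    "german": "german",
    "italian": "it",
    "japanese": "japan",
    "korean": "korean",
    "russian": "ru",
}


def lang_code(language):
    """Look up the abbreviation for a language name, ignoring case and outer spaces."""
    key = language.strip().lower()
    if key not in LANG_CODES:
        raise KeyError(
            f"{language!r} is not in the short table; "
            "see the Multi-Language Model Tutorial for the full list"
        )
    return LANG_CODES[key]


print(lang_code("German"))
print(lang_code("Japanese"))
```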
- Detection, angle classification and recognition:
from paddleocr import PaddleOCR,draw_ocr
# PaddleOCR supports many languages, e.g. Chinese, English, French, German, Korean and Japanese.
# Set the parameter `lang` to `ch`, `en`, `fr`, `german`, `korean` or `japan` respectively
# to switch the language model.
ocr = PaddleOCR(use_angle_cls=True, lang='en') # only needs to run once to download the model and load it into memory
img_path = './imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)
# draw result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
The output is a list; each item contains a bounding box, the text and the recognition confidence:
[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]
[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]
[[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]
......
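A common next step is to keep only the lines recognized with high confidence. The following sketch assumes `result` has exactly the shape printed above, a flat list of `[box, (text, score)]` items; `confident_text` is a hypothetical helper, and the sample data is copied from the output so the block runs without PaddleOCR installed.

```python
def confident_text(result, min_score=0.9):
    """Return the recognized strings whose confidence is at least min_score."""
    return [text for _box, (text, score) in result if score >= min_score]


# Items copied from the sample output above.
result = [
    [[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]],
     ('ACKNOWLEDGEMENTS', 0.9971134662628174)],
    [[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]],
     ('We would like to thank all the designers and', 0.9761400818824768)],
    [[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]],
     ('contributors who have been involved in the', 0.9791957139968872)],
]

for text in confident_text(result, min_score=0.978):
    print(text)
```

The threshold is application-specific; 0.978 is used here only so that the sample data shows one line being filtered out.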
Visualization of results
In this section, you have mastered the use of the PaddleOCR whl package.
PaddleOCR is a rich and practical OCR tool library that covers the whole process of data production, model training, compression, inference and deployment. Please refer to the tutorials to start your journey with PaddleOCR.