- 1. Introduction
- 2. Environment
- 3. Model Training / Evaluation / Prediction
- 4. Inference and Deployment
- 5. FAQ
Paper:
SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition
Qiao, Zhi and Zhou, Yu and Yang, Dongbao and Zhou, Yucan and Wang, Weiping
CVPR, 2020
Using MJSynth and SynthText two text recognition datasets for training, and evaluating on IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE datasets, the algorithm reproduction effect is as follows:
Model | Backbone | ACC | config | Download link |
---|---|---|---|---|
SEED | Aster_Resnet | 85.2% | configs/rec/rec_resnet_stn_bilstm_att.yml | 训练模型 |
Please refer to "Environment Preparation" to configure the PaddleOCR environment, and refer to "Project Clone" to clone the project code.
Please refer to Text Recognition Tutorial. PaddleOCR modularizes the code, and training different recognition models only requires changing the configuration file.
Training:
The SEED model needs to additionally load the language model trained by FastText, and install the fasttext dependencies:
python3 -m pip install fasttext==0.9.1
Specifically, after the data preparation is completed, the training can be started. The training command is as follows:
#Single GPU training (long training period, not recommended)
python3 tools/train.py -c configs/rec/rec_resnet_stn_bilstm_att.yml
#Multi GPU training, specify the gpu number through the --gpus parameter
python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c rec_resnet_stn_bilstm_att.yml
Evaluation:
# GPU evaluation
python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_resnet_stn_bilstm_att.yml -o Global.pretrained_model={path/to/weights}/best_accuracy
Prediction:
# The configuration file used for prediction must match the training
python3 tools/infer_rec.py -c configs/rec/rec_resnet_stn_bilstm_att.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.infer_img=doc/imgs_words/en/word_1.png
Not support
Not support
Not support
Not support
@inproceedings{qiao2020seed,
title={Seed: Semantics enhanced encoder-decoder framework for scene text recognition},
author={Qiao, Zhi and Zhou, Yu and Yang, Dongbao and Zhou, Yucan and Wang, Weiping},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={13528--13537},
year={2020}
}