This project demonstrates how to build a deep neural network with the Connectionist Temporal Classification (CTC) loss function for reading captchas. The model is based on the convolutional recurrent neural network (CRNN) from the paper An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition (2015) by Baoguang Shi, Xiang Bai & Cong Yao [4]. The architecture of the CRNN model used in this project is shown in the accompanying figure.
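To make the role of CTC more concrete, here is a minimal sketch of best-path (greedy) CTC decoding in plain Python: take the most likely class at each time step, merge consecutive repeats, then drop the blank symbol. It is an illustration only, not the code used in this repository. The function name `ctc_greedy_decode`, the helper `one_hot`, the digit-only alphabet, and the choice of the last class index as the blank are all assumptions made for this example.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, alphabet):
    """Best-path CTC decoding of a list of per-frame class scores.

    Assumption: classes 0..len(alphabet)-1 map to characters and the
    extra last class (index len(alphabet)) is the CTC blank.
    """
    blank = len(alphabet)
    # 1. Pick the highest-scoring class at every time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    # 2. Merge consecutive repeats of the same class.
    collapsed = [idx for idx, _ in groupby(best_path)]
    # 3. Drop blanks and map the remaining indices to characters.
    return "".join(alphabet[idx] for idx in collapsed if idx != blank)


def one_hot(index, size):
    """Hypothetical helper: a score row that puts all mass on one class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


if __name__ == "__main__":
    alphabet = "0123456789"  # assumption: digit-only captchas
    path = [7, 7, 10, 0, 0, 10, 4, 1, 1]  # 10 is the blank
    frames = [one_hot(i, len(alphabet) + 1) for i in path]
    print(ctc_greedy_decode(frames, alphabet))  # prints 7041
```

Note that a blank between two identical classes keeps them separate, so the path `[7, 10, 7]` decodes to `77` while `[7, 7]` decodes to `7`.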
The dataset contains 1,800 different captcha images of size 104 x 24, each with a maximum length of 4 characters.
Image source: Taiwan Insurance Institute
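As a rough sketch of how such labels could be prepared for training, the snippet below turns an image file name into a fixed-length list of class indices. It assumes that the file name (without extension) is the captcha text, that the captchas contain digits only, and that shorter labels are padded with `-1`. These are assumptions for illustration; the names `encode_label`, `ALPHABET`, and `PAD` are hypothetical and not taken from this repository.

```python
from pathlib import Path

MAX_LEN = 4              # maximum captcha length in this dataset
ALPHABET = "0123456789"  # assumption: digit-only captchas
PAD = -1                 # hypothetical padding value for short labels


def encode_label(image_path, alphabet=ALPHABET, max_len=MAX_LEN, pad=PAD):
    """Map a file name such as '7041.png' to a padded list of class indices."""
    text = Path(image_path).stem  # assumption: the file name is the label
    if len(text) > max_len:
        raise ValueError(f"label {text!r} is longer than {max_len} characters")
    # str.index raises ValueError for characters outside the alphabet.
    indices = [alphabet.index(ch) for ch in text]
    return indices + [pad] * (max_len - len(indices))


if __name__ == "__main__":
    print(encode_label("7041.png"))  # prints [7, 0, 4, 1]
```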
The CRNN model is evaluated on the testing dataset (100 different captcha images in total), which was never used in training.
- Only one false prediction: 7041.png is predicted as 7091.
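With one error out of 100 test images, the exact-match accuracy is 99/100 = 99%. A minimal sketch of that calculation is shown below. The function names are hypothetical, and apart from the 7041 / 7091 pair reported above, the label lists are placeholder data, not the actual test set.

```python
def exact_match_accuracy(labels, predictions):
    """Fraction of captchas whose predicted text equals the label exactly."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(t == p for t, p in zip(labels, predictions))
    return correct / len(labels)


def mismatches(labels, predictions):
    """List the (label, prediction) pairs that differ."""
    return [(t, p) for t, p in zip(labels, predictions) if t != p]


if __name__ == "__main__":
    # Placeholder data: 99 correct predictions plus the one reported error.
    labels = [f"{i:04d}" for i in range(99)] + ["7041"]
    predictions = labels[:99] + ["7091"]
    print(exact_match_accuracy(labels, predictions))  # prints 0.99
    print(mismatches(labels, predictions))  # prints [('7041', '7091')]
```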
- Local Version
- Colab Version
[1] A_K_Nain (2020), OCR model for reading Captchas
[2] Awni Hannun (2017), Sequence Modeling with CTC
[3] Understanding CTC loss for speech recognition in Keras
[4] Baoguang Shi, Xiang Bai & Cong Yao (2015), An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
[5] Alex Graves, Santiago Fernández, Faustino Gomez & Jürgen Schmidhuber (2006), Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks
- © Tom Wu (GitHub)
Please cite this repository, CRNN_with_CTC_Loss, if you use it.