Dataset | Year | Category | Source | Task | Language |
---|---|---|---|---|---|
ICDAR 2017 DOST | 2017 | Scene text | Video | Localization/Tracking/Recognition | English/Japanese |
USTB-VidTEXT | 2016 | Embedded caption | Video | Localization/Recognition | English/Chinese |
ICDAR 2015 Text in Videos | 2015 | Scene text | Video | Localization/Tracking/Recognition | English/Spanish/French/Japanese |
YouTube Video | 2014 | Embedded caption/Scene text | Video | Localization/Tracking/Recognition | English |
Merino-Gracia | 2014 | Scene text | Video | Tracking | English |
ICDAR 2013 Text in Videos | 2013 | Scene text | Video | Localization/Tracking/Recognition | English/Spanish/French/Japanese |
Minetto | 2011 | Scene text | Video | Localization/Tracking/Recognition | English |
SVT | 2010 | Scene text | Video frames | Localization/Recognition | English |
TREC | 2002 | Embedded caption/Scene text | Video frames | Search | English |

Demo images of the USTB-VidTEXT dataset.

[1] Xu-Cheng Yin, Ze-Yu Zuo, Shu Tian, and Cheng-Lin Liu, "Text Detection, Tracking and Recognition in Video: A Comprehensive Survey," IEEE Transactions on Image Processing, vol. 25, no. 6, pp. 2752-2773, June 2016.