Prepare datasets

We recommend symlinking the dataset root to $PAN_PP.PYTORCH/data. If your folder structure differs from the layout below, update the corresponding paths in the dataloader files.

pan_pp.pytorch
└── data
    ├── CTW1500
    │   ├── train
    │   │   ├── text_image
    │   │   └── text_label_curve
    │   └── test
    │       ├── text_image
    │       └── text_label_curve
    ├── total_text
    │   ├── Images
    │   │   ├── Train
    │   │   └── Test
    │   └── Groundtruth
    │       ├── Polygon
    │       └── Rectangular
    ├── ICDAR2015
    │   └── Challenge4
    │       ├── ch4_training_images
    │       ├── ch4_training_localization_transcription_gt
    │       ├── ch4_test_images
    │       └── ch4_test_localization_transcription_gt
    ├── MSRA-TD500
    │   ├── train
    │   └── test
    ├── HUST-TR400
    ├── COCO-Text
    │   └── train2014
    ├── SynthText
    │   ├── 1
    │   ├── ...
    │   └── 200
    └── ICDAR2017-MLT
        ├── ch8_training_images
        ├── ch8_validation_images
        ├── ch8_training_localization_transcription_gt_v2
        └── ch8_validation_localization_transcription_gt_v2
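
The symlink step and the layout above can be sketched in Python as follows. This is a minimal sketch, not part of the repository: the helper names `link_data_root` and `missing_dirs` and the `EXPECTED` table are hypothetical, and `EXPECTED` covers only a subset of the tree shown above. It assumes a platform where creating symlinks is permitted.

```python
import os
from pathlib import Path

# Hypothetical subset of the layout above: dataset folder -> required subfolders.
EXPECTED = {
    "CTW1500": [
        "train/text_image",
        "train/text_label_curve",
        "test/text_image",
        "test/text_label_curve",
    ],
    "total_text": [
        "Images/Train",
        "Images/Test",
        "Groundtruth/Polygon",
        "Groundtruth/Rectangular",
    ],
    "ICDAR2015": [
        "Challenge4/ch4_training_images",
        "Challenge4/ch4_training_localization_transcription_gt",
        "Challenge4/ch4_test_images",
        "Challenge4/ch4_test_localization_transcription_gt",
    ],
    "MSRA-TD500": ["train", "test"],
}


def link_data_root(dataset_root, repo_root):
    """Create <repo_root>/data as a symlink to dataset_root.

    If <repo_root>/data already exists (as a link or a real folder),
    it is left untouched and returned as-is.
    """
    link = Path(repo_root) / "data"
    if link.is_symlink() or link.exists():
        return link
    os.symlink(Path(dataset_root).resolve(), link, target_is_directory=True)
    return link


def missing_dirs(data_dir, expected=EXPECTED):
    """Return the expected subfolders (relative paths) that are absent."""
    data_dir = Path(data_dir)
    return [
        f"{name}/{sub}"
        for name, subs in expected.items()
        for sub in subs
        if not (data_dir / name / sub).is_dir()
    ]
```

For example, `link_data_root("/path/to/datasets", "/path/to/pan_pp.pytorch")` followed by `missing_dirs("/path/to/pan_pp.pytorch/data")` lists any folders that still need to be downloaded or renamed; both paths here are placeholders.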

Download

These datasets can be downloaded from the following links:

CTW1500 [dataset]
Total-Text [image] [gt]
ICDAR2015 [dataset]
MSRA-TD500 [dataset]
HUST-TR400 [dataset]
SynthText [dataset]
ICDAR2017-MLT [dataset]
COCO-Text [dataset]
