Create a directory to store reid datasets under this repo via

    cd deep-person-reid/
    mkdir data/

If you want to store datasets in another directory, you need to specify `--root path_to_your/data` when running the training code. Please follow the instructions below to prepare each dataset. After that, you can simply pass `-d the_dataset` when running the training code.

Please do not pass an image dataset to the video reid scripts, or vice versa; otherwise an error will occur.
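For orientation, the snippet below is a minimal, hypothetical sanity check (not part of this repo) that verifies the expected dataset folder exists under your data root before you launch training. The helper name and folder names are illustrative only; the actual layout for each dataset is documented in the sections that follow.

```python
import os.path as osp

def check_dataset_dir(root, dirname):
    """Raise early if a dataset folder is missing under the data root.

    `root` is whatever you pass via --root (default: data/), and `dirname`
    is the top-level folder listed in the corresponding section below,
    e.g. 'market1501' or 'dukemtmc-reid'. Illustrative helper only.
    """
    dataset_dir = osp.join(root, dirname)
    if not osp.isdir(dataset_dir):
        raise RuntimeError(
            '"{}" not found; please follow the preparation steps below'.format(dataset_dir)
        )
    return dataset_dir

print(check_dataset_dir('data', 'market1501'))  # -> data/market1501
```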
Market1501 [7]:
- Download dataset to `data/` from http://www.liangzheng.org/Project/project_reid.html.
- Extract dataset and rename to `market1501`. The data structure would look like:

    market1501/
        bounding_box_test/
        bounding_box_train/
        ...

- Use `-d market1501` when running the training code.
CUHK03 [13]:
- Create a folder named `cuhk03/` under `data/`.
- Download dataset to `data/cuhk03/` from http://www.ee.cuhk.edu.hk/~xgwang/CUHK_identification.html and extract `cuhk03_release.zip`, so you will have `data/cuhk03/cuhk03_release`.
- Download the new split [14] from person-re-ranking. What you need are `cuhk03_new_protocol_config_detected.mat` and `cuhk03_new_protocol_config_labeled.mat`. Put these two mat files under `data/cuhk03`. Finally, the data structure would look like:

    cuhk03/
        cuhk03_release/
        cuhk03_new_protocol_config_detected.mat
        cuhk03_new_protocol_config_labeled.mat
        ...

- Use `-d cuhk03` when running the training code. In default mode, we use the new split (767/700). If you want to use the original split (1367/100) created by [13], specify `--cuhk03-classic-split`. As [13] computes CMC differently from Market1501 (a single-gallery-shot setting where only one gallery image per identity is kept), you might need to specify `--use-metric-cuhk03` for fair comparison with their method; a simplified sketch of this protocol follows this section. In addition, we support both `labeled` and `detected` modes. The default mode loads `detected` images. Specify `--cuhk03-labeled` if you want to train and test on `labeled` images.
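The sketch below illustrates the single-gallery-shot CMC in the spirit of [13]: per repetition, one randomly chosen gallery image per identity is kept and the resulting CMC curves are averaged. It is a simplified illustration only (it omits details of the real evaluation code, such as filtering gallery samples from the same camera as the query), and the function and argument names are assumptions, not this repo's API. It is meant to show why `--use-metric-cuhk03` yields different numbers from the Market1501-style CMC.

```python
import numpy as np

def single_gallery_shot_cmc(distmat, q_pids, g_pids, max_rank=20, num_repeats=10):
    """Simplified single-gallery-shot CMC: keep one gallery image per identity,
    compute the CMC curve, and average over several random samplings."""
    indices = np.argsort(distmat, axis=1)  # gallery indices ranked per query, best first
    cmc = np.zeros(max_rank)
    for _ in range(num_repeats):
        # randomly keep exactly one gallery image per identity for this repetition
        keep = np.zeros(len(g_pids), dtype=bool)
        for pid in np.unique(g_pids):
            keep[np.random.choice(np.flatnonzero(g_pids == pid))] = True
        for q_idx in range(len(q_pids)):
            ranked = indices[q_idx][keep[indices[q_idx]]][:max_rank]  # kept gallery, in ranked order
            hits = np.minimum(np.cumsum(g_pids[ranked] == q_pids[q_idx]), 1)
            cmc[:len(hits)] += hits
    return cmc / (num_repeats * len(q_pids))

# toy usage: 3 queries against 6 gallery images of identities 0, 1, 2
rng = np.random.default_rng(0)
distmat = rng.random((3, 6))
print(single_gallery_shot_cmc(distmat, np.array([0, 1, 2]),
                              np.array([0, 0, 1, 1, 2, 2]), max_rank=3))
```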
DukeMTMC-reID [16, 17]:
- The download process is automated; simply use `-d dukemtmcreid` when running the training code. The final folder structure looks like:

    dukemtmc-reid/
        DukeMTMC-reid.zip # (you can delete this zip file, it is ok)
        DukeMTMC-reid/
MSMT17 [22]:
- Create a directory named `msmt17/` under `data/`.
- Download dataset `MSMT17_V1.tar.gz` to `data/msmt17/` from http://www.pkuvmc.com/publications/msmt17.html. Extract the file under the same folder, so you will have:

    msmt17/
        MSMT17_V1.tar.gz # (do whatever you want with this .tar file)
        MSMT17_V1/
            train/
            test/
            list_train.txt
            ... (six .txt files in total)

- Use `-d msmt17` when running the training code.
VIPeR [28]:
- The code supports automatic download and formatting. Just use `-d viper` as usual. The final data structure would look like:

    viper/
        VIPeR/
        VIPeR.v1.0.zip # useless
        splits.json
GRID [29]:
- The code supports automatic download and formatting. Just use `-d grid` as usual. The final data structure would look like:

    grid/
        underground_reid/
        underground_reid.zip # useless
        splits.json
CUHK01 [30]:
- Create `cuhk01/` under `data/` or your custom data directory.
- Download `CUHK01.zip` from http://www.ee.cuhk.edu.hk/~xgwang/CUHK_identification.html and place it in `cuhk01/`.
- Use `-d cuhk01` when running the training code.
PRID450S [31]:
- The code supports automatic download and formatting. Just use `-d prid450s` as usual. The final data structure would look like:

    prid450s/
        cam_a/
        cam_b/
        readme.txt
        splits.json
SenseReID [32]:
- Create `sensereid/` under `data/` or your custom data directory.
- Download the dataset from this link and extract it to `sensereid/`. The final folder structure should look like:

    sensereid/
        SenseReID/
            test_probe/
            test_gallery/

- The command for using SenseReID is `-d sensereid`. Note that SenseReID is for testing purposes only, so training images are unavailable. Please use `--evaluate` along with `-d sensereid`.
MARS [8]:
- Create a directory named `mars/` under `data/`.
- Download dataset to `data/mars/` from http://www.liangzheng.com.cn/Project/project_mars.html.
- Extract `bbox_train.zip` and `bbox_test.zip`.
- Download split information from https://github.com/liangzheng06/MARS-evaluation/tree/master/info and put `info/` in `data/mars` (we want to follow the standard split in [8]). The data structure would look like:

    mars/
        bbox_test/
        bbox_train/
        info/

- Use `-d mars` when running the training code.
iLIDS-VID [11]:
- The code supports automatic download and formatting. Simply use `-d ilidsvid` when running the training code. The data structure would look like:

    ilids-vid/
        i-LIDS-VID/
        train-test people splits/
        splits.json
PRID [12]:
- Under `data/`, run `mkdir prid2011` to create a directory.
- Download the dataset from https://www.tugraz.at/institute/icg/research/team-bischof/lrs/downloads/PRID11/ and extract it under `data/prid2011`.
- Download the split created by iLIDS-VID from here, and put it in `data/prid2011/`. We follow [11] and use the 178 persons whose sequences are longer than a threshold, so that results on this dataset can be fairly compared with other approaches. The data structure would look like:

    prid2011/
        splits_prid2011.json
        prid_2011/
            multi_shot/
            single_shot/
            readme.txt

- Use `-d prid2011` when running the training code.
DukeMTMC-VideoReID [16, 23]:
- Use `-d dukemtmcvidreid` directly.
- If you want to download the dataset manually, get `DukeMTMC-VideoReID.zip` from https://github.com/Yu-Wu/DukeMTMC-VideoReID. Unzip the file to `data/dukemtmc-vidreid`. Ultimately, you need to have:

    dukemtmc-vidreid/
        DukeMTMC-VideoReID/
            train/ # essential
            query/ # essential
            gallery/ # essential
            ... (and license files)
These datasets are implemented in `dataset_loader.py`, where we have two main classes that subclass `torch.utils.data.Dataset`:
- `ImageDataset`: processes image-based person reid datasets.
- `VideoDataset`: processes video-based person reid datasets.

These two classes are used with `torch.utils.data.DataLoader`, which provides batched data. A data loader with `ImageDataset` outputs batches of shape (batch, channel, height, width), while a data loader with `VideoDataset` outputs batches of shape (batch, sequence, channel, height, width).
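As a rough sketch of how these classes are consumed, both are wrapped in a standard `DataLoader`. The constructor arguments, the item format (tuples of image path(s), person id, camera id), and the example file paths shown below are assumptions for illustration and may differ from the actual code in `dataset_loader.py`.

```python
from torch.utils.data import DataLoader
import torchvision.transforms as T

from dataset_loader import ImageDataset, VideoDataset  # provided by this repo

# Typical reid preprocessing; sizes and statistics here are illustrative only.
transform = T.Compose([
    T.Resize((256, 128)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Assumed intermediate formats; replace with the lists produced by the dataset code.
# Image reid: list of (img_path, pid, camid).
image_train = [('data/market1501/bounding_box_train/0002_c1s1_000451_03.jpg', 2, 0)]
# Video reid: list of (img_paths, pid, camid), where img_paths are the frames of one tracklet.
video_train = [(('data/mars/bbox_train/0001/0001C1T0001F001.jpg',
                 'data/mars/bbox_train/0001/0001C1T0001F002.jpg'), 1, 0)]

# Image reid: each item is a single frame, so a batch has shape
# (batch, channel, height, width).
img_loader = DataLoader(ImageDataset(image_train, transform=transform),
                        batch_size=32, shuffle=True)

# Video reid: each item is a tracklet of `seq_len` frames, so a batch has shape
# (batch, sequence, channel, height, width). `seq_len` and `sample` are assumed arguments.
vid_loader = DataLoader(VideoDataset(video_train, seq_len=15, sample='random', transform=transform),
                        batch_size=8, shuffle=True)
```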