Source code for the ICCV-2019 paper "Co-segmentation Inspired Attention Networks for Video-based Person Re-identification". Our paper can be found here.
Our work tackles some of the key challenges in video-based person re-identification (Re-ID), such as background clutter, misalignment error, and partial occlusion, by means of a co-segmentation-inspired approach. The intention is to attend to the task-dependent common portions across the images (i.e., the video frames of a person), which helps the network focus on the most relevant features. This repository contains code for the co-segmentation-inspired Re-ID architecture built around the "Co-segmentation Activation Module (COSAM)". The co-segmentation masks are interpretable and help in understanding how and where the network attends while creating a description of a person.
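To give a flavor of the idea, below is a minimal, hypothetical PyTorch sketch of a co-segmentation-style attention block: each spatial location of a frame's feature map is re-weighted by its correlation with a descriptor pooled from the other frames of the same tracklet, so regions common across frames (typically the person) are emphasized. The module name, the 1x1 projection, and the key_dim parameter are illustrative assumptions, not the exact COSAM implementation in this repository.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CoSegAttentionSketch(nn.Module):
    def __init__(self, in_channels, key_dim=128):
        super().__init__()
        # hypothetical 1x1 projection used only for this sketch
        self.key = nn.Conv2d(in_channels, key_dim, kernel_size=1)

    def forward(self, feats):
        # feats: (B, T, C, H, W) CNN features for T frames of one tracklet
        B, T, C, H, W = feats.shape
        keys = self.key(feats.flatten(0, 1))                  # (B*T, K, H, W)
        K = keys.size(1)
        keys = F.normalize(keys.view(B, T, K, H * W), dim=2)  # unit-norm per location

        # per-frame descriptor, then the mean descriptor of the *other* frames
        frame_desc = keys.mean(dim=3)                                            # (B, T, K)
        others = (frame_desc.sum(dim=1, keepdim=True) - frame_desc) / max(T - 1, 1)

        # correlate every spatial location with the cross-frame descriptor
        corr = torch.einsum('btkn,btk->btn', keys, others)    # (B, T, H*W)
        masks = torch.sigmoid(corr).view(B, T, 1, H, W)

        # re-weight the original features with the co-segmentation-style mask
        return feats * masks

In the actual codebase, the architecture name resnet50_cosam45_tp in the commands below suggests that such attention modules are inserted after intermediate ResNet-50 blocks; treat this sketch only as an illustration of the general mechanism.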
The source code is built upon the GitHub repositories Video-Person-ReID (from jiyanggao) and deep-person-reid (from KaiyangZhou). In particular, the data loading, data sampling, and training code are borrowed from their repositories. The strong baseline performances are based on the models from the Video-Person-ReID codebase. Check out their papers: Revisiting Temporal Modeling for Video-based Person ReID (Gao et al.) and OSNet (Zhou et al., ICCV 2019).
We would like to thank jiyanggao and KaiyangZhou for generously releasing their code to the community.
Dataset preparation instructions can be found in the Video-Person-ReID and deep-person-reid repositories. For completeness, the dataset instructions are also compiled here.
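As a rough guide (assuming the layout expected by the Video-Person-ReID data managers), the MARS dataset is typically organized as shown below; please verify the exact structure against the linked instructions.

mars/
    bbox_train/
    bbox_test/
    info/
        train_name.txt
        test_name.txt
        tracks_train_info.mat
        tracks_test_info.mat
        query_IDX.mat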
To train the COSAM model, run:

python main_video_person_reid.py -a resnet50_cosam45_tp -d <dataset> --gpu-devices <gpu_index>
where <dataset> can be mars or dukemtmcvidreid.
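For example, a concrete invocation to train on MARS using GPU 0 (the GPU index here is just an example value) would be:

python main_video_person_reid.py -a resnet50_cosam45_tp -d mars --gpu-devices 0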
To evaluate a trained model, run:

python main_video_person_reid.py -a resnet50_cosam45_tp -d <dataset> --gpu-devices <gpu_index> --evaluate --pretrained-model <model_path>
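For example, assuming a checkpoint saved at a hypothetical path log/mars_cosam45_best.pth.tar, evaluation on MARS with GPU 0 would look like:

python main_video_person_reid.py -a resnet50_cosam45_tp -d mars --gpu-devices 0 --evaluate --pretrained-model log/mars_cosam45_best.pth.tar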