This is the official repository for the AAAI 2024 paper "Deep Homography Estimation for Visual Place Recognition". [AAAI proceedings] [arXiv] The arXiv version is more complete.
Another of our two-stage VPR works, SelaVPR, achieves SOTA performance on several datasets. Its code has been released HERE.
This repo follows the Visual Geo-localization Benchmark. You can refer to it (and to VPR-datasets-downloader) to prepare the datasets and train the CCT-14 backbone (i.e., the feature extractor).
The dataset should be organized in a directory tree as such:
├── datasets_vg
    └── datasets
        └── pitts30k
            └── images
                ├── train
                │   ├── database
                │   └── queries
                ├── val
                │   ├── database
                │   └── queries
                └── test
                    ├── database
                    └── queries
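If you want to sanity-check this layout before training, below is a minimal sketch (not part of this repo); the dataset name, image extension, and paths are illustrative and should be adjusted to your setup.

```python
# Hypothetical helper to verify the expected dataset layout (illustrative only).
from pathlib import Path

def check_dataset_layout(datasets_folder: str, dataset_name: str = "pitts30k") -> None:
    """Check that <datasets_folder>/<dataset_name>/images contains the expected splits."""
    root = Path(datasets_folder) / dataset_name / "images"
    for split in ("train", "val", "test"):
        for sub in ("database", "queries"):
            folder = root / split / sub
            # The image extension may differ depending on how the dataset was formatted.
            n_images = len(list(folder.glob("*.jpg"))) if folder.is_dir() else 0
            status = "OK" if n_images > 0 else "MISSING/EMPTY"
            print(f"{status:14s} {folder} ({n_images} .jpg files)")

if __name__ == "__main__":
    # Point this at the same folder you pass via --datasets_folder.
    check_dataset_layout("/path/to/your/datasets_vg/datasets", "pitts30k")
```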
You can directly download the trained CCT-14 backbone:
trained on MSLS: CCT14_msls
trained on Pitts30k: CCT14_pitts30k
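The training and finetuning scripts load these weights for you via --resume_fe, so you normally do not need to load them yourself. If you just want to peek inside a downloaded checkpoint, a rough sketch is shown below; the exact key layout (raw weights vs. a wrapped state dict) is an assumption.

```python
# Rough sketch for inspecting a downloaded backbone checkpoint (key layout is assumed).
import torch

state = torch.load("/path/to/your/CCT14_msls.pth", map_location="cpu")
# Some checkpoints store the weights directly, others wrap them (e.g. under "state_dict").
state_dict = state.get("state_dict", state) if isinstance(state, dict) else state
print(f"{len(state_dict)} entries")
for name, value in list(state_dict.items())[:5]:
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
    print(name, shape)
```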
After getting the CCT14 backbone trained on MSLS (CCT14_msls.pth), you can train (i.e. initialize) the DHE network on MSLS:
python train_dhe.py --resume_fe=/path/to/your/CCT14_msls.pth --datasets_folder=/path/to/your/datasets_vg/datasets --dataset_name=msls
You can directly download the initialized DHE network HERE.
To jointly finetune the backbone and the DHE network on the MSLS dataset, please run:
python3 finetune.py --datasets_folder=/path/to/your/datasets_vg/datasets --dataset_name=msls --epochs_num=2 --resume_fe=/path/to/your/CCT14_msls.pth --resume_hr=/path/to/your/initializedDHE.torch --queries_per_epoch=10000
To finetune on the Pitts30k dataset, please run:
python3 finetune.py --datasets_folder=/path/to/your/datasets_vg/datasets --dataset_name=pitts30k --epochs_num=40 --resume_fe=/path/to/your/CCT14_pitts30k.pth --resume_hr=/path/to/your/initializedDHE.torch
You can directly download the finetuned CCT14 backbone and DHE network:
MSLS: finetunedCCT14 | finetunedDHE
Pitts30k: finetunedCCT14 | finetunedDHE
To evaluate the complete finetuned DHE-VPR model on MSLS (or Pitts30k), run:
python eval.py --resume_fe=/path/to/your/finetunedCCT14_msls.torch --resume_hr=/path/to/your/finetunedDHE_msls.torch --datasets_folder=/path/to/your/datasets_vg/datasets --dataset_name=msls
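eval.py reports the standard place-recognition metrics; for reference, the snippet below illustrates how Recall@N is typically computed in VPR. This is a generic sketch, not code from this repo, and the variable names are hypothetical.

```python
# Generic illustration of Recall@N for VPR (not from this repo).
# predictions[i]: database indices retrieved for query i, best first.
# positives[i]:   database indices within the ground-truth distance threshold of query i.
import numpy as np

def recall_at_n(predictions, positives, n_values=(1, 5, 10)):
    recalls = {}
    for n in n_values:
        hits = sum(
            1 for preds, pos in zip(predictions, positives)
            if len(np.intersect1d(preds[:n], pos)) > 0
        )
        recalls[n] = 100.0 * hits / len(predictions)
    return recalls

# Toy example: 2 queries with their top-3 retrieved database indices.
print(recall_at_n([[3, 7, 1], [5, 2, 9]], [[7], [0]]))  # {1: 0.0, 5: 50.0, 10: 50.0}
```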
Parts of this repo are inspired by the following repositories:
Visual Geo-localization Benchmark
If you find this repo useful for your research, please consider citing the paper:
@inproceedings{dhevpr,
  title={Deep Homography Estimation for Visual Place Recognition},
  author={Lu, Feng and Dong, Shuting and Zhang, Lijun and Liu, Bingxi and Lan, Xiangyuan and Jiang, Dongmei and Yuan, Chun},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2024},
  volume={38},
  number={9},
  pages={10341--10349}
}