This repository builds on VARS and VideoChat2.
Our method, VF-VARS, is a model developed for the SoccerNet challenges of the 2024 CVPR CVsports workshop.
❗️ update
- Our model achieved 2nd place out of 17 teams on the official leaderboards.
- We are contributing to the SoccerNet challenge paper, which will be published soon.
E represents the VideoChat2 encoder module, A denotes the aggregation module, and C is the classification head.
See our technical report for details.
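As a rough illustration of how the three modules compose, here is a minimal sketch in plain Python. It is not the repository's implementation: `encode`, `aggregate`, and `classify` are hypothetical stand-ins for E, A, and C, the inputs are assumed to be several clips of the same action, and element-wise max is only one possible aggregation choice.

```python
from typing import List, Sequence

Vector = List[float]


def encode(clip: Sequence[float]) -> Vector:
    """Stand-in for E: map one clip to a small feature vector (mean, max)."""
    return [sum(clip) / len(clip), max(clip)]


def aggregate(features: Sequence[Vector]) -> Vector:
    """Stand-in for A: element-wise max over the per-clip feature vectors."""
    return [max(column) for column in zip(*features)]


def classify(feature: Vector, weights: Sequence[Vector]) -> int:
    """Stand-in for C: one linear score per class, then argmax."""
    scores = [sum(w * x for w, x in zip(row, feature)) for row in weights]
    return scores.index(max(scores))


def predict(clips: Sequence[Sequence[float]], weights: Sequence[Vector]) -> int:
    """Compose the stages: C(A(E(clip_1), ..., E(clip_n)))."""
    return classify(aggregate([encode(clip) for clip in clips]), weights)
```

The point of the sketch is only the data flow: each input is encoded independently, the per-input features are merged into one vector, and a single head produces the prediction.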
The experiments were conducted with CUDA 11.7.
conda create -n snMLV python=3.9
conda activate snMLV
pip install -r requirements.txt
pip install soccernet
Download a pretrained checkpoint file from our drive.
Then place the file in the checkpoints/ directory.
python main.py \
--path path/to/dataset \
--model_name your_model_name \
--start_frame 67 \
--end_frame 83 \
--path_to_model_weight path/to/your/checkpoint \
--only_evaluation type \
--multi_gpu
python main.py \
--path path/to/dataset \
--model_name your_model_name \
--start_frame 67 \
--end_frame 83 \
--path_to_model_weight path/to/your/checkpoint \
--model_to_store path/to/store \
--multi_gpu
python main.py \
--path path/to/dataset \
--model_name your_model_name \
--start_frame 67 \
--end_frame 83 \
--model_to_store path/to/store \
--multi_gpu
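The three invocations above differ only in which of `--path_to_model_weight`, `--model_to_store`, and `--only_evaluation` they pass. If you launch runs from a script, a small helper along these lines can assemble the argument list. This is a sketch: `build_command` is a hypothetical name and not part of this repository, and it only arranges the flags shown above without checking how `main.py` interprets them.

```python
import shlex
from typing import List, Optional


def build_command(
    dataset: str,
    model_name: str,
    weights: Optional[str] = None,          # value for --path_to_model_weight
    store: Optional[str] = None,            # value for --model_to_store
    only_evaluation: Optional[str] = None,  # value for --only_evaluation
    start_frame: int = 67,
    end_frame: int = 83,
    multi_gpu: bool = True,
) -> List[str]:
    """Assemble a main.py command line from the flags used in this README."""
    cmd = [
        "python", "main.py",
        "--path", dataset,
        "--model_name", model_name,
        "--start_frame", str(start_frame),
        "--end_frame", str(end_frame),
    ]
    # Optional flags are appended only when a value is supplied.
    if weights is not None:
        cmd += ["--path_to_model_weight", weights]
    if store is not None:
        cmd += ["--model_to_store", store]
    if only_evaluation is not None:
        cmd += ["--only_evaluation", only_evaluation]
    if multi_gpu:
        cmd.append("--multi_gpu")
    return cmd


# Example: the invocation that passes only --model_to_store.
print(shlex.join(build_command("path/to/dataset", "your_model_name",
                               store="path/to/store")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues for paths containing spaces.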
If you want to train the model from scratch, place the VideoChat2 stage3 weight at videochat2/checkpoints/videochat2/videochat2_mistral_7b_stage3.pth. It can be downloaded from VideoChat2.
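Before starting a long run, it can help to confirm the weight file is where training expects it. The check below is a minimal sketch using only the path given above; `check_stage3_weight` is a hypothetical helper, not something the repository provides, and it assumes you run it from the repository root unless `repo_root` is set.

```python
from pathlib import Path

# Relative location of the VideoChat2 stage3 weight, as stated in this README.
STAGE3_WEIGHT = Path(
    "videochat2/checkpoints/videochat2/videochat2_mistral_7b_stage3.pth"
)


def check_stage3_weight(repo_root: str = ".") -> Path:
    """Return the full weight path, or raise if the file is missing."""
    path = Path(repo_root) / STAGE3_WEIGHT
    if not path.is_file():
        raise FileNotFoundError(f"VideoChat2 stage3 weight not found at {path}")
    return path
```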
Because of time constraints, we could not train thoroughly across different settings, so our results may not be optimal. We recommend experimenting with different training configurations.