Our semantic segmentation implementation is based on MMSegmentation v0.19.0 and PVT segmentation. We thank the authors for their excellent work.
For details, see the paper "MetaFormer is Actually What You Need for Vision".
Please note that we simply follow the hyper-parameters of PVT, which may not be optimal for PoolFormer. Feel free to tune them to get better performance.
@article{yu2021metaformer,
title={MetaFormer is Actually What You Need for Vision},
author={Yu, Weihao and Luo, Mi and Zhou, Pan and Si, Chenyang and Zhou, Yichen and Wang, Xinchao and Feng, Jiashi and Yan, Shuicheng},
journal={arXiv preprint arXiv:2111.11418},
year={2021}
}
Install MMSegmentation v0.19.0. Dockerfile_mmdetseg is the Dockerfile that I use to set up the environment for detection and segmentation; you can also refer to it.
Prepare ADE20K according to the guidelines in MMSegmentation.
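Before launching a run, it can help to confirm that the dataset landed where the configs expect it. The sketch below assumes the default layout described in the MMSegmentation dataset guide (`data/ade/ADEChallengeData2016` containing `images` and `annotations`, each split into `training` and `validation`). The function name `missing_ade20k_dirs` is illustrative only, and the root path should be adjusted if your config overrides `data_root`.

```python
from pathlib import Path

# Sub-directories expected under the ADE20K root, assuming the default
# MMSegmentation layout; adjust if your config uses a different data_root.
EXPECTED_SUBDIRS = (
    "images/training",
    "images/validation",
    "annotations/training",
    "annotations/validation",
)


def missing_ade20k_dirs(root="data/ade/ADEChallengeData2016"):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_ade20k_dirs()
    if missing:
        print("Missing ADE20K folders:", ", ".join(missing))
    else:
        print("ADE20K layout looks complete.")
```

An empty return value means all four folders were found; it does not verify that the images and annotation maps inside them are complete.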
Method | Backbone | Pretrain | Iters | mIoU | Config | Download |
---|---|---|---|---|---|---|
Semantic FPN | PoolFormer-S12 | ImageNet-1K | 40K | 37.2 | config | log & model |
Semantic FPN | PoolFormer-S24 | ImageNet-1K | 40K | 40.3 | config | log & model |
Semantic FPN | PoolFormer-S36 | ImageNet-1K | 40K | 42.0 | config | log & model |
Semantic FPN | PoolFormer-M36 | ImageNet-1K | 40K | 42.4 | config | log & model |
Semantic FPN | PoolFormer-M48 | ImageNet-1K | 40K | 42.7 | config | log & model |
All the models can also be downloaded from Baidu Yun (password: esac).
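For readers unfamiliar with the mIoU column in the table above: it is the per-class intersection-over-union averaged over classes, reported as a percentage. The following is a simplified pure-Python illustration of that definition, not MMSegmentation's own implementation. The function name `mean_iou` and the `ignore_index=255` default are assumptions made for this sketch.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Compute mIoU from flat sequences of predicted and ground-truth labels."""
    intersect = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersect[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersect[c]
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersect[c] / union)
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Toy example with three classes; the last pixel is unlabeled (255).
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(round(100 * mean_iou(pred, target, num_classes=3), 1))  # 72.2
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, giving a mean of 13/18, or about 72.2 when scaled to a percentage like the table values.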
To evaluate PoolFormer-S12 + Semantic FPN on a single node with 8 GPUs, run:
dist_test.sh configs/sem_fpn/PoolFormer/fpn_poolformer_s12_ade20k_40k.py /path/to/checkpoint_file 8 --out results.pkl --eval mIoU
To train PoolFormer-S12 + Semantic FPN on a single node with 8 GPUs, run:
dist_train.sh configs/sem_fpn/PoolFormer/fpn_poolformer_s12_ade20k_40k.py 8
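When running several backbones or GPU counts, a small helper can assemble these command lines consistently. The sketch below only builds and prints the commands; it does not launch them. The helper name `build_command` is hypothetical, and reusing the S12 config naming pattern for the other backbones in the results table is an assumption, so check that the config file exists before running the printed command.

```python
import shlex

# Config naming pattern taken from the S12 commands above; applying it to the
# other backbones in the results table is an assumption, so verify the path.
CONFIG_TEMPLATE = "configs/sem_fpn/PoolFormer/fpn_poolformer_{variant}_ade20k_40k.py"


def build_command(mode, variant="s12", gpus=8, checkpoint=None):
    """Build the dist_train.sh / dist_test.sh argument list for one run."""
    config = CONFIG_TEMPLATE.format(variant=variant)
    if mode == "train":
        return ["dist_train.sh", config, str(gpus)]
    if mode == "test":
        if checkpoint is None:
            raise ValueError("evaluation needs a checkpoint path")
        return ["dist_test.sh", config, checkpoint, str(gpus),
                "--out", "results.pkl", "--eval", "mIoU"]
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run in a shell.
    print(shlex.join(build_command("train")))
    print(shlex.join(build_command("test", checkpoint="/path/to/checkpoint_file")))
```

With the default arguments, the two printed lines match the S12 training and evaluation commands given above.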