- ICCV 2023: SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation
SPANet is a backbone network that balances the high- and low-frequency components of its features to obtain better feature representations.
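To make "high- and low-frequency components" concrete, here is a minimal, standard-library-only sketch that splits a 1D signal into two frequency bands with a discrete Fourier transform and recombines them with adjustable weights. It is an illustration only and not the SPANet token mixer: the function names (`split_bands`, `rebalance`), the hard `cutoff`, and the scalar weights `w_low` and `w_high` are all assumptions made for this example.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [
        sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_bands(x, cutoff):
    """Split a real signal into low- and high-frequency parts.

    Frequencies whose index (counted symmetrically from 0) is <= cutoff
    go to the low band; everything else goes to the high band.
    """
    n = len(x)
    spec = dft(x)
    # Symmetric mask so that both bands stay real-valued after the inverse.
    low_spec = [spec[k] if min(k, n - k) <= cutoff else 0 for k in range(n)]
    high_spec = [spec[k] - low_spec[k] for k in range(n)]
    low = [v.real for v in idft(low_spec)]
    high = [v.real for v in idft(high_spec)]
    return low, high


def rebalance(x, cutoff, w_low, w_high):
    """Recombine the two bands with hypothetical scalar weights."""
    low, high = split_bands(x, cutoff)
    return [w_low * l + w_high * h for l, h in zip(low, high)]


# A slow sine (frequency 1) plus a weaker fast sine (frequency 6).
signal = [
    math.sin(2 * math.pi * t / 16) + 0.5 * math.sin(2 * math.pi * 6 * t / 16)
    for t in range(16)
]
low, high = split_bands(signal, cutoff=2)
boosted = rebalance(signal, cutoff=2, w_low=1.0, w_high=2.0)
```

With `w_low = w_high = 1` the original signal is recovered; other weights shift the balance between the smooth and the fine-detail parts of the signal.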
Please see image_classification for more details.
Model | Pretrain | Resolution | Top-1 | #Param. | FLOPs |
---|---|---|---|---|---|
SPANet-S | ImageNet-1K | 224x224 | 83.1 | 28.7M | 4.6G |
SPANet-M | ImageNet-1K | 224x224 | 83.5 | 41.8M | 6.8G |
SPANet-MX | ImageNet-1K | 224x224 | 83.8 | 54.9M | 9.0G |
SPANet-B | ImageNet-1K | 224x224 | 84.0 | 75.9M | 12.0G |
SPANet-BX | ImageNet-1K | 224x224 | 84.4 | 99.8M | 15.8G |
Please see object_detection for more details.
Backbone | Lr Schd | box mAP | #params |
---|---|---|---|
SPANet-S | 1x | 43.3 | 38M |
SPANet-M | 1x | 44.0 | 51M |
Backbone | Lr Schd | box mAP | mask mAP | #params |
---|---|---|---|---|
SPANet-S | 1x | 44.7 | 40.6 | 48M |
SPANet-M | 1x | 45.2 | 41.0 | 61M |
Please see semantic_segmentation for more details.
Backbone | Lr Schd | mIoU | #params | FLOPs |
---|---|---|---|---|
SPANet-S | 80K | 45.4 | 32M | 46G |
SPANet-M | 80K | 46.2 | 45M | 57G |
If you find this repository useful, please consider giving it a star and citing our paper with the following BibTeX entry.
@inproceedings{yun2023spanet,
title={SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation},
author={Yun, Guhnoo and Yoo, Juhan and Kim, Kijung and Lee, Jeongho and Kim, Dong Hwan},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={6113--6124},
year={2023}
}
This project is released under the MIT license. Please see the LICENSE file for more information.