SPANet Official (ongoing)

💬 This repo is the official implementation of:

  • ICCV2023: SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation

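🤖 It currently includes code and models for the following tasks:

  • Image classification (see image_classification)
  • Object detection and instance segmentation (see object_detection)
  • Semantic segmentation (see semantic_segmentation)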

📖 Introduction

SPANet is a backbone network designed to balance the high- and low-frequency components of its features, with the goal of learning better feature representations.
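
To make "balancing high- and low-frequency components" concrete, here is a minimal NumPy sketch. It is not the SPANet implementation and is not taken from this repository. It only illustrates the general idea of splitting a 2D feature map into low- and high-frequency parts with an ideal low-pass mask in the Fourier domain, then recombining them with adjustable weights. The function names and the `radius`, `low_weight`, and `high_weight` parameters are hypothetical and chosen purely for illustration.

```python
import numpy as np


def split_frequencies(x: np.ndarray, radius: float) -> tuple[np.ndarray, np.ndarray]:
    """Split a 2D map into low- and high-frequency parts.

    `radius` is an illustrative cutoff in cycles per sample; bins whose
    frequency magnitude is at most `radius` count as low frequency.
    """
    h, w = x.shape
    # Frequency coordinate of every FFT bin, in cycles per sample.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    low_pass = np.sqrt(fy**2 + fx**2) <= radius

    spectrum = np.fft.fft2(x)
    # The mask is symmetric, so the inverse transform of a real input is
    # real up to round-off; `.real` discards the residual imaginary part.
    low = np.fft.ifft2(spectrum * low_pass).real
    high = x - low
    return low, high


def rebalance(x: np.ndarray, radius: float = 0.25,
              low_weight: float = 1.0, high_weight: float = 1.0) -> np.ndarray:
    """Recombine the two bands with separate (hypothetical) weights."""
    low, high = split_frequencies(x, radius)
    return low_weight * low + high_weight * high
```

With both weights set to 1 the input is returned unchanged. Raising `high_weight` emphasizes edges and fine texture, while raising `low_weight` emphasizes smooth, global structure.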

Main results on ImageNet-1K

Please see image_classification for more details.

| Model | Pretrain | Resolution | Top-1 (%) | #Params | FLOPs |
| --- | --- | --- | --- | --- | --- |
| SPANet-S | ImageNet-1K | 224x224 | 83.1 | 28.7M | 4.6G |
| SPANet-M | ImageNet-1K | 224x224 | 83.5 | 41.8M | 6.8G |
| SPANet-MX | ImageNet-1K | 224x224 | 83.8 | 54.9M | 9.0G |
| SPANet-B | ImageNet-1K | 224x224 | 84.0 | 75.9M | 12.0G |
| SPANet-BX | ImageNet-1K | 224x224 | 84.4 | 99.8M | 15.8G |

Main results on COCO object detection and instance segmentation

Please see object_detection for more details.

RetinaNet 1x

| Backbone | Lr Schd | box mAP | #Params |
| --- | --- | --- | --- |
| SPANet-S | 1x | 43.3 | 38M |
| SPANet-M | 1x | 44.0 | 51M |

Mask R-CNN 1x

| Backbone | Lr Schd | box mAP | mask mAP | #Params |
| --- | --- | --- | --- | --- |
| SPANet-S | 1x | 44.7 | 40.6 | 48M |
| SPANet-M | 1x | 45.2 | 41.0 | 61M |

Main results on ADE20K semantic segmentation

Please see semantic_segmentation for more details.

Semantic FPN

| Backbone | Lr Schd | mIoU | #Params | FLOPs |
| --- | --- | --- | --- | --- |
| SPANet-S | 80K | 45.4 | 32M | 46G |
| SPANet-M | 80K | 46.2 | 45M | 57G |

⭐ Cite SPANet

If you find this repository useful, please star it and cite the paper using the following BibTeX entry.

@inproceedings{yun2023spanet,
  title={SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation},
  author={Yun, Guhnoo and Yoo, Juhan and Kim, Kijung and Lee, Jeongho and Kim, Dong Hwan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={6113--6124},
  year={2023}
}

License

This project is released under the MIT license. Please see the LICENSE file for more information.