Instance segmentation has witnessed remarkable progress on class-balanced benchmarks. However, existing methods fail to perform as accurately in real-world scenarios, where the category distribution of objects naturally comes with a long tail. Instances of head classes dominate a long-tailed dataset and they serve as negative samples of tail categories. The overwhelming gradients of negative samples on tail classes lead to a biased learning process for classifiers. Consequently, objects of tail categories are more likely to be misclassified as background or head categories. To tackle this problem, we propose Seesaw Loss to dynamically re-balance gradients of positive and negative samples for each category, with two complementary factors, i.e., mitigation factor and compensation factor. The mitigation factor reduces punishments to tail categories w.r.t. the ratio of cumulative training instances between different categories. Meanwhile, the compensation factor increases the penalty of misclassified instances to avoid false positives of tail categories. We conduct extensive experiments on Seesaw Loss with mainstream frameworks and different data sampling strategies. With a simple end-to-end training pipeline, Seesaw Loss obtains significant gains over Cross-Entropy Loss, and achieves state-of-the-art performance on the LVIS dataset without bells and whistles.
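The two factors described above can be illustrated with a small per-sample sketch in plain Python. It is not the MMDetection implementation. It assumes the formulation usually given for Seesaw Loss: for a sample of class i, each negative class j has its softmax term scaled by a mitigation factor (N_j / N_i)^p when class i has more accumulated instances than class j, and by a compensation factor (σ_j / σ_i)^q when class j is currently scored higher than class i. The function name `seesaw_loss`, the exponent defaults `p=0.8` and `q=2.0`, and the `eps` clamp are assumptions made for this example.

```python
import math


def seesaw_loss(logits, label, class_counts, p=0.8, q=2.0, eps=1e-12):
    """Seesaw-style loss for a single sample (illustrative sketch only).

    logits       -- raw class scores z_j
    label        -- index i of the ground-truth class
    class_counts -- cumulative number of training instances N_j per class
    p, q         -- exponents of the mitigation / compensation factors (assumed)
    """
    i = label
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]  # plain softmax, used for compensation

    n_i = max(class_counts[i], 1)
    p_i = max(probs[i], eps)
    denom = exps[i]
    for j, e_j in enumerate(exps):
        if j == i:
            continue
        n_j = max(class_counts[j], 1)
        # Mitigation: shrink the penalty on class j when it is rarer than class i.
        mitigation = (n_j / n_i) ** p if n_i > n_j else 1.0
        # Compensation: grow the penalty on class j when it outscores the true class.
        compensation = (probs[j] / p_i) ** q if probs[j] > p_i else 1.0
        denom += mitigation * compensation * e_j

    # Equals -log(exp(z_i) / denom), written to avoid log(0).
    return math.log(denom) - (logits[i] - shift)
```

With equal class counts and a correctly ranked sample, both factors are 1 and the value reduces to ordinary cross-entropy. A head-class sample penalizes rarer negative classes less, and a sample that is currently misclassified penalizes the wrongly favored class more.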
We provide config files to reproduce the instance segmentation performance reported in the CVPR 2021 paper "Seesaw Loss for Long-Tailed Instance Segmentation".
@inproceedings{wang2021seesaw,
title={Seesaw Loss for Long-Tailed Instance Segmentation},
author={Jiaqi Wang and Wenwei Zhang and Yuhang Zang and Yuhang Cao and Jiangmiao Pang and Tao Gong and Kai Chen and Ziwei Liu and Chen Change Loy and Dahua Lin},
booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
year={2021}
}
- Please set up the LVIS dataset for MMDetection.
- RFS (repeat factor sampling) indicates that the oversampling strategy is used, with an oversample threshold of 1e-3.
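As a rough illustration of what the oversample threshold controls, the sketch below computes per-image repeat factors in the way repeat factor sampling is usually described: a category c that appears in a fraction f(c) of the images gets the factor max(1, sqrt(t / f(c))), and an image takes the largest factor among the categories it contains. The function name `repeat_factors` and its input format are hypothetical and are not the MMDetection API. How a fractional factor becomes an integer number of repeats is left out.

```python
import math
from collections import Counter


def repeat_factors(image_categories, oversample_thr=1e-3):
    """Per-image repeat factors (illustrative sketch only).

    image_categories -- one collection of category ids per image
    oversample_thr   -- threshold t; categories with f(c) < t get oversampled
    """
    num_images = len(image_categories)

    # Count, for each category, how many images contain it at least once.
    image_count = Counter()
    for cats in image_categories:
        image_count.update(set(cats))

    # Category-level factor: max(1, sqrt(t / f(c))) with f(c) = count / num_images.
    category_repeat = {
        c: max(1.0, math.sqrt(oversample_thr / (n / num_images)))
        for c, n in image_count.items()
    }

    # Image-level factor: the largest factor among the image's categories.
    # Images without any category keep a factor of 1.
    return [
        max((category_repeat[c] for c in cats), default=1.0)
        for cats in image_categories
    ]
```

For example, with 10,000 images where one category appears in a single image, that image gets a factor of about sqrt(10), while images containing only frequent categories stay at 1.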
Method | Backbone | Style | Lr schd | Data Sampler | Norm Mask | box AP | mask AP | Config | Download |
---|---|---|---|---|---|---|---|---|---|
Mask R-CNN | R-50-FPN | pytorch | 2x | random | N | 25.6 | 25.0 | config | model \| log |
Mask R-CNN | R-50-FPN | pytorch | 2x | random | Y | 25.6 | 25.4 | config | model \| log |
Mask R-CNN | R-101-FPN | pytorch | 2x | random | N | 27.4 | 26.7 | config | model \| log |
Mask R-CNN | R-101-FPN | pytorch | 2x | random | Y | 27.2 | 27.3 | config | model \| log |
Mask R-CNN | R-50-FPN | pytorch | 2x | RFS | N | 27.6 | 26.4 | config | model \| log |
Mask R-CNN | R-50-FPN | pytorch | 2x | RFS | Y | 27.6 | 26.8 | config | model \| log |
Mask R-CNN | R-101-FPN | pytorch | 2x | RFS | N | 28.9 | 27.6 | config | model \| log |
Mask R-CNN | R-101-FPN | pytorch | 2x | RFS | Y | 28.9 | 28.2 | config | model \| log |
Cascade Mask R-CNN | R-101-FPN | pytorch | 2x | random | N | 33.1 | 29.2 | config | model \| log |
Cascade Mask R-CNN | R-101-FPN | pytorch | 2x | random | Y | 33.0 | 30.0 | config | model \| log |
Cascade Mask R-CNN | R-101-FPN | pytorch | 2x | RFS | N | 30.0 | 29.3 | config | model \| log |
Cascade Mask R-CNN | R-101-FPN | pytorch | 2x | RFS | Y | 32.8 | 30.1 | config | model \| log |