About Adaptive Layer #76

minhhotboy9x · 2023-11-23T03:16:10Z

I have some questions about adaptive layers when training KD.

When you combined your KD method with other intermediate feature map KD methods, you had to use adaptive layers to upscale the student feature maps. Were these adaptive layers trained together with the student, or were they frozen? I have read many papers and found nothing written about this.
These adaptive layers may sometimes distort the student's output feature map, and they don't contribute to the student's inference. So why do adaptive layers make KD training effective? I would expect them to decrease the mAP.
Could you please explain? Thank you very much.

HikariTJU · 2023-11-23T07:55:22Z

An adaptive layer is used when the student feature map and the teacher feature map don't match.
Many KD papers use the FPN as the learning target, and FPN layers mostly have feature maps of the same shape, so no adaptive layer is needed (including in ours). That's why we don't mention it.

minhhotboy9x · 2023-11-23T13:18:01Z

Oh, I see. In my work, I have to use adaptive layers because the student and teacher have different numbers of channels, and I think that makes the student's mAP drop slightly.
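
For readers with the same channel mismatch, here is a minimal PyTorch sketch of one common way to set up an adaptive layer. It is not the repository's code, and the thread does not say whether the authors train or freeze such layers. The sketch assumes the adapter is a 1x1 convolution trained jointly with the student and dropped at inference. `ChannelAdapter`, `feature_kd_loss`, the stand-in `student` module, and all channel counts are hypothetical names and values chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelAdapter(nn.Module):
    """Hypothetical adaptive layer: a 1x1 conv that maps student channels to teacher channels."""

    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        self.proj = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


def feature_kd_loss(student_feat, teacher_feat, adapter):
    """MSE between the adapted student feature map and the (detached) teacher feature map."""
    projected = adapter(student_feat)
    # Resize only if the spatial sizes differ; channels are already matched by the adapter.
    if projected.shape[-2:] != teacher_feat.shape[-2:]:
        projected = F.interpolate(
            projected, size=teacher_feat.shape[-2:], mode="bilinear", align_corners=False
        )
    return F.mse_loss(projected, teacher_feat.detach())


torch.manual_seed(0)

# Stand-in for a student backbone stage (assumption: 128 output channels).
student = nn.Conv2d(3, 128, kernel_size=3, padding=1)
adapter = ChannelAdapter(student_channels=128, teacher_channels=256)

# Assumption: the adapter's parameters are optimized together with the student's.
optimizer = torch.optim.SGD(
    list(student.parameters()) + list(adapter.parameters()), lr=0.01
)

images = torch.randn(2, 3, 20, 20)
teacher_feat = torch.randn(2, 256, 20, 20)  # placeholder for a frozen teacher's output

student_feat = student(images)
loss = feature_kd_loss(student_feat, teacher_feat, adapter)

optimizer.zero_grad()
loss.backward()
optimizer.step()

# At inference only `student` is used; `adapter` is discarded.
```

In this setup the distillation loss is computed on the adapted features, so the student's own feature map is not forced to equal the teacher's directly. Whether this helps or hurts mAP in a given setup is an empirical question that the thread leaves open.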
