About Adaptive Layer #76

Open
minhhotboy9x opened this issue Nov 23, 2023 · 2 comments
Comments

@minhhotboy9x

I have some questions about adaptive layers in KD training.

  1. When you combined your KD method with other intermediate feature-map KD methods, you had to use adaptive layers to upscale the student feature maps. Were these adaptive layers trained together with the student, or were they frozen? I've read many papers and none of them address this.
  2. These adaptive layers may distort the student's output feature map, and they play no part in the student's inference. So why do adaptive layers make KD training effective? I would expect them to lower the mAP.

Could you explain, please? Thank you very much.
@HikariTJU
Owner

An adaptive layer is used when the student feature map and the teacher feature map don't match in shape.
Many KD papers use the FPN as the learning target, and the student's and teacher's FPN layers mostly have feature maps of the same shape, so no adaptive layer is needed (ours included). That's why we don't mention it.
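
To make the "doesn't match" condition concrete, here is a minimal sketch of a shape check. The helper names, the `(channels, height, width)` convention, and the example shapes are all assumptions made for illustration; none of them come from this repository.

```python
from typing import List, Tuple

# Hypothetical convention for this sketch: (channels, height, width).
Shape = Tuple[int, int, int]


def mismatched_axes(student: Shape, teacher: Shape) -> List[str]:
    """List the axes on which the student and teacher feature maps differ."""
    names = ("channels", "height", "width")
    return [name for name, s, t in zip(names, student, teacher) if s != t]


def needs_adaptive_layer(student: Shape, teacher: Shape) -> bool:
    """An adaptive layer is only needed when at least one axis differs."""
    return bool(mismatched_axes(student, teacher))


# Illustrative shapes only, not taken from any specific model.
fpn_student, fpn_teacher = (256, 80, 80), (256, 80, 80)
backbone_student, backbone_teacher = (128, 40, 40), (512, 40, 40)

print(needs_adaptive_layer(fpn_student, fpn_teacher))        # False
print(mismatched_axes(backbone_student, backbone_teacher))   # ['channels']
```

When the check returns `False`, as in the matching FPN case, the distillation loss can compare the two feature maps directly.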

@minhhotboy9x
Author

Oh, I see. In my work, I have to use adaptive layers because the student and teacher have different numbers of channels, and I think that makes the student's mAP drop slightly.
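
For a channel mismatch like this, one common choice is a 1x1 convolution that projects the student's channels into the teacher's channel space, used only for the training loss. The sketch below writes that projection in plain Python so it runs without a framework. It assumes the adapter is trained (this thread does not confirm whether it should be trained or frozen), holds the student features fixed for simplicity, and uses toy data and hypothetical function names.

```python
def project(weights, pixel):
    """1x1 convolution at one location: a (C_t x C_s) matrix times a C_s vector."""
    return [sum(w * x for w, x in zip(row, pixel)) for row in weights]


def mse(student_map, teacher_map, weights):
    """Mean squared error between projected student features and teacher features."""
    total, count = 0.0, 0
    for s_pix, t_pix in zip(student_map, teacher_map):
        for p, t in zip(project(weights, s_pix), t_pix):
            total += (p - t) ** 2
            count += 1
    return total / count


def adapter_gradient(student_map, teacher_map, weights):
    """Gradient of the MSE loss with respect to the adapter weights."""
    c_t, c_s = len(weights), len(weights[0])
    count = len(student_map) * c_t
    grad = [[0.0] * c_s for _ in range(c_t)]
    for s_pix, t_pix in zip(student_map, teacher_map):
        proj = project(weights, s_pix)
        for i in range(c_t):
            err = 2.0 * (proj[i] - t_pix[i]) / count
            for j in range(c_s):
                grad[i][j] += err * s_pix[j]
    return grad


# Toy data (assumed): 3 spatial locations, student has 2 channels, teacher has 4.
student_map = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
teacher_map = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 2.0], [1.0, 1.0, 2.0, 2.0]]

weights = [[0.0, 0.0] for _ in range(4)]  # adapter: 4 output x 2 input channels
initial_loss = mse(student_map, teacher_map, weights)

learning_rate = 1.0
for _ in range(100):
    grad = adapter_gradient(student_map, teacher_map, weights)
    # Plain gradient descent on the adapter only; the student is fixed here.
    weights = [[w - learning_rate * g for w, g in zip(w_row, g_row)]
               for w_row, g_row in zip(weights, grad)]

final_loss = mse(student_map, teacher_map, weights)
print(initial_loss, final_loss)
```

In a real KD setup the student would receive gradients through the same loss, and the adapter would be discarded after training because the student never uses it at inference.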
