Hello, I have been studying your article recently. I noticed that your PPT describes pre-training Task 2 (region-level) as shown in the picture above. But doesn't the actual code feed local images into the teacher model?
In addition, I am not quite clear about the region-level loss function. Does it compute the similarity matrix between the local features output by the student model and the global features of the teacher model?
I hope you can answer these two questions at your convenience.
But doesn't the actual code input local images into the teacher model.
Yes, the actual code implements both 2-crop and multi-crop. The latter includes both large crops and small crops (I guess the small crops are the "local images" you mentioned). The slides illustrate only the 2-crop case for simplicity.
Is it to calculate the similarity matrix of local features output by student model and global features of teacher model?
No. The similarity matrix is computed between every pair of local features (to clarify, "local features" here means grid features), one taken from the student and one from the teacher. No global teacher feature is involved.
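To make the shapes concrete, here is a minimal sketch in plain Python. It is not the repository's code: the function names `l2_normalize` and `grid_similarity`, the use of cosine similarity, the toy feature values, and the final argmax matching step are all assumptions made for illustration only.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def grid_similarity(student_grid, teacher_grid):
    """Hypothetical helper: cosine similarity between every student grid
    feature and every teacher grid feature.

    student_grid: list of N feature vectors (one per student grid cell)
    teacher_grid: list of M feature vectors (one per teacher grid cell)
    Returns an N x M matrix as a list of lists.
    """
    s = [l2_normalize(v) for v in student_grid]
    t = [l2_normalize(v) for v in teacher_grid]
    return [[sum(a * b for a, b in zip(sv, tv)) for tv in t] for sv in s]

# Toy grid features (made-up values): 3 student cells, 2 teacher cells, dim 2.
student_grid = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
teacher_grid = [[0.0, 1.0], [3.0, 0.0]]

sim = grid_similarity(student_grid, teacher_grid)

# Assumed follow-up step: pick, for each student cell, the teacher cell
# with the highest similarity.
best_match = [max(range(len(row)), key=row.__getitem__) for row in sim]
```

The point of the sketch is only that both inputs are grid-level features, so the matrix has one row per student grid cell and one column per teacher grid cell.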