About cross-face training #33
Comments
Thanks for your interest. Our data pipeline uses YOLO + SAM2 to assign a unique ID to the same person across every frame of a video; the cross-face loss only selects reference images that carry the same ID, drawn from outside the training frames, for the loss computation.
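To make the selection rule above concrete, here is a minimal sketch of ID-consistent reference picking. The data structures and the function name are hypothetical illustrations, not the repository's actual pipeline code:

```python
import random

def pick_cross_face_reference(frame_ids, training_frames, person_id):
    """Pick a reference frame outside the training window that contains
    the same person ID (as produced by a YOLO + SAM2 tracking pass).

    frame_ids: dict mapping frame index -> set of person IDs in that frame
    training_frames: set of frame indices used as training frames
    person_id: the tracked ID whose reference image is needed
    """
    candidates = [
        f for f, ids in frame_ids.items()
        if f not in training_frames and person_id in ids
    ]
    return random.choice(candidates) if candidates else None

# Toy example: ID 7 appears in frames 0-4; frames 0 and 1 are training frames,
# so the reference must come from frames 2, 3, or 4.
frame_ids = {0: {7}, 1: {7}, 2: {7}, 3: {7, 9}, 4: {7}}
ref = pick_cross_face_reference(frame_ids, training_frames={0, 1}, person_id=7)
```

Because the ID is tracked across the whole video, the reference image is still the same identity even though it comes from outside the training frames.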
Understood, thank you for the reply. One more question about the loss computation: I see the code written like this.
I'm a bit confused here. Shouldn't `scheduler.get_velocity` derive the target v from `video_latents` and `noise`? And isn't `model_pred` itself the predicted v?
You can refer to THUDM/CogVideo#403.
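For context on that reference: as I understand it, diffusers' `scheduler.get_velocity(sample, noise, timesteps)` computes the analytic v-prediction target v_t = sqrt(alpha_bar_t) * eps - sqrt(1 - alpha_bar_t) * x0, and the training loss is the MSE between the model's predicted v and this target. A NumPy sketch of that relationship (the schedule value and shapes are illustrative, not taken from the repo):

```python
import numpy as np

def get_velocity(sample, noise, alpha_bar_t):
    """v-prediction target: v_t = sqrt(alpha_bar_t) * eps - sqrt(1 - alpha_bar_t) * x0.
    Mirrors the formula used by diffusers' scheduler.get_velocity."""
    return np.sqrt(alpha_bar_t) * noise - np.sqrt(1.0 - alpha_bar_t) * sample

rng = np.random.default_rng(0)
video_latents = rng.standard_normal((2, 4))   # stand-in for clean latents x0
noise = rng.standard_normal((2, 4))           # eps sampled for this timestep
alpha_bar_t = 0.9                             # cumulative alpha at timestep t (illustrative)

target_v = get_velocity(video_latents, noise, alpha_bar_t)
model_pred = target_v + 0.1 * rng.standard_normal((2, 4))  # pretend model output
loss = np.mean((model_pred - target_v) ** 2)  # MSE training loss on v
```

So `model_pred` is indeed the predicted v, and `get_velocity(video_latents, noise, timesteps)` supplies the target it is regressed against.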
I checked, and I don't think that's the case.
I see in the paper the training strategy "use cross-face (e.g., reference images are sourced from video frames outside the training frames) as inputs with probability β". If an image from outside the training frames is used as the reference image, it may not be the same ID, so how is the loss computed in that case?
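The "with probability β" in that sentence describes a Bernoulli choice made at batch-construction time between a cross-face and an in-frame reference. A hypothetical sketch of that choice (the function and frame structures are illustrative, not the repo's actual code):

```python
import random

def choose_reference(training_frames, all_frames, beta):
    """With probability beta pick a cross-face reference (a frame outside
    the training frames); otherwise pick a frame inside them."""
    outside = [f for f in all_frames if f not in training_frames]
    if outside and random.random() < beta:
        return random.choice(outside), "cross-face"
    return random.choice(sorted(training_frames)), "in-frame"

# With beta=1.0 the reference always comes from outside the training frames.
ref, kind = choose_reference({0, 1}, [0, 1, 2, 3, 4], beta=1.0)
```

As the maintainers' reply explains, identity is preserved because the tracking pipeline assigns a persistent ID to each person across all frames, so the out-of-window reference still matches the training frames' ID.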