I am wondering how IP2P+ is implemented. Is the mask derived from the difference between the conditional and unconditional predictions after first_stage_decode, or in the latent space before first_stage_decode? Could you please explain this in detail?
Note that IP2P+ encodes only once (at the beginning) and decodes only once (at the end). All of the intermediate denoising and diffusion steps operate in latent space. Hence, the mask is derived in the latent space, before first_stage_decode, and it is obtained from the first denoising result.
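To make the latent-space point concrete, here is a minimal sketch of how such a mask could be computed from the difference between the conditional and unconditional predictions, without any decoding. It is not the repository's code: the function name `latent_difference_mask`, the channel averaging, the min-max normalization, and the `threshold` value are all assumptions made for illustration, and NumPy arrays stand in for the real latent tensors.

```python
import numpy as np

def latent_difference_mask(eps_cond, eps_uncond, threshold=0.5):
    """Binary mask from the gap between two latent-space predictions.

    Both inputs have shape (C, H, W) at latent resolution. The name,
    normalization, and threshold are hypothetical, chosen for illustration.
    """
    # Per-location magnitude of the difference, averaged over latent channels.
    diff = np.abs(eps_cond - eps_uncond).mean(axis=0)
    # Min-max normalize so the threshold does not depend on the scale.
    span = diff.max() - diff.min()
    if span == 0:
        # The two predictions agree everywhere: nothing to edit.
        return np.zeros(diff.shape, dtype=bool)
    norm = (diff - diff.min()) / span
    return norm >= threshold

# Toy latents: the "instruction" only changes a 3x3 region.
rng = np.random.default_rng(0)
eps_uncond = rng.standard_normal((4, 8, 8))
eps_cond = eps_uncond.copy()
eps_cond[:, 2:5, 3:6] += 1.0

mask = latent_difference_mask(eps_cond, eps_uncond)
print(mask.shape, int(mask.sum()))
```

The resulting mask has the spatial size of the latent, not of the image, so it can be applied directly in the remaining denoising steps. Under this sketch it would be computed once, from the first denoising result, and then reused.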