First of all, thanks for sharing this work.
I tested this code with a reference code, but the results were not what I expected. In terms of similarity, it is far from InstantID's performance.
Furthermore, I am curious what the innovation of this work is, and why not use LoRA training directly, which has turned out to be much better than embedding training?
If you tested images that contain bodies directly, you may get poorer results. The input faces in the paper are from the FFHQ dataset and are all cropped. You could preprocess your test images with FFHQ-Alignment or crop them to headshots.
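For the headshot-cropping option, here is a minimal sketch of computing a square crop around a detected face. It is a rough stand-in, not FFHQ-Alignment itself (which aligns on facial landmarks). The function name `headshot_box`, the `margin` default, and the assumption that a face bounding box is already available from some face detector are all illustrative and not part of this repository.

```python
def headshot_box(face_box, image_size, margin=0.4):
    """Return a square (left, upper, right, lower) crop box around a face.

    face_box:   (left, top, right, bottom) in pixels, assumed to come from
                any face detector (hypothetical input, not this repo's API).
    image_size: (width, height) of the full image in pixels.
    margin:     extra context added on each side, as a fraction of the
                larger face dimension (illustrative default).
    """
    left, top, right, bottom = face_box
    width, height = image_size

    # Centre of the detected face.
    cx = (left + right) / 2
    cy = (top + bottom) / 2

    # Side of the square: the larger face dimension, enlarged by the margin.
    side = max(right - left, bottom - top) * (1 + 2 * margin)
    # The square cannot be larger than the image itself.
    side = min(side, width, height)
    half = side / 2

    # Shift the centre if needed so the square stays inside the image.
    cx = min(max(cx, half), width - half)
    cy = min(max(cy, half), height - half)

    return (round(cx - half), round(cy - half),
            round(cx + half), round(cy + half))
```

The returned tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so it can be passed to `image.crop(...)` before running the test image through the model.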
InstantID is powerful, but it lacks controllability over pose and expression. Identity embeddings in the word embedding space can offer better text controllability.
The identity embeddings learned by our framework (face encoder + AdaIN with celeb space + two-phase masked diffusion loss) are more aligned with the celeb name distribution (ideal identity consistency), i.e., more compatible with Stable Diffusion and its plug-and-play modules. Therefore, SD2.1-based video and 3D generation models can be combined with them seamlessly. In short, our learned embeddings can be used as naturally as celeb names in Stable Diffusion. You can see this in our inference code.
We think text embeddings work with Stable Diffusion more naturally. A LoRA might not work with plug-and-play modules.