Hello, thanks for your nice work. I am confused about something; could you please help explain it?
IP-Adapter doesn't train the UNet either, and it adds a cross-attention just as yours does. The big difference is that you use a landmark guide net, while they use the CLIP image to control the structure (IP-Adapter Plus V2). I don't understand why your method is better than IP-Adapter in text control capabilities. Do you plan to release some training details, e.g. the training data?
Thanks for your nice work!!!
@zhangqizky We also see some degradation in text editing capability, mainly caused by the image cross-attention. But since we introduced IdentityNet, we can maintain fidelity while setting a lower weight for the image cross-attention.
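To make the weight trade-off concrete, here is a minimal sketch of a cross-attention layer with a weighted image branch. It assumes the IP-Adapter-style decoupled form, where the image branch is added to the text branch with a scalar weight; the thread does not confirm that this is exactly what the implementation does. The names `attend`, `decoupled_cross_attention`, and `image_scale` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    return [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]


def decoupled_cross_attention(query, text_kv, image_kv, image_scale):
    """Text attention plus image attention weighted by image_scale.

    Assumption: the two branches share the query and are simply summed,
    as in IP-Adapter-style decoupled cross-attention.
    """
    text_out = attend(query, *text_kv)
    image_out = attend(query, *image_kv)
    return [t + image_scale * i for t, i in zip(text_out, image_out)]


# Toy inputs: two text tokens and one image token, all 2-dimensional.
query = [1.0, 0.0]
text_kv = ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
image_kv = ([[0.5, 0.5]], [[10.0, -10.0]])

text_only = attend(query, *text_kv)
low_weight = decoupled_cross_attention(query, text_kv, image_kv, 0.5)
```

With `image_scale` set to 0 the output reduces to the text-only attention, so a lower weight leaves more of the prompt's influence intact, at the cost of a weaker image signal from this branch.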
Not sure if it's the reason, but the XL series models have better text consistency. Moreover, custom models usually have better text consistency than the base models.