Thanks for your code! I have two questions about it:
First, when calculating the cosine distance, I think the result should also be divided by the L2 norm of input_image, but currently it seems to be divided only by the L2 norm of support_image.
Second, could you explain the efficient implementation of cross entropy in detail, or point me to the literature I should consult?
Initially I went that way, but with the setup I had at the time the
optimization was more unstable, so I had a look at other optimizations and
found that others did not scale by the target either. Sometimes having too
small a loss can cause gradient propagation issues. However, since then I
have improved multiple parts of the system, so using both support and target
magnitudes might not be such a bad idea. Have a try if you want and let me
know if it works.
The above is the reply of the author of the paper.
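For anyone who wants to try the variant discussed above, here is a minimal sketch of the two options in plain Python. It is not the repository's code: the function names, the `eps` guard, and the example vectors are all assumptions made for illustration. It only shows how dividing by the support norm alone differs from dividing by both norms.

```python
import math

def dot(u, v):
    # Plain dot product of two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))

def l2_norm(v, eps=1e-10):
    # L2 norm with a small floor (assumed here) to avoid division by zero.
    return math.sqrt(max(dot(v, v), eps))

def similarity_support_only(target, support):
    # Variant described in the question: scaled by the support norm only,
    # so the result still grows with the magnitude of the target embedding.
    return dot(target, support) / l2_norm(support)

def cosine_similarity(target, support):
    # Standard cosine similarity: scaled by both norms, bounded in [-1, 1].
    return dot(target, support) / (l2_norm(target) * l2_norm(support))

# Hypothetical embeddings pointing in the same direction.
target = [3.0, 4.0]
support = [6.0, 8.0]
scaled_target = [2.0 * x for x in target]

print(similarity_support_only(target, support))         # depends on |target|
print(similarity_support_only(scaled_target, support))  # doubles with the target
print(cosine_similarity(target, support))               # independent of |target|
print(cosine_similarity(scaled_target, support))
```

One thing to keep in mind when comparing the two: for a fixed target, its norm is the same positive factor across all support embeddings, so dividing by it does not change which support item scores highest. It only changes the scale of the scores fed to the softmax, which may be related to the loss-magnitude effect mentioned in the reply.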