-
Notifications
You must be signed in to change notification settings - Fork 19.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove cast of y_true to y_pred data type in sparse categorical cross entropy loss #15015
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the cast is mainly trying to address the dtype mismatch, eg int and float for label and prediction, which is quite common in normal case. The 2*32 seems to be a very extreme case, which we barely hit. Can we change the PR to only cast the dtype if the label and prediction are in different category?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should only remove the cast for sparse_categorical_crossentropy(), since the label value could be large based on the dimension of the prediction. The rest of them like binary_crossentropy or categorical_crossentropy, the label value is either one_hot or just 0 and 1, which won't be affected when casting.
Also since backend.sparse_categorical_crossentropy will cast the y_true to int64 anyway, removing the y_true cast here is correct.
Line 4952 in 00518dc
target = cast(target, 'int64') |
keras/losses.py
Outdated
@@ -1536,7 +1536,7 @@ def categorical_hinge(y_true, y_pred): | |||
Categorical hinge loss values. | |||
""" | |||
y_pred = tf.convert_to_tensor(y_pred) | |||
y_true = tf.cast(y_true, y_pred.dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should keep this cast here, since the y_true are expect to be either {-1, +1}
or {0, 1}
(i.e. a one-hot-encoded tensor). See the docstring.
We probably want to fix the example in the docstring since the y_true is given in the range of 0-3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah made the changes as requested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix.
Casting y_true to y_pred data type gives erroneous loss. Like in case of Mean Squared error if y_true is > 2**32 and y_pred data type is int32 then resulting loss will be wrong.
A detailed notebook on the error which also happens in mean absolute loss, categorical cross entropy loss and sparse categorical cross entropy loss and also contains the proposed changes.
The issue was raised here