[Add Feature] - Throw an error if softmax is used with 1 neuron #18105
Conversation
Thanks for the PR! @Frightera, can you confirm that the newly added test would fail without the fix in the library changes? @pradeepkuppla, would it be possible to give this a run across all internal tests before we proceed with an approval?
Hi @rchao, I am not sure I understood you correctly. Do you want me to write a test that fails on purpose?
Imported from GitHub PR #18105

This is a utility function to check whether the usage of softmax makes sense (new users make this mistake a lot). Applying softmax to a single neuron makes the model output ones every time; there are many Stack Overflow posts about this. To see this in action, please check [the gist](https://colab.research.google.com/gist/Frightera/fdcec020fff6ee9521ae2fd3eaed774d/checksoftmaxlastlayer.ipynb). This also applies to any other layer (Conv2D etc.) where the axis softmax is applied to (axis=-1 by default) has only one unit.

Copybara import of the project:

- 90c95b1 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Add last layer activation check for softmax
- 1cedb20 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Split logic for sequential and functional models
- 529f968 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Add tests for _check_last_layer_activation
- d1acddb by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Update sequential check
- 8363016 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Update tests, logic and reformatting
- ebf16c3 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Update tests and the logic
- afc156a by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Make validate_softmax_activation experimental

Merging this change closes #18105

FUTURE_COPYBARA_INTEGRATE_REVIEW=#18105 from Frightera:last_layer_softmax_warn afc156a
PiperOrigin-RevId: 533267082
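For context, here is a minimal standalone sketch (plain tf.keras, independent of this PR) of the pitfall the description refers to: softmax over a single logit normalizes it to 1, so a 1-unit softmax head is constant.

```python
# Standalone demo: a Dense(1) softmax output is always exactly 1.0.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential(
    [tf.keras.layers.Dense(1, activation="softmax", input_shape=(4,))]
)
preds = model.predict(np.random.rand(8, 4))
print(preds.ravel())  # all 1.0, regardless of inputs or weights
```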
It is failing an internal test with the error below; could you please make the necessary changes?
@sachinprasadhs Should be fine now; I re-ran the tests manually. That was an edge case I didn't think of 😮 Edit: I might try to simplify nested
To make sure we properly cover the fix, the idea is to have a unit test that would have failed without your library changes. Can you confirm this is the case here?
@rchao To make it clear: this is not actually a fix, as there was no error in Keras. It's just a warning to catch a logical error made by new users. So you really cannot catch this with a regular test, as the model will compile successfully.
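For illustration, a hedged sketch of the kind of unit test being discussed, assuming the check in this PR raises a ValueError at compile() time (the expected message text is taken from the error quoted later in this thread; the test class and method names are hypothetical, not existing Keras tests):

```python
# Without the library change, compile() would succeed and this test fails;
# with it, the ValueError is raised and the test passes.
import tensorflow as tf


class SingleUnitSoftmaxTest(tf.test.TestCase):
    def test_single_unit_softmax_raises(self):
        model = tf.keras.Sequential(
            [tf.keras.layers.Dense(1, activation="softmax")]
        )
        with self.assertRaisesRegex(ValueError, "single unit output"):
            model.compile(optimizer="adam", loss="binary_crossentropy")


if __name__ == "__main__":
    tf.test.main()
```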
@sachinprasadhs That's interesting, this error was not present in my local CI. Let's try again.
[Add Feature] - Throw an error if softmax is used with 1 neuron

Imported from GitHub PR #18105

This is a utility function to check whether the usage of softmax makes sense (new users make this mistake a lot). Applying softmax to a single neuron makes the model output ones every time; there are many Stack Overflow posts about this. To see this in action, please check [the gist](https://colab.research.google.com/gist/Frightera/fdcec020fff6ee9521ae2fd3eaed774d/checksoftmaxlastlayer.ipynb). This also applies to any other layer (Conv2D etc.) where the axis softmax is applied to (axis=-1 by default) has only one unit.

Copybara import of the project:

- 90c95b1 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Add last layer activation check for softmax
- 1cedb20 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Split logic for sequential and functional models
- 529f968 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Add tests for _check_last_layer_activation
- d1acddb by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Update sequential check
- 8363016 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Update tests, logic and reformatting
- ebf16c3 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Update tests and the logic
- afc156a by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Make validate_softmax_activation experimental
- 3a228fb by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Fix edge case for _validate_softmax_output
- e9c950e by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Check the softmax axis and raise an error if relevant
- 6355b23 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Update softmax check tests
- a6745ee by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Minor typo fix
- 92281f6 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Fix test fails for _check_output_activation_softmax
- 72a035f by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Resolve conflict
- 0af059c by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Squashed commit master (merge) to resolve conflicts
- 065cdea by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Revert "Squashed commit master (merge) to resolve conflicts" (this reverts commit 0af059c)
- 446f1dd by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Remove steps_per_execution_tuning from imports
- 1fbd931 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Fix lint
- 2c867f8 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>: Update TestCheckLastLayerActivation tests

Merging this change closes #18105

FUTURE_COPYBARA_INTEGRATE_REVIEW=#18105 from Frightera:last_layer_softmax_warn 2c867f8
PiperOrigin-RevId: 544094025
@Frightera, could you please check the error below?
Hi @sachinprasadhs, I don't know why this is happening a second time; I don't get any errors in Colab or in my local environment when I run the CI cycle. Also, PR #18264 has now finished its build without any errors. Why is there a contradiction like this?
The previous error is not showing anymore. What I'm providing is the internal test result for the PR linked above, opened by copybara. Here is the new result output:
@sachinprasadhs Thanks for the quick response. Maybe I am missing something, but the PR opened by copybara

Also, the error you sent:

This part does not exist in my PR: 🤔

This is slightly different from what I have in my PR:

```python
f"Output layer {layer_name} has a single unit output, "
"but the activation is softmax. This is most likely an "
"error because softmax outputs sum to 1 therefore single "
"unit outputs with softmax will only output 1.0. If you "
"think that the error is raised due to an incorrect check, "
"please file an issue on "
"https://github.com/keras-team/keras/issues. You can "
"disable this check by setting "
"`experimental_validate_softmax_activation=False` when "
"calling `compile()` on the model."
)
```

So the problem is that I can't see the error in the internal tests, and the provided ValueError is slightly different from what I have in my PR.
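For reference, the opt-out described in the quoted message would look something like this. This is only a sketch: the flag name is taken verbatim from the error text above and exists only with this PR's changes applied, not in released Keras.

```python
# Sketch of disabling the softmax check via the compile() flag quoted above.
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="softmax")])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    experimental_validate_softmax_activation=False,  # skip the softmax check
)
```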
The internal tests are run against different hardware configurations and different OS environments; this error seems to be happening in
So after tracing this I found: `keras/keras/integration_test/preprocessing_test_utils.py`, lines 88 to 113 at 4829ddf.
In line 112, we can see:

Which action should we take in this case?

Edit:
@rchao, can you please advise on the comment above?
Hi @rchao, any update on this? A different internal test needs to be changed.
Hi, any update on this PR by any chance?
@qlzh727, could you please take a look at this PR? Thanks!
Hello, thank you for submitting a pull request. We're currently in the process of migrating the new
This is a utility function to check whether the usage of softmax makes sense (new users make this mistake a lot). Applying softmax to a single neuron makes the model output ones every time; there are many Stack Overflow posts about this.
To see this in action, please check the gist.
This also applies to any other layer (Conv2D etc.) where the axis softmax is applied to (axis=-1 by default) has only one unit.
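A minimal sketch of that convolutional variant (standalone tf.keras, independent of this PR): with a single filter, softmax over the default channel axis (-1) yields all-ones feature maps.

```python
# Demo: Conv2D with one filter plus softmax is constant at every position,
# because softmax over a length-1 axis is always 1.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential(
    [tf.keras.layers.Conv2D(1, 3, activation="softmax", input_shape=(8, 8, 3))]
)
out = model.predict(np.random.rand(2, 8, 8, 3))
print(np.unique(out))  # [1.]
```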