Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Add Feature] - Throw an error if softmax is used with 1 neuron #18105

Closed
wants to merge 18 commits into from

Conversation

Frightera
Copy link
Contributor

@Frightera Frightera commented May 8, 2023

This is a utility function to check if the usage of softmax makes sense (new users make this mistake a lot). Applying softmax on a single neuron will make the model output ones everytime, there are too many Stackoverflow posts about this.

In order to see this in action, please check the gist.

This applies for any other layers (Conv2D etc.) where the applied axis (axis=-1 default) of softmax has only one unit.

@gbaned gbaned requested a review from qlzh727 May 8, 2023 16:19
@sachinprasadhs sachinprasadhs added the keras-team-review-pending Pending review by a Keras team member. label May 16, 2023
@rchao
Copy link
Contributor

rchao commented May 18, 2023

Thanks for the PR! @Frightera can you confirm that the newly added test would fail without the fix in the library changes?

@pradeepkuppla, would it be possible to give this a run across all internal tests before we proceed with an approval?

@sachinprasadhs sachinprasadhs added the ready to pull Ready to be merged into the codebase label May 18, 2023
@rchao rchao removed ready to pull Ready to be merged into the codebase keras-team-review-pending Pending review by a Keras team member. labels May 18, 2023
@Frightera
Copy link
Contributor Author

Hi @rchao I am not sure if I understood you correctly - do you want me to write a test to fail it on purpose?

@sachinprasadhs sachinprasadhs added the ready to pull Ready to be merged into the codebase label May 18, 2023
copybara-service bot pushed a commit that referenced this pull request May 18, 2023
Imported from GitHub PR #18105

This is a utility function to check if the usage of softmax makes sense (new users make this mistake a lot). Applying softmax on a single neuron will make the model output ones everytime, there are too many Stackoverflow posts about this.

In order to see this in action, please check [the gist](https://colab.research.google.com/gist/Frightera/fdcec020fff6ee9521ae2fd3eaed774d/checksoftmaxlastlayer.ipynb).

This applies for any other layers (Conv2D etc.) where the applied axis (axis=-1 default) of softmax has only one unit.
Copybara import of the project:

--
90c95b1 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Add last layer activation check for softmax

--
1cedb20 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Split logic for sequential and functional models

--
529f968 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Add tests for _check_last_layer_activation

--
d1acddb by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Update sequential check

--
8363016 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Update tests, logic and reformatting

--
ebf16c3 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Update tests and the logic

--
afc156a by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Make validate_softmax_activation experimental

Merging this change closes #18105

FUTURE_COPYBARA_INTEGRATE_REVIEW=#18105 from Frightera:last_layer_softmax_warn afc156a
PiperOrigin-RevId: 533267082
@sachinprasadhs
Copy link
Collaborator

@Frightera ,

It is failing internal test with below error, could you please make the necessary changes.

    "softmax" in str(output.name.lower())
AttributeError: 'NoneType' object has no attribute 'lower'

@google-ml-butler google-ml-butler bot removed the ready to pull Ready to be merged into the codebase label May 18, 2023
@Frightera
Copy link
Contributor Author

Frightera commented May 18, 2023

@sachinprasadhs Should be fine now, I re-run the tests manually.

That was an edge case I didn't think of 😮

Edit: I might try to simplify nested if statements but that seemed the cleanest way to me.

@rchao
Copy link
Contributor

rchao commented May 19, 2023

Hi @rchao I am not sure if I understood you correctly - do you want me to write a test to fail it on purpose?

To make sure we properly cover the fix, the idea is to have a unit test that would have failed without your library changes. Can you confirm this is the case here?

@Frightera
Copy link
Contributor Author

@rchao To make it clear - This is not actually a fix, there was no error in Keras. That's just a warning to catch a logical error made by new users.

So you really can not catch this with a regular test as the model will be compiled successfully.

@sachinprasadhs sachinprasadhs added ready to pull Ready to be merged into the codebase pending internal tests and removed failing internal tests labels May 19, 2023
@Frightera
Copy link
Contributor Author

@sachinprasadhs That's interesting, this error was not present in my local CI. Let's try again.

@sachinprasadhs sachinprasadhs added ready to pull Ready to be merged into the codebase pending internal tests and removed failing internal tests ready to pull Ready to be merged into the codebase labels Jun 22, 2023
copybara-service bot pushed a commit that referenced this pull request Jun 28, 2023
…euron

Imported from GitHub PR #18105

This is a utility function to check if the usage of softmax makes sense (new users make this mistake a lot). Applying softmax on a single neuron will make the model output ones everytime, there are too many Stackoverflow posts about this.

In order to see this in action, please check [the gist](https://colab.research.google.com/gist/Frightera/fdcec020fff6ee9521ae2fd3eaed774d/checksoftmaxlastlayer.ipynb).

This applies for any other layers (Conv2D etc.) where the applied axis (axis=-1 default) of softmax has only one unit.
Copybara import of the project:

--
90c95b1 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Add last layer activation check for softmax

--
1cedb20 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Split logic for sequential and functional models

--
529f968 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Add tests for _check_last_layer_activation

--
d1acddb by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Update sequential check

--
8363016 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Update tests, logic and reformatting

--
ebf16c3 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Update tests and the logic

--
afc156a by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Make validate_softmax_activation experimental

--
3a228fb by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Fix edge case for _validate_softmax_output

--
e9c950e by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Check the softmax axis and raise an error if relevant

--
6355b23 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Update softmax check tests

--
a6745ee by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Minor typo fix

--
92281f6 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Fix test fails for _check_output_activation_softmax

--
72a035f by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Resolve conflict

--
0af059c by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Squashed commit master (merge) to resolve conflicts

--
065cdea by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Revert "Squashed commit master (merge) to resolve conflicts"

This reverts commit 0af059c.

--
446f1dd by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Remove steps_per_execution_tuning from imports

--
1fbd931 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Fix lint

--
2c867f8 by Kaan Bıçakcı <46622558+Frightera@users.noreply.github.com>:

Update TestCheckLastLayerActivation tests

Merging this change closes #18105

FUTURE_COPYBARA_INTEGRATE_REVIEW=#18105 from Frightera:last_layer_softmax_warn 2c867f8
PiperOrigin-RevId: 544094025
@sachinprasadhs
Copy link
Collaborator

@Frightera , Could you please check the below error.

Traceback (most recent call last):
  File "/build/work/45a231adaff5e6784bf9c607c668de34477f/google3/runfiles/google3/third_party/py/keras/engine/training_test.py", [line 5110](https://cs.corp.google.com/piper///depot/google3/third_party/py/keras/engine/training_test.py?l=5110&ws=tap-presubmit-server/67803804&snapshot=2), in test_functional_model_output
    with self.assertRaisesRegex(
AssertionError: ValueError not raised

@Frightera
Copy link
Contributor Author

Hi @sachinprasadhs,

I don't know why this is happening for the second time, I don't get any errors in Colab and my local environment when I run the CI cycle.

Also the PR #18264 has now finished its build without any errors. Why is there a contradiction like this?

@sachinprasadhs
Copy link
Collaborator

Now the previous error is not showing anymore.

The internal test result for the above linked PR by copybara is what I'm providing.

Here is the new result output.

File "/build/work/918d6c4f2fd469de06938bb711d72b99a84f/google3/runfiles/google3/third_party/py/keras/engine/training.py", [line 4523](https://cs.corp.google.com/piper///depot/google3/third_party/py/keras/engine/training.py?l=4523&ws=tap-presubmit-server/67837954&snapshot=2), in _check_output_activation_softmax
    raise ValueError(
ValueError: Output layer dense_2 has a single unit output, but the activation is softmax. This is most likely an error because softmax outputs sum to 1 therefore single unit outputs with softmax will only output 1.0. If you think that the error is raised due to an incorrect check, please file an issue on [https://github.com/keras-team/keras/issues](https://www.google.com/url?q=https://github.com/keras-team/keras/issues&sa=D). You can disable this check by setting `experimental_validate_softmax_activation=False` when calling `compile()` on the model.

@Frightera
Copy link
Contributor Author

@sachinprasadhs Thanks for the quick response.

Maybe I am missing something but PR opened by copybara https://github.com/keras-team/keras/pull/18264 #18264 does not include any errors in the internal results.

image

Also the error you sent:

File "/build/work/918d6c4f2fd469de06938bb711d72b99a84f/google3/runfiles/google3/third_party/py/keras/engine/training.py", [line 4523](https://cs.corp.google.com/piper///depot/google3/third_party/py/keras/engine/training.py?l=4523&ws=tap-presubmit-server/67837954&snapshot=2), in _check_output_activation_softmax
    raise ValueError(
ValueError: Output layer dense_2 has a single unit output, but the activation is softmax. This is most likely an error because softmax outputs sum to 1 therefore single unit outputs with softmax will only output 1.0. If you think that the error is raised due to an incorrect check, please file an issue on [https://github.com/keras-team/keras/issues](https://www.google.com/url?q=https://github.com/keras-team/keras/issues&sa=D). You can disable this check by setting `experimental_validate_softmax_activation=False` when calling `compile()` on the model.

This part does not exist in my PR: 🤔

please file an issue on [https://github.com/keras-team/keras/issues](https://www.google.com/url?q=https://github.com/keras-team/keras/issues&sa=D).

This is slightly different from what I have in my PR:

f"Output layer {layer_name} has a single unit output, "
                    "but the activation is softmax. This is most likely an "
                    "error because softmax outputs sum to 1 therefore single "
                    "unit outputs with softmax will only output 1.0. If you "
                    "think that the error is raised due to an incorrect check, "
                    "please file an issue on "
                    "https://github.com/keras-team/keras/issues. You can "
                    "disable this check by setting "
                    "`experimental_validate_softmax_activation=False` when "
                    "calling `compile()` on the model."
                )

So the problem is I can't see the error in the internal tests, and the provided value error is slightly different from what I have in my PR.

@sachinprasadhs
Copy link
Collaborator

The internal Tests will be run against different hardware configurations, different OS environments, this error seems to be happening in //third_party/py/keras/integration_test:dragonfish_preprocessing_applied_in_dataset_creator_test_tpu

@Frightera
Copy link
Contributor Author

Frightera commented Jun 28, 2023

So after tracing this I found:

def make_training_model():
"""Make a trainable model for the preprocessed inputs."""
float_in = tf.keras.Input(shape=(1,), dtype="float32", name="float_col")
# After preprocessing, both the string and int column are integer ready for
# embedding.
int_in = tf.keras.Input(shape=(1,), dtype="int64", name="int_col")
string_in = tf.keras.Input(shape=(1,), dtype="int64", name="string_col")
# Feed the lookup layers into an embedding.
int_embedding = tf.keras.layers.Embedding(VOCAB_SIZE + 1, 8, input_length=1)
int_out = int_embedding(int_in)
int_out = tf.keras.layers.Flatten()(int_out)
string_embedding = tf.keras.layers.Embedding(
VOCAB_SIZE + 1, 8, input_length=1
)
string_out = string_embedding(string_in)
string_out = tf.keras.layers.Flatten()(string_out)
# Concatenate outputs.
concatate = tf.keras.layers.Concatenate()
# Feed our preprocessed inputs into a simple MLP.
x = concatate((float_in, int_out, string_out))
x = tf.keras.layers.Dense(32, activation="relu")(x)
x = tf.keras.layers.Dense(32, activation="relu")(x)
outputs = tf.keras.layers.Dense(1, activation="softmax")(x)
return tf.keras.Model(inputs=(float_in, int_in, string_in), outputs=outputs)

In line 112, we can see: outputs = tf.keras.layers.Dense(1, activation="softmax")(x). So this was triggering my check in the PR and was raising an error. Obviously this is a wrong model setup in the internal test, I can fix the setup in this PR or can open a seperate PR for it.

Which action should we take in this case?

Edit: make_training_model() is called in preprocessing_applied_in_dataset_creator_test

@sachinprasadhs
Copy link
Collaborator

@rchao , Can you please suggest in the above comment.

@sachinprasadhs sachinprasadhs removed the ready to pull Ready to be merged into the codebase label Jul 6, 2023
@Frightera
Copy link
Contributor Author

Hi @rchao, any update on this? A different internal test needs to be changed.

@Frightera
Copy link
Contributor Author

Hi, any update on this PR by any chance?

@sachinprasadhs sachinprasadhs removed the request for review from rchao August 23, 2023 19:05
@sachinprasadhs
Copy link
Collaborator

@qlzh727 , Could you please take a look into this PR. Thanks!

@sachinprasadhs
Copy link
Collaborator

Hello, Thank you for submitting a pull request.

We're currently in the process of migrating the new Keras 3 code base from keras-team/keras-core to keras-team/keras.
Consequently, merging this PR is not possible at the moment. After the migration is successfully completed, feel free to reopen this PR at keras-team/keras if you believe it remains relevant to the Keras 3 code base. If instead this PR fixes a bug or security issue in legacy tf.keras, you can instead reopen the PR at keras-team/tf-keras, which hosts the TensorFlow-only, legacy version of Keras.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Closed/Rejected
Development

Successfully merging this pull request may close these issues.

5 participants