Add explicit dependency checks for normalization details #26
Comments
Thinking more about this, it seems there might be an ambiguity regarding the definitions of Therefore, the |
I agree there's some ambiguity here, but I think we can avoid it by making our definitions clear, possibly by introducing another field. I'd rather not make these fields purely informational: we use them on our backend to determine what preprocessing to apply. This metadata is crucial for reproducing model inference, and it is so often lost or left unclear that enforcing it in the schema would be valuable (if norm_type is specified at all).

I think what we currently call min-max normalization serves to scale values to another precision or to a common range. In practice I think it is always applied in a relative manner for imagery, though I could be wrong. In any case, it isn't really normalization (adjusting the distribution relative to the population). This line of thinking follows the definitions used by Kaggle: https://www.kaggle.com/code/alexisbcook/scaling-and-normalization

Maybe we are overloading the norm_type field to define the method for both scaling and normalization, which are not always the same. Both may be done, neither may be done, or only one of the two may be applied. Should we instead have a |
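To make the scaling versus normalization distinction above concrete, here is a minimal NumPy sketch. The function names (`min_max_scale`, `z_score`, `preprocess`) and the `scale`/`normalize` flags are hypothetical and are not part of the extension's schema. Z-score standardization is used only as one example of a distribution-relative adjustment, and the sketch assumes that the mean and standard deviation are expressed in the units of the already-scaled data.

```python
import numpy as np


def min_max_scale(x, new_min=0.0, new_max=1.0):
    """Scaling: map values into [new_min, new_max] relative to this array's own min/max."""
    x = np.asarray(x, dtype=np.float64)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Constant input: avoid division by zero, return the lower bound everywhere.
        return np.full_like(x, new_min)
    return (x - lo) / (hi - lo) * (new_max - new_min) + new_min


def z_score(x, mean, std):
    """Normalization (one possible form): shift/rescale relative to population statistics."""
    x = np.asarray(x, dtype=np.float64)
    return (x - mean) / std


def preprocess(x, scale=False, normalize=False, mean=0.0, std=1.0):
    """Apply neither, either, or both steps; scaling runs first when both are requested."""
    out = np.asarray(x, dtype=np.float64)
    if scale:
        out = min_max_scale(out)
    if normalize:
        out = z_score(out, mean, std)
    return out


band = np.array([0, 5000, 10000])
scaled_only = preprocess(band, scale=True)
both = preprocess(band, scale=True, normalize=True, mean=0.5, std=0.25)
untouched = preprocess(band)
```

With two independent switches, the four combinations (both, neither, or only one step) can be stated explicitly instead of being inferred from a single overloaded field.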
@fmigneault cloned issue crim-ca/mlm-extension#10 on 2024-05-01: