Allow NaNs in datasets #326

Merged
devsjc merged 2 commits into main from devsjc/cloudcasting-nans
Feb 7, 2025

@devsjc (Contributor) commented Feb 7, 2025

In the GCP archive, the parts of each image that fall off the earth disk are filled with NaNs. Occasionally a few scattered pixels over the earth are NaN as well (probably just measurement errors from the satellite). In the production data there are no NaNs, and the parts of the image off the earth disk are all zeros.

This is a problem for the cloudcasting model. During training, every NaN pixel is flagged to the model so it knows where the missing data is. That isn't possible in production, because every NaN has already become zero, and not every zero was originally a NaN: at night the visible channels go dark, for instance.
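To illustrate the ambiguity, here is a minimal NumPy sketch. The pixel values and variable names are hypothetical, not taken from the repository: one pixel is missing, two are dark, and one holds signal.

```python
import numpy as np

# Hypothetical strip of pixels from a visible channel at night:
# index 0 is off the earth disk (missing), indices 1-2 are genuinely dark.
archive = np.array([np.nan, 0.0, 0.0, 0.2], dtype=np.float32)

# Archive-style data: NaN marks missing pixels unambiguously.
nan_mask = np.isnan(archive)

# Production-style data: missing pixels have already been replaced by zero.
production = np.nan_to_num(archive, nan=0.0)

# Testing for zero now also catches the dark pixels, so the mask is wrong.
zero_mask = production == 0.0

print(nan_mask)
print(zero_mask)
```

The NaN mask flags only the missing pixel, while the zero mask also flags the two dark ones, which is why the missing-data information cannot be recovered once NaNs have been zero-filled.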

This PR changes the dataset's dtype to float32, which can represent NaN.
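As a rough sketch of why the dtype matters: floating-point types reserve bit patterns for NaN, whereas integer types have none. The helper `is_floating` is hypothetical, and `int16` is used only for contrast; the PR description does not state the previous dtype.

```python
import numpy as np

def is_floating(dtype) -> bool:
    """Return True if dtype is a NumPy floating-point type (hypothetical helper)."""
    return bool(np.issubdtype(np.dtype(dtype), np.floating))

# A float32 array can carry NaN alongside ordinary values.
pixels = np.array([0.0, np.nan, 0.5], dtype=np.float32)
missing = np.isnan(pixels)

print(is_floating(np.float32))  # floating-point: has a NaN representation
print(is_floating(np.int16))    # integer (contrast only): no NaN representation
print(missing)
```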

@devsjc requested a review from dfulu February 7, 2025 10:57

@dfulu (Member) left a comment
Thanks!

@devsjc merged commit 485a810 into main Feb 7, 2025
2 of 4 checks passed
@devsjc deleted the devsjc/cloudcasting-nans branch February 7, 2025 12:42
2 participants