Strange behaviour of array_interface on mask and nodata setting #299

Closed
rhugonnet opened this issue Sep 5, 2022 · 2 comments · Fixed by #313
Labels
bug Something isn't working

Comments

@rhugonnet
Member

Example below: after an in-place addition, the nodata (fill_value) has been reassigned to 63?!

r.data
Out[8]: 
masked_array(
  data=[[[255, 255, 255, ..., 255, 255, 255],
         [255, 255, 255, ..., 255, 255, 255],
         [255, 255, 255, ..., 255, 255, 255],
         ...,
         [ 74,  76,  79, ..., 121, 119, 141],
         [ 75,  83,  70, ..., 112, 130, 150],
         [ 64,  86,  68, ..., 124, 131, 130]]],
  mask=False,
  fill_value=999999,
  dtype=uint8)
r.data += 5
r.data
Out[10]: 
masked_array(
  data=[[[  4,   4,   4, ...,   4,   4,   4],
         [  4,   4,   4, ...,   4,   4,   4],
         [  4,   4,   4, ...,   4,   4,   4],
         ...,
         [ 79,  81,  84, ..., 126, 124, 146],
         [ 80,  88,  75, ..., 117, 135, 155],
         [ 69,  91,  73, ..., 129, 136, 135]]],
  mask=False,
  fill_value=63,
  dtype=uint8)
np.count_nonzero(r.data==63)
Out[11]: 3441

Looking into it in more detail...

rhugonnet added the bug label on Sep 5, 2022
@adehecq
Member

adehecq commented Sep 5, 2022

Yes, this is quite common. When no nodata is specified, rasterio uses the fill_value attribute of the masked_array as the nodata. See my issue about it here.
The default is, I think, 999999 (as in the output above), and when converted to uint8 (modulo 256) it yields 63.
The best way to avoid this is to always set the fill_value explicitly rather than rely on the default. But that is a problem for uint8, where one may not want to set a nodata at all, since it sacrifices one of only 256 available values.
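A quick check of the wrap-around, as a minimal sketch. It assumes the integer default fill_value of 999999 shown in the outputs above; the variable names are only illustrative. It also shows an explicitly set fill_value that fits in uint8.

```python
import numpy as np

# Default integer fill_value as displayed in the outputs above.
default_fill = 999999

# A cast to uint8 keeps only the lowest 8 bits, i.e. the value modulo 256.
print(default_fill % 256)  # 63

# The same wrap-around through an explicit NumPy cast.
wrapped = np.array([default_fill]).astype(np.uint8)
print(wrapped[0])  # 63

# Setting fill_value explicitly keeps it within the range of the dtype.
arr = np.ma.masked_array(
    [0, 255], mask=[False, True], dtype="uint8", fill_value=255
)
print(arr.fill_value)  # 255
```

The trade-off mentioned above still applies: with uint8, reserving 255 (or any other value) as fill_value means that value can no longer be used as valid data.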

@rhugonnet
Member Author

rhugonnet commented Sep 15, 2022

In terms of behaviour, I think it should be possible to have a Raster with a masked array that has no invalid data (mask=False) and no user-defined fill_value (so it keeps the default, 999999). This way we would always mirror the behaviour of np.ma.masked_array.

Here, however, the behaviour above is introduced by Raster, because np.ma preserves the original fill_value and does not change it to 63:

test = np.ma.masked_array([0, 255], dtype='uint8')
test
Out[23]: 
masked_array(data=[  0, 255],
             mask=False,
       fill_value=999999,
            dtype=uint8)
test += 5
test
Out[25]: 
masked_array(data=[5, 4],
             mask=False,
       fill_value=999999,
            dtype=uint8)

To fix this, we probably just need an extra clause in _overloading_check or data.setter.
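One way such a clause could look, as a rough sketch only. `MiniRaster` is a hypothetical stand-in written for illustration, not the geoutils Raster class, and the actual fix in #313 may look different. The idea is to touch fill_value only when a nodata was explicitly defined.

```python
import numpy as np


class MiniRaster:
    """Hypothetical stand-in for a Raster-like class (not geoutils code)."""

    def __init__(self, data, nodata=None):
        self.nodata = nodata
        self.data = data  # goes through the setter below

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, new_data):
        # Wrap plain arrays; keep masked arrays as they are.
        if not isinstance(new_data, np.ma.MaskedArray):
            new_data = np.ma.masked_array(new_data)
        # The extra clause: only set fill_value when a nodata was defined
        # explicitly, so the NumPy default is never cast into the dtype.
        if self.nodata is not None:
            new_data.fill_value = self.nodata
        self._data = new_data


explicit = MiniRaster(np.array([0, 255], dtype="uint8"), nodata=255)
print(explicit.data.fill_value)  # 255

untouched = MiniRaster(np.ma.masked_array([0, 255], dtype="uint8"))
print(untouched.nodata)  # None
```

With nodata left as None, the masked array is stored as given, which mirrors np.ma.masked_array. Note that this sketch changes fill_value on the array passed in rather than on a copy.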
