Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bucket assignment #249

Merged
merged 2 commits into from
May 7, 2020
Merged

Fix bucket assignment #249

merged 2 commits into from
May 7, 2020

Conversation

sfinkens
Copy link
Member

@sfinkens sfinkens commented Feb 3, 2020

The bucket assignment in the bucket resampler seems to be off by half a pixel. This PR provides a fix.

Example:

from pyresample import create_area_def
from pyresample.bucket import BucketResampler
import dask.array as da
import numpy as np

area = create_area_def(
    area_id='test',
    projection={'proj': 'latlong'},
    center=(0., 0.),
    width=4,
    height=4,
    resolution=10.0
)

lons = da.from_array(np.array([-15, -5, 5, 15]), chunks=2)
lats = da.from_array(np.array([-15, -5, 5, 15]), chunks=2)
data = da.from_array(np.array([1, 2, 3, 4]), chunks=2)
resampler = BucketResampler(source_lats=lats,
                            source_lons=lons,
                            target_area=area)
count = resampler.get_count()

Output with current master:

[[0 0 0 1]
 [0 2 0 0]
 [0 0 0 0]
 [1 0 0 0]]

Output with this PR:

[[0 0 0 1]
 [0 0 1 0]
 [0 1 0 0]
 [1 0 0 0]]

Changes:

  • Compute bucket indices with respect to the edge of the corner
    pixel, not the center.

  • Don't round projection coordinates. This is already done by
    casting to integer.

  • That also means that we don't need the projection coordinates
    anymore. We can use the area extent instead.

  • Tests added

  • Tests passed

  • Passes git diff origin/master **/*py | flake8 --diff

@mraspaud
Copy link
Member

Where are we on this ?

@sfinkens
Copy link
Member Author

@mraspaud This was just a quick hack to show @pnuu my findings. But neither @pnuu nor me really had the time to look into this in more detail. It's on my list and probably next week I will continue working on this.

- Compute bucket indices with respect to the edge of the corner
  pixel, not the center.
- Don't round projection coordinates. This is already done by
  casting to integer.
- That also means that we don't need the projection coordinates
  anymore. We can use the area extent instead.
@sfinkens sfinkens marked this pull request as ready for review April 23, 2020 12:26
@sfinkens sfinkens requested a review from pnuu April 23, 2020 12:26
@sfinkens
Copy link
Member Author

Roughly one week later I finally managed to continue with this 😄

@coveralls
Copy link

coveralls commented Apr 23, 2020

Coverage Status

Coverage increased (+0.03%) to 91.95% when pulling 577a2d1 on sfinkens:bucket-lowres into d8b1b88 on pytroll:master.

@codecov
Copy link

codecov bot commented Apr 23, 2020

Codecov Report

Merging #249 into master will increase coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #249      +/-   ##
==========================================
+ Coverage   91.94%   91.97%   +0.02%     
==========================================
  Files          42       42              
  Lines        8098     8099       +1     
==========================================
+ Hits         7446     7449       +3     
+ Misses        652      650       -2     
Impacted Files Coverage Δ
pyresample/bucket/__init__.py 98.91% <100.00%> (-0.06%) ⬇️
pyresample/test/test_bucket.py 100.00% <100.00%> (ø)
pyresample/geometry.py 82.50% <0.00%> (+0.09%) ⬆️
pyresample/area_config.py 87.74% <0.00%> (+0.27%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d8b1b88...577a2d1. Read the comment docs.

Observations less than one bucket beyond the left/top boundary
of the target area are assigned to the buckets in the
first row/col. Fix that by flooring projection coordinates
before casting them to integer.
@sfinkens
Copy link
Member Author

I found another little bug: Some observations outside the target area are assigned to the buckets in the first row/col. This happens if the observations are less than one bucket beyond the left/right boundary of the target area. In that case

-1 < (proj_x - adef.area_extent[0]) / x_res < 0
or
-1 < (adef.area_extent[3] - proj_y) / y_res < 0

so that .astype(int) yields 0 instead of -1. Flooring before casting to integer solves the problem.

@sfinkens
Copy link
Member Author

Here is an example where I aggregated some GEO-LEO collocations to a regular grid and found too many observations in the first row and column:
index

@sfinkens sfinkens self-assigned this Apr 24, 2020
@sfinkens sfinkens added the bug label Apr 24, 2020
@mraspaud
Copy link
Member

Nice investigation work! tell us when you want us to review this :)

@sfinkens
Copy link
Member Author

I think it's ready for review now :)

@sfinkens
Copy link
Member Author

sfinkens commented May 4, 2020

@ameraner Since you also started working with the bucket resampler, I'd be pleased if you could review this, too (if you have time)

@ameraner
Copy link
Member

ameraner commented May 5, 2020

Looks good to me, as discussed in Slack :)

@sfinkens
Copy link
Member Author

sfinkens commented May 5, 2020

Thanks for taking the time @ameraner !

Copy link
Member

@pnuu pnuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pnuu pnuu merged commit b7c769b into pytroll:master May 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants