-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
speed up tile download with concurrent processing #182
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome and glad it will speed things up considerably. A few notes:
- the thread count should be configurable via CLI, let's add it as a flag (not in the config.json since it isn't data related). Note that threads != CPU cores, so you can try like ~50 and see if you get faster downloads (my guess is somewhere between 20-50 is the fastest)
- because we don't use the result of the function anywhere, we don't need to use
.as_completed
or.result
. I'd recommend using shutdown or wait and making sure that the image function itself logs failures
Re: supertile child download: that function will already be happening inside the |
the download time for ~400 tiles with thread count of 50 is ~4 seconds! |
label_maker/images.py
Outdated
image_function(tile, imagery, tiles_dir, kwargs) | ||
t = time.perf_counter() | ||
with concurrent.futures.ThreadPoolExecutor(max_workers=threadcount) as executor: | ||
{executor.submit(image_function, tile, imagery, tiles_dir, kwargs): tile for tile in tiles} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be a list comprehension instead of dict
@martham93 sweet; looks better. Last two things: (1) make the default threads ~10 and then people can up it if they want and (2) either remove the time elapsed or print it with nicer formatting |
awesome thanks @drewbo all of these should be addressed now |
to address issue #12
cc @drewbo