Data partitioning #226
Comments
I think one reason this is taking longer than expected is the (still outstanding) Athena bug where summary statistics on a nested float column (our bbox column) return incorrect results. More here: #1 (reply in thread). As a result, the table currently has the use of statistics disabled, which causes longer run times and more data scanned. @mojodna We should consider just switching the bbox column back to doubles.
@jwass is there a special reason that there is no S2 or H3 partitioning?
Is there another way to get around that problem? I currently see no way to use any geospatial indexing with AWS Athena, which makes Overture useless in scenarios where you only want to read a small portion of the data. For example, I have already spent hundreds of dollars on Athena costs just to load a couple of hundred building polygons via Overture.
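To make the idea of a spatial partition key concrete, here is a minimal sketch. It uses a plain fixed-size lat/lon grid as a stand-in, not S2 or H3 themselves, and the function name `grid_cell` and the 1-degree cell size are hypothetical choices for illustration only.

```python
import math


def grid_cell(lon, lat, cell_deg=1.0):
    """Return a coarse grid-cell key for a point (a stand-in for an S2/H3 cell id)."""
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates out of range")
    # Shift to non-negative ranges, then bucket into cells of cell_deg degrees.
    col = math.floor((lon + 180.0) / cell_deg)
    row = math.floor((lat + 90.0) / cell_deg)
    return f"{col}_{row}"


# Example: a point in Berlin falls into a single 1-degree cell.
print(grid_cell(13.4, 52.5))
```

If data were partitioned on a key like this, a query for a small area would only need to read the partitions whose cells overlap that area.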
I have a query of the following kind in AWS Athena. It takes about 12-13 seconds to run and scans over 20 GB of data, which is too slow for my use case. I would like to partition by a division, for example by country, but some rows, in particular in the following location, are missing division-related data entirely.
Is there any other way I could make the query run faster?
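One option that may be worth trying, as a rough sketch: restrict the query with a bounding-box predicate on the bbox column. This only reduces data scanned if Athena can use statistics on that column, which the bug mentioned earlier in this thread may prevent. The table name `overture.buildings` and the field names `bbox.xmin`, `bbox.xmax`, `bbox.ymin`, `bbox.ymax` are assumptions here, not confirmed by this thread.

```python
def bbox_query(table, west, south, east, north, columns=("id",)):
    """Build an Athena SQL string for rows whose bbox intersects the window."""
    # Coerce to float so only numeric literals end up in the SQL text.
    west, south, east, north = (float(v) for v in (west, south, east, north))
    if west > east or south > north:
        raise ValueError("empty bounding box")
    cols = ", ".join(columns)
    # Two boxes intersect when each one starts before the other ends, per axis.
    return (
        f"SELECT {cols} FROM {table} "
        f"WHERE bbox.xmin <= {east!r} AND bbox.xmax >= {west!r} "
        f"AND bbox.ymin <= {north!r} AND bbox.ymax >= {south!r}"
    )


# Hypothetical table name; prints the SQL to paste into the Athena console.
print(bbox_query("overture.buildings", 13.3, 52.4, 13.5, 52.6))
```

Selecting only the columns you need also helps, since Athena bills by bytes scanned and Parquet is columnar.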