Is your feature request related to a problem? Please describe.
#8618 added a heuristic that speeds up some really large aggregations by sorting the input data in some cases. But sort can be really expensive, especially when there is more than one key or the keys are complex types. #8620 is supposed to look at sort and hopefully come up with a cost model for it. This issue is to see if there are ever cases where hash re-partitioning the data (similar to what we do for join) would be better than sorting. As a part of my work for #8618 I wrote a quick hash re-partitioning implementation for testing. It was not hard to do, but I could not find very many cases where it improved things (although I only tested with single column keys that were not too complex).
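
For illustration only, here is a minimal Python sketch of the two strategies being compared: aggregating after a sort versus aggregating after a hash re-partition. It is not the code from #8618, and all function names (`hash_repartition`, `repartitioned_sum`, `sorted_sum`) are hypothetical. It assumes rows are simple `(key, value)` pairs held in memory and that the aggregation is a sum.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def hash_repartition(rows, num_partitions):
    """Split (key, value) rows into buckets so equal keys share a bucket."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in rows:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def aggregate_sum(rows):
    """Plain hash aggregation over one batch of rows."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


def repartitioned_sum(rows, num_partitions=4):
    """Aggregate each hash partition independently, then combine.

    No key appears in more than one partition, so the per-partition
    results can be merged without any further aggregation.
    """
    result = {}
    for part in hash_repartition(rows, num_partitions):
        result.update(aggregate_sum(part))
    return result


def sorted_sum(rows):
    """Sort by key first, then aggregate each run of equal keys in one pass."""
    ordered = sorted(rows, key=itemgetter(0))
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
print(repartitioned_sum(rows) == sorted_sum(rows))
```

The sketch only shows that both approaches produce the same result. It says nothing about relative cost, which is the open question here: the sort path pays for key comparisons, while the re-partition path pays for hashing and an extra pass over the data.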