Remove redundant distinct over group by #18512

feilong-liu · 2022-10-18T05:02:17Z

What's the change?

Add an optimization that removes DISTINCT when the corresponding output is already distinct after a GROUP BY operation.

An example is the query "SELECT DISTINCT orderpriority, SUM(totalprice) FROM orders GROUP BY orderpriority", where the DISTINCT is redundant because GROUP BY already produces one row per orderpriority.
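
To make the condition concrete, here is a minimal sketch of when a DISTINCT over a GROUP BY is redundant: GROUP BY yields one row per combination of grouping keys, so if every grouping key appears among the DISTINCT output columns, the rows are already unique. The class and method names are hypothetical and columns are modeled as plain strings; this is a toy model, not the code of the rule added in this PR.

```java
import java.util.Set;

// Illustrative only: a toy model of the redundancy check, not Presto's planner API.
final class RedundantDistinctSketch
{
    private RedundantDistinctSketch() {}

    // GROUP BY yields one row per combination of grouping keys, so any set of
    // output columns that contains all the grouping keys is already distinct.
    static boolean isDistinctRedundant(Set<String> groupingKeys, Set<String> distinctColumns)
    {
        return distinctColumns.containsAll(groupingKeys);
    }

    public static void main(String[] args)
    {
        // SELECT DISTINCT orderpriority, SUM(totalprice) FROM orders GROUP BY orderpriority
        // (the name "sum_totalprice" is a made-up label for the aggregate column)
        System.out.println(isDistinctRedundant(
                Set.of("orderpriority"),
                Set.of("orderpriority", "sum_totalprice"))); // prints true

        // DISTINCT over only some of the grouping keys can still remove rows.
        System.out.println(isDistinctRedundant(
                Set.of("orderkey", "partkey"),
                Set.of("orderkey"))); // prints false
    }
}
```

The check is deliberately conservative: it only answers "redundant" when the grouping keys survive unchanged into the DISTINCT output, and says nothing about projections or joins sitting between the two operations.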

Test plan - (Please fill in how you tested your changes)

Add unit test.

Benchmark results

SQL query: select distinct orderkey, partkey, suppkey, avg(extendedprice) from lineitem group by orderkey, partkey, suppkey
Control: 100.324 cpu ms
Test: 71.382 cpu ms

INFO: Without optimization
peak_memory:14040686,elapsed_millis:107,input_rows_per_second:563343,output_rows_per_second:563222,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:106817513,cpu_nanos:106122000,user_nanos:105276000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:14040686,elapsed_millis:121,input_rows_per_second:496354,output_rows_per_second:496247,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:121233823,cpu_nanos:116225000,user_nanos:112490000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:14040686,elapsed_millis:104,input_rows_per_second:577529,output_rows_per_second:577405,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:104193760,cpu_nanos:103269000,user_nanos:102787000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:14040686,elapsed_millis:99,input_rows_per_second:608323,output_rows_per_second:608191,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:98919466,cpu_nanos:98592000,user_nanos:98192000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:14040686,elapsed_millis:103,input_rows_per_second:586097,output_rows_per_second:585970,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:102670618,cpu_nanos:98834000,user_nanos:98230000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:14040686,elapsed_millis:226,input_rows_per_second:265874,output_rows_per_second:265816,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:226328984,cpu_nanos:101054000,user_nanos:100322000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:14040686,elapsed_millis:99,input_rows_per_second:608878,output_rows_per_second:608746,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:98829296,cpu_nanos:97290000,user_nanos:96909000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:14040686,elapsed_millis:97,input_rows_per_second:618267,output_rows_per_second:618134,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:97328389,cpu_nanos:97041000,user_nanos:96738000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:14040686,elapsed_millis:94,input_rows_per_second:639660,output_rows_per_second:639522,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:94073333,cpu_nanos:94008000,user_nanos:93760000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:14040686,elapsed_millis:91,input_rows_per_second:660615,output_rows_per_second:660472,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:91089298,cpu_nanos:90805000,user_nanos:90014000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
remove_redundant_distinct_aggregation ::  100.324 cpu ms :: 13.4MB peak memory :: in 60.2K,      0B,    600K/s,      0B/s :: out 60.2K,  2.07MB,    600K/s,  20.6MB/s

INFO: With optimization
peak_memory:7632508,elapsed_millis:74,input_rows_per_second:812957,output_rows_per_second:812781,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:74019902,cpu_nanos:69991000,user_nanos:69738000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:7632508,elapsed_millis:70,input_rows_per_second:853687,output_rows_per_second:853503,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:70488303,cpu_nanos:69767000,user_nanos:69512000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:7632508,elapsed_millis:70,input_rows_per_second:856021,output_rows_per_second:855836,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:70296139,cpu_nanos:69947000,user_nanos:69751000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:7632508,elapsed_millis:72,input_rows_per_second:839667,output_rows_per_second:839485,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:71665290,cpu_nanos:71003000,user_nanos:70721000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:7632508,elapsed_millis:69,input_rows_per_second:877072,output_rows_per_second:876882,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:68608956,cpu_nanos:68328000,user_nanos:67990000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:7632508,elapsed_millis:67,input_rows_per_second:894244,output_rows_per_second:894050,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:67291472,cpu_nanos:67286000,user_nanos:67140000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:7632508,elapsed_millis:68,input_rows_per_second:887679,output_rows_per_second:887487,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:67789139,cpu_nanos:67608000,user_nanos:67258000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:7632508,elapsed_millis:77,input_rows_per_second:783050,output_rows_per_second:782881,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:76846899,cpu_nanos:76142000,user_nanos:74259000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:7632508,elapsed_millis:85,input_rows_per_second:710566,output_rows_per_second:710412,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:84686002,cpu_nanos:80436000,user_nanos:76732000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
peak_memory:7632508,elapsed_millis:74,input_rows_per_second:815461,output_rows_per_second:815285,input_megabytes:0,input_megabytes_per_second:0,wall_nanos:73792574,cpu_nanos:73309000,user_nanos:72923000,input_rows:60175,input_bytes:0,output_rows:60162,output_bytes:2165832
remove_redundant_distinct_aggregation ::   71.382 cpu ms :: 7.28MB peak memory :: in 60.2K,      0B,    843K/s,      0B/s :: out 60.2K,  2.07MB,    843K/s,  28.9MB/s
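
The two summary figures appear to be the mean of the per-run `cpu_nanos` values in the logs above. The sketch below recomputes them and the relative saving; the class and method names are made up for illustration, and the inputs are copied from the log lines.

```java
import java.util.Locale;

// Illustrative only: recomputes the benchmark summary from the logged cpu_nanos values.
final class BenchmarkSummarySketch
{
    // cpu_nanos of the ten runs without the optimization.
    static final long[] CONTROL = {
        106122000L, 116225000L, 103269000L, 98592000L, 98834000L,
        101054000L, 97290000L, 97041000L, 94008000L, 90805000L};

    // cpu_nanos of the ten runs with the optimization.
    static final long[] TEST = {
        69991000L, 69767000L, 69947000L, 71003000L, 68328000L,
        67286000L, 67608000L, 76142000L, 80436000L, 73309000L};

    private BenchmarkSummarySketch() {}

    // Mean of per-run CPU times given in nanoseconds, returned in milliseconds.
    static double meanCpuMillis(long[] cpuNanos)
    {
        long total = 0;
        for (long nanos : cpuNanos) {
            total += nanos;
        }
        return total / 1_000_000.0 / cpuNanos.length;
    }

    // Relative saving of test over control, in percent.
    static double reductionPercent(double controlMillis, double testMillis)
    {
        return (controlMillis - testMillis) / controlMillis * 100.0;
    }

    public static void main(String[] args)
    {
        double control = meanCpuMillis(CONTROL);
        double test = meanCpuMillis(TEST);
        System.out.println(String.format(Locale.ROOT, "control: %.3f cpu ms", control));
        System.out.println(String.format(Locale.ROOT, "test:    %.3f cpu ms", test));
        System.out.println(String.format(Locale.ROOT, "saving:  %.1f%%", reductionPercent(control, test)));
    }
}
```

On these inputs the means come out at roughly 100.324 and 71.382 cpu ms, a saving of a little under 29% in CPU time.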

== RELEASE NOTES ==

General Changes
* Add an optimization which removes a redundant distinct if the output is already distinct after a group by operation.
   The optimization is controlled by the session property `remove_redundant_distinct_aggregation`, which defaults to false.

kaikalur · 2022-10-18T14:48:36Z

presto-main/src/main/java/com/facebook/presto/sql/analyzer/FeaturesConfig.java

@@ -238,6 +238,7 @@
    private String nativeExecutionExecutablePath = "./presto_server";
    private boolean randomizeOuterJoinNullKey;
    private boolean isOptimizeConditionalAggregationEnabled;
+    private boolean isRemoveRedundantDistinctAggregationEnabled;


Set default to true

kaikalur · 2022-10-18T14:51:55Z

Let's make the default true, as this is a safe and general optimization.

ClarenceThreepwood · 2022-10-18T16:19:23Z

Is this functionally different from the existing rule RemoveRedundantDistinct?

kaikalur · 2022-10-18T17:43:40Z

Interesting - I thought that's coming from uniqueness constraints. But this is using the planproperties. If/when we unify these two concepts we can get rid of one of them.

highker

please rebase as well

highker · 2022-10-19T06:36:24Z

presto-main/src/main/java/com/facebook/presto/sql/planner/PlanOptimizers.java

+                                new PruneRedundantProjectionAssignments(),
+                                new InlineProjections(metadata.getFunctionAndTypeManager()),
+                                new RemoveRedundantIdentityProjections())),


Same again... do we really need to run this again? Or can we just merge RemoveRedundantDistinctAggregation into the above rule?

Always keep in mind that running the optimizer is costly.

Moved this optimizer rule and got rid of these additional projection rules.

highker · 2022-10-19T06:37:07Z

presto-main/src/main/java/com/facebook/presto/sql/analyzer/FeaturesConfig.java

@@ -238,6 +238,7 @@
    private String nativeExecutionExecutablePath = "./presto_server";
    private boolean randomizeOuterJoinNullKey;
    private boolean isOptimizeConditionalAggregationEnabled;
+    private boolean isRemoveRedundantDistinctAggregationEnabled = true;


are we sure we want to enable it by default?

Yes, this is a very general optimization that should always help. We are being very conservative and are also adding more tests to make sure it works for different patterns.

My main worry is that if there is a bug in the code, it would be a big pain to fix in prod. If we have high confidence in full correctness from the verifiers, I'm also OK either way.

Yes - we will do some targeted verifier runs as the pattern is relatively easy to look for in the logs.

The main issue is that if we add something turned off, we rarely turn it on; e.g., optimize_nulls_in_join has been there for ~2 years but we never turned it on.

Yeah, I will run the verifier tests and report here.

I got about 40 queries that trigger this optimization, with most queries showing about a 20% reduction in CPU time.

highker · 2022-10-19T06:37:32Z

...n/java/com/facebook/presto/sql/planner/optimizations/RemoveRedundantDistinctAggregation.java

+        }
+    }
+
+    private class Rewriter


kaikalur · 2022-10-19T17:11:32Z

@feilong-liu like we discussed yesterday, add more tests to the harness:

select distinct x, random() from (.. group by x)
select distinct x from (... group by x, y)
select distinct x+1 as x from (... group by x)
select distinct x from (.. group by x) AS T1 join T2 using(x)
select distinct x from (.. group by x) AS T1 join T2 using(y)

highker · 2022-10-20T16:59:29Z

Can we rebase and push?

simmend · 2022-10-20T21:33:40Z

Excuse me, but this problem is already handled in a very general way by #16416. If there are specific use cases that are not covered by that implementation, or if you have difficulty understanding it, please shoot me an email at dave@ahana.io and I will be happy to discuss. Please close this PR in the meantime. Thanks.

simmend · 2022-10-20T21:41:51Z

Quick addendum to the last comment. The work of #16416 is disabled by default (it shouldn't be but that is the only way that could get reviewers to approve). If you enable it you will likely see that the existing rule RemoveRedundantDistinct will have already done the job. In fact there is another rule RemoveRedundantAggregateDistinct that will remove distinct specification from aggregate functions if based on preexisting keys or provable max cardinality of 1. I would be thrilled if you would extend this implementation to cover any additional use cases. Again, available to discuss dave@ahana.io

kaikalur · 2022-10-21T20:19:13Z

Yes we should definitely look into integrating these properties but we currently don't have bandwidth to test it as the constraints PR is a rather big one that touches a lot of parts (also the reason to merge it disabled by default). So for now, we will get this PR in and look into unifying these things later - tracking issue: #18547

feilong-liu requested a review from a team as a code owner October 18, 2022 05:02

feilong-liu requested a review from presto-oss October 18, 2022 05:02

feilong-liu force-pushed the remove_distinct_over_group_by branch 2 times, most recently from ab20709 to e20e813 Compare October 18, 2022 05:24

feilong-liu requested a review from kaikalur October 18, 2022 05:27

feilong-liu force-pushed the remove_distinct_over_group_by branch from e20e813 to bcb24f9 Compare October 18, 2022 05:56

kaikalur requested changes Oct 18, 2022

kaikalur requested a review from rschlussel October 18, 2022 14:49

feilong-liu force-pushed the remove_distinct_over_group_by branch from bcb24f9 to bec5eea Compare October 18, 2022 19:32

kaikalur approved these changes Oct 18, 2022

kaikalur requested a review from highker October 18, 2022 21:02

feilong-liu force-pushed the remove_distinct_over_group_by branch from bec5eea to a080ac3 Compare October 18, 2022 21:26

highker approved these changes Oct 19, 2022

highker self-assigned this Oct 19, 2022

feilong-liu force-pushed the remove_distinct_over_group_by branch from a080ac3 to 463c909 Compare October 20, 2022 05:31

feilong-liu force-pushed the remove_distinct_over_group_by branch 5 times, most recently from 4fbf590 to 2cc71b9 Compare October 20, 2022 20:35

Remove redundant distinct over group by

bd1aec4

Remove distinct if the corresponding output is already distinct after a group by operation.

feilong-liu force-pushed the remove_distinct_over_group_by branch from 2cc71b9 to bd1aec4 Compare October 21, 2022 18:35

highker removed the request for review from rschlussel October 21, 2022 18:35

highker removed their assignment Oct 21, 2022

kaikalur merged commit 87f543a into prestodb:master Oct 21, 2022

wanglinsong mentioned this pull request Jan 12, 2023

Add release notes for 0.279 #18920

Merged

30 tasks
