[SPARK-21603][SQL][FOLLOW-UP] Change the default value of maxLinesPerFunction into 4000 #19021

Closed
wants to merge 1 commit into from

Conversation

maropu
Member

@maropu maropu commented Aug 22, 2017

What changes were proposed in this pull request?

This PR changes the default value of maxLinesPerFunction to 4000. #18810 introduced this option to disable code generation for overly long functions, and I found that it only affected Q17 and Q66 in TPC-DS. However, Q66 showed a performance regression:

Q17 w/o #18810, 3224ms --> Q17 w/ #18810, 2627ms (improvement)
Q66 w/o #18810, 1712ms --> Q66 w/ #18810, 3032ms (regression)

To keep the previous performance in TPC-DS, we had better set a higher default value for maxLinesPerFunction.
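
As a rough illustration of how a line-count threshold like this behaves, here is a minimal sketch. The object name `LineThresholdSketch` and both helpers are hypothetical and are not Spark's actual implementation; the 3000-line body and the two limits are illustrative values only.

```scala
object LineThresholdSketch {
  // Hypothetical helper (not Spark's code): count the non-blank lines
  // of a generated function body.
  def countLines(functionBody: String): Int =
    functionBody.split("\n").count(_.trim.nonEmpty)

  // In this sketch, code generation is kept only while the function
  // body is no longer than the configured limit.
  def keepCodegen(functionBody: String, maxLinesPerFunction: Int): Boolean =
    countLines(functionBody) <= maxLinesPerFunction

  def main(args: Array[String]): Unit = {
    val body = Seq.fill(3000)("int x = 1;").mkString("\n")
    println(keepCodegen(body, 2667)) // 3000 lines exceed a limit of 2667
    println(keepCodegen(body, 4000)) // 3000 lines fit under a limit of 4000
  }
}
```

With a higher default, fewer generated functions cross the limit, so fewer queries lose code generation.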

How was this patch tested?

Existing tests.

@maropu
Member Author

maropu commented Aug 22, 2017

I'll check performance changes in TPC-DS.

@maropu
Member Author

maropu commented Aug 22, 2017

Got the numbers; see them here. The numbers seem to be almost the same as on master just before #18810 was merged. @gatorsmile @viirya

@maropu
Member Author

maropu commented Aug 22, 2017

Also, I talked with @viirya here, and -1 seems like a more natural value for disabling this too-long-function option. If nobody strongly disagrees, I could open another trivial PR to fix that, too. @gatorsmile

@SparkQA

SparkQA commented Aug 22, 2017

Test build #80976 has finished for PR 19021 at commit d0bc6a6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

.intConf
.createWithDefault(2667)
Member

@gatorsmile gatorsmile Aug 22, 2017

Could you also try 2731 and 2049 and see the perf difference?

8K = 8192 
8192 / 3 + 1 = 2731
8192 / 2 + 1 = 4097

Member Author

I've already checked 2800; this value activated the too-long-function optimization in Q17/Q66, and Q66 had the regression. So, I think 2731 and 2049 would probably show the same regression (I've also checked 2900, which disabled this in Q17/Q66).

btw, why is the computation based on 8K? It seems the threshold of DontCompileHugeMethods is 8000?

Member

Yes, I just double-checked it. The default of HugeMethodLimit is 8000. Thanks!
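
The arithmetic in this thread uses integer division. As a small sketch (the object and the helper name `candidate` are made up for illustration), the same formula can be recomputed for both 8192 and the 8000 limit confirmed above. With 8000, `8000 / 3 + 1` gives 2667, which matches the value in the diff snippet under review.

```scala
object ThresholdArithmetic {
  // Integer division plus one, as written in the review comment.
  def candidate(limit: Int, divisor: Int): Int = limit / divisor + 1

  def main(args: Array[String]): Unit = {
    // Print the candidates for both the 8K assumption and the 8000 limit.
    for (limit <- Seq(8192, 8000); divisor <- Seq(2, 3, 4)) {
      println(s"$limit / $divisor + 1 = ${candidate(limit, divisor)}")
    }
  }
}
```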

@gatorsmile
Member

LGTM

@gatorsmile
Member

cc @rednaxelafx

@gatorsmile
Member

Thanks! Merging to master.

@asfgit asfgit closed this in 6942aee Aug 23, 2017
@maropu
Member Author

maropu commented Aug 24, 2017

Just for your info, I looked into this issue in the TPC-DS queries again; I added some code to check the actual bytecode size of these queries and found that only the gen'd functions in Q17/Q66 had too-long bytecode, over 8000:

===== TPCDS QUERY BENCHMARK OUTPUT FOR q17 =====
17/08/23 14:45:02 WARN CodeGenerator: GeneratedClass.agg_doAggregateWithKeys is too large to do JIT compilation on HotSpot; the size of agg_doAggregateWithKeys is 17665; the limit is 8000

===== TPCDS QUERY BENCHMARK OUTPUT FOR q66 =====
17/08/23 14:55:39 WARN CodeGenerator: GeneratedClass.agg_doAggregateWithKeys is too large to do JIT compilation on HotSpot; the size of agg_doAggregateWithKeys is 11012; the limit is 8000
17/08/23 14:55:39 WARN CodeGenerator: GeneratedClass.agg_doAggregateWithKeys is too large to do JIT compilation on HotSpot; the size of agg_doAggregateWithKeys is 13420; the limit is 8000
17/08/23 14:55:39 WARN CodeGenerator: GeneratedClass.agg_doAggregateWithKeys is too large to do JIT compilation on HotSpot; the size of agg_doAggregateWithKeys is 16641; the limit is 8000

BTW, why don't we check directly whether the gen'd bytecode size is over 8000, instead of using the number of code lines as in #18810? cc: @gatorsmile @viirya @kiszk

@viirya
Member

viirya commented Aug 24, 2017

@maropu That sounds like a good idea, especially since the number of lines of code may not be an intuitive thing to set for this purpose.
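
A direct size check along the lines suggested here could look like the following minimal sketch. It assumes the bytecode size of each generated method is already available as a plain integer; how to obtain it is out of scope. `BytecodeSizeCheck` and `tooLargeForJit` are hypothetical names, and the sample sizes are the ones reported in the log output above.

```scala
object BytecodeSizeCheck {
  // Limit quoted in the WARN lines above (HotSpot's HugeMethodLimit default).
  val HugeMethodLimit = 8000

  // Hypothetical helper: true when a method's bytecode size is over the limit.
  def tooLargeForJit(bytecodeSize: Int, limit: Int = HugeMethodLimit): Boolean =
    bytecodeSize > limit

  def main(args: Array[String]): Unit = {
    // Sizes of agg_doAggregateWithKeys taken from the Q17 and Q66 logs.
    val sizes = Seq(17665, 11012, 13420, 16641)
    sizes.foreach { size =>
      println(s"size=$size tooLarge=${tooLargeForJit(size)}")
    }
  }
}
```

Unlike a line-count limit, this compares against the same quantity the JIT limit is expressed in, so no conversion factor between lines and bytes has to be guessed.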
