[SPARK-18091] [SQL] [BACKPORT-1.6] Deep if expressions cause Generated SpecificUnsafeProjection code to exceed JVM code size limit #16146

kapilsingh5050

What changes were proposed in this pull request?

Fixes SPARK-18091, a bug in which large, deeply nested if expressions cause the generated SpecificUnsafeProjection code to exceed the JVM code size limit.

This PR changes the code generation for the if expression so that the generated code for its predicate, true-value, and false-value expressions is placed in separate methods in the codegen context. The combined code therefore never becomes too long for a single method.
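
For illustration only, here is a hand-written Java sketch of the resulting shape (not the actual generated code, and not the code in this patch). The JVM caps a single method's bytecode at 64 KB, so moving each sub-expression into its own method keeps the method that evaluates the if expression small. The class and method names (`IfSplitSketch`, `evalPredicate`, `evalTrueValue`, `evalFalseValue`) and the sample expressions are made up for the example.

```java
// Hand-written sketch of the idea; NOT Spark's real generated code.
// All names and the sample expressions are hypothetical.
final class IfSplitSketch {
    // The result is handed back through a field, so the helper methods
    // do not need to share local variables with the caller.
    private int value;

    // Stand-in for the code generated for the predicate expression.
    private boolean evalPredicate(int input) {
        return input > 10;
    }

    // Stand-in for the code generated for the true-value expression.
    private void evalTrueValue(int input) {
        value = input * 2;
    }

    // Stand-in for the code generated for the false-value expression.
    private void evalFalseValue(int input) {
        value = -input;
    }

    // The if expression itself only dispatches to the helpers, so its own
    // body stays small however large the three sub-expressions are.
    int eval(int input) {
        if (evalPredicate(input)) {
            evalTrueValue(input);
        } else {
            evalFalseValue(input);
        }
        return value;
    }
}
```

With these sample expressions, `new IfSplitSketch().eval(20)` takes the true branch and `eval(3)` takes the false branch. In a deeply nested if expression each helper would itself contain a nested if that is split the same way, which is why no single method accumulates the whole tree.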

How was this patch tested?

Added a unit test. Also tested manually with the application in which the issue was first identified, which has transformations similar to those in the unit test.

@cloud-fan
Contributor

ok to test

@SparkQA

SparkQA commented Dec 6, 2016

Test build #69707 has finished for PR 16146 at commit 8672343.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

Hi @kapilsingh5050, can you also include #16244? It fixes the Maven tests.

@kapilsingh5050
Author

Yes, I'll do that, but the test failures here are different. I have yet to figure out their root cause.

Davies Liu and others added 2 commits December 12, 2016 14:48
…with nested wide schema

With a wide schema, the expressions for the fields are split into multiple functions, but the loopVar variables can't be accessed in the split functions. This PR changes them to class members.

Added regression test.

Author: Davies Liu <davies@databricks.com>

Closes apache#12338 from davies/nested_row.

Conflicts:
	sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
## What changes were proposed in this pull request?

After apache#15620, all of the Maven-based 2.0 Jenkins jobs time out consistently. As I pointed out in apache#15620 (comment), it seems that the regression test is overkill and may hit the constant pool size limit, which is a known issue that hasn't been fixed yet.

Since apache#15620 only fixes the code size limit problem, we can simplify the test to avoid hitting the constant pool size limit.

## How was this patch tested?

Test-only change.

Author: Wenchen Fan <wenchen@databricks.com>

Closes apache#16244 from cloud-fan/minor.
@kapilsingh5050
Author

The ExpressionEncoderSuite and RowEncoderSuite tests were failing because the following fix was missing from branch-1.6:
372baf0

@SparkQA

SparkQA commented Dec 12, 2016

Test build #70015 has finished for PR 16146 at commit 73f1231.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kapilsingh5050 kapilsingh5050 force-pushed the SPARK-18091-IfCodegenFix-BACKPORT-1.6 branch from 73f1231 to 708c847 Compare December 15, 2016 09:56
@SparkQA

SparkQA commented Dec 15, 2016

Test build #70185 has finished for PR 16146 at commit 708c847.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kapilsingh5050
Author

kapilsingh5050 commented Dec 15, 2016

@cloud-fan The last test failure is caused by the following error:
"impossible to get artifacts when data has not been loaded. IvyNode = org.scala-lang#scala-library;2.10.3"
This post suggests changing the sbt version to 0.13.9 to fix it, and I also found an open Spark bug corresponding to this error.
Are you aware of any recent changes in the build configuration for PR builder?
Can we retry the build?

@cloud-fan
Contributor

retest this please

@SparkQA

SparkQA commented Dec 15, 2016

Test build #70189 has finished for PR 16146 at commit 708c847.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kapilsingh5050
Author

@cloud-fan please review and merge

@rezasafi
Contributor

rezasafi commented Feb 7, 2017

Hi, will this pull request be merged into branch-1.6?

@zzcclp
Contributor

zzcclp commented Feb 15, 2017

ping @cloud-fan, will this PR be merged into branch-1.6?

@maropu maropu mentioned this pull request Apr 23, 2017
maropu added a commit to maropu/spark that referenced this pull request Apr 23, 2017
@asfgit asfgit closed this in e9f9715 Apr 24, 2017
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
This PR proposes to close stale PRs. Currently, we have 400+ open PRs, and some of them are stale: either their JIRA tickets have already been closed, or their JIRA tickets do not exist (and they do not seem to be minor issues).

// Open PRs whose JIRA tickets have already been closed
Closes apache#11785
Closes apache#13027
Closes apache#13614
Closes apache#13761
Closes apache#15197
Closes apache#14006
Closes apache#12576
Closes apache#15447
Closes apache#13259
Closes apache#15616
Closes apache#14473
Closes apache#16638
Closes apache#16146
Closes apache#17269
Closes apache#17313
Closes apache#17418
Closes apache#17485
Closes apache#17551
Closes apache#17463
Closes apache#17625

// Open PRs whose JIRA tickets do not exist and that are not minor issues
Closes apache#10739
Closes apache#15193
Closes apache#15344
Closes apache#14804
Closes apache#16993
Closes apache#17040
Closes apache#15180
Closes apache#17238

N/A

Author: Takeshi Yamamuro <yamamuro@apache.org>

Closes apache#17734 from maropu/resolved_pr.

Change-Id: Id2e590aa7283fe5ac01424d30a40df06da6098b5