[BACKPORT-2.1][SPARK-19372][SQL] Fix throwing a Java exception at df.filter() due to 64KB bytecode size limit #18942
…o 64KB bytecode size limit

When an expression for `df.filter()` has many nodes (e.g. 400), the Java bytecode for the generated Java code is larger than 64KB. This produces a Java exception and, as a result, the execution fails.

This PR continues execution by disabling code generation and calling `Expression.eval()` if an exception has been caught.

Add a test suite into `DataFrameSuite`

Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>

Closes apache#17087 from kiszk/SPARK-19372.
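The fallback described in the commit message above can be sketched in plain Scala without Spark. This is only an illustration of the pattern (try the generated path, fall back to interpreted `eval()` on failure), not the code from this PR: `Expr`, `EqualTo`, `Or`, `generate`, `newPredicate`, `wideFilter`, `GeneratedCodeTooLarge`, and the `MaxCompilableNodes` limit are all hypothetical stand-ins, and the node-count check merely simulates the 64KB bytecode limit.

```scala
import scala.util.control.NonFatal

object CodegenFallbackSketch {
  type Row = Map[String, Int]

  // Hypothetical miniature expression tree; real Spark expressions are far richer.
  sealed trait Expr {
    def eval(row: Row): Boolean // interpreted evaluation
    def nodeCount: Int
  }
  final case class EqualTo(column: String, value: Int) extends Expr {
    def eval(row: Row): Boolean = row.get(column).contains(value)
    def nodeCount: Int = 1
  }
  final case class Or(left: Expr, right: Expr) extends Expr {
    def eval(row: Row): Boolean = left.eval(row) || right.eval(row)
    def nodeCount: Int = 1 + left.nodeCount + right.nodeCount
  }

  // Assumption: stands in for the exception raised when generated code is too large.
  final class GeneratedCodeTooLarge(msg: String) extends RuntimeException(msg)

  // Assumption: an arbitrary limit that simulates the 64KB bytecode ceiling.
  val MaxCompilableNodes = 100

  // Simulated code generation: fails for large trees, otherwise returns a predicate.
  def generate(expr: Expr): Row => Boolean = {
    if (expr.nodeCount > MaxCompilableNodes)
      throw new GeneratedCodeTooLarge(s"expression has ${expr.nodeCount} nodes")
    row => expr.eval(row)
  }

  // Try the generated path first; on failure, fall back to interpreted eval().
  // The Boolean reports whether the generated path was used.
  def newPredicate(expr: Expr): (Row => Boolean, Boolean) =
    try (generate(expr), true)
    catch { case NonFatal(_) => ((row: Row) => expr.eval(row), false) }

  // Builds a filter of the form id = 1 OR id = 2 OR ... OR id = n.
  def wideFilter(n: Int): Expr =
    (1 to n).map(i => EqualTo("id", i): Expr).reduce(Or(_, _))
}
```

With these definitions, `newPredicate(wideFilter(400))` takes the fallback branch and still returns a working predicate, while a small expression such as `EqualTo("id", 1)` stays on the generated path.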
Looking back at this: I simply cherry-picked the commit from the branch, but there appears to be more to this backport.
This might be too risky to be merged to 2.1.1.
@@ -123,6 +124,38 @@ object ExternalCatalogUtils {
}
escapePathName(col) + "=" + partitionString
}

def prunePartitionsByFilter(
Is this method used? I think that this PR includes code that is not related to fixing the 64KB issue.
I will investigate all of the changes later.
It is used by catalyst/catalog/InMemoryCatalog.scala. I picked this up from cherry-picking your commit, which introduced other issues. I may have unnecessarily complicated it. Let me try to remove this and get back to you. Thanks.
Thanks. I changed only seven files in #17087. Updating 14 files looks like too much.
77730fb to 436ef43
@kiszk, I updated the PR to remove the
@poplav it looks good
Was this working in 2.0 in the first place? I want to get this into 2.1.1.
You can patch it to your forked version.
@poplav were you able to patch this PR and build successfully on top of 2.1.1?
@pmishr1 Yeah, it worked. This PR/branch needs one more commit, which is at poplav@48ea442. I would update this PR with that commit, but it doesn't look like this is going anywhere.
Can one of the admins verify this patch?
Hi, @poplav. Unfortunately, this seems to be too old to be merged. Could you close this PR?
Gentle ping, @poplav.
Ping, @poplav.
Was away on vacation. Closing PR, thanks.
Thank you, @poplav! :D
What changes were proposed in this pull request?
This PR is a backport of #17087 to Spark 2.1.
How was this patch tested?
Add a test suite into `DataFrameSuite`.