Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BACKPORT-2.1][SPARK-19372][SQL] Fix throwing a Java exception at df.fliter() due to 64KB bytecode size limit #18942

Closed
wants to merge 2 commits into from

Conversation

poplav
Copy link

@poplav poplav commented Aug 14, 2017

What changes were proposed in this pull request?

This PR is backport of #17087 to Spark 2.1

How was this patch tested?

Add a test suite into DataFrameSuite

…o 64KB bytecode size limit

When an expression for `df.filter()` has many nodes (e.g. 400), the size of Java bytecode for the generated Java code is more than 64KB. It produces an Java exception. As a result, the execution fails.
This PR continues to execute by calling `Expression.eval()` disabling code generation if an exception has been caught.

Add a test suite into `DataFrameSuite`

Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>

Closes apache#17087 from kiszk/SPARK-19372.
@poplav
Copy link
Author

poplav commented Aug 15, 2017

Looking back at this. I simply cherry picked the commit from the branch, there appears to be more to this backport.

@gatorsmile
Copy link
Member

This might be too risky to be merged to 2.1.1.

@@ -123,6 +124,38 @@ object ExternalCatalogUtils {
}
escapePathName(col) + "=" + partitionString
}

def prunePartitionsByFilter(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method used? I think that this PR includes code that are not related to fixing 64KB issue.
I will investigate all of changes later.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used by the catalyst/catalog/InMemoryCatalog.scala. I picked this up from cherry picking your commit, which introduced other issues. I may have unnecessarily complicated it. Let me try and remove this and get back to you. Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I changed only seven files in #17087. To update 14 files looks too much.

@poplav poplav force-pushed the SPARK-19372-branch21 branch from 77730fb to 436ef43 Compare August 15, 2017 14:15
@poplav
Copy link
Author

poplav commented Aug 15, 2017

@kiszk , I updated the PR to remove the prunePartionsByFilter bit. Please let me know now.

@kiszk
Copy link
Member

kiszk commented Aug 15, 2017

@poplav it looks good
@gatorsmile Do you think it is ok for backport now? The previous commit included unnecessary changes, too.

@gatorsmile
Copy link
Member

@poplav This is not a regression from 2.0, right?

Since we might not release 2.1.2, this PR might not be merged to upstream after a discussion with @zsxwing Maybe you can patch it in your private build.

@poplav
Copy link
Author

poplav commented Aug 15, 2017

Was this working in 2.0 in the first place? I want to get this into 2.1.1

@gatorsmile
Copy link
Member

You can patch it to your forked version.

@prachim-collab
Copy link

@poplav were you able to patch this PR and build successfully on top of 2.1.1 ?

@poplav
Copy link
Author

poplav commented Oct 13, 2017

@pmishr1 Yeah it worked this PR/branch needs one more commit that is at poplav@48ea442. I would update this PR with that commit, but doesn't look like this is going anywhere

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@dongjoon-hyun
Copy link
Member

Hi, @poplav . Unfortunately, this seems to be too old to be merged. Could you close this PR?

@dongjoon-hyun
Copy link
Member

Gentle ping, @poplav .

@dongjoon-hyun
Copy link
Member

Ping, @poplav .

@poplav
Copy link
Author

poplav commented Sep 27, 2018

Was away on vacation. Closing PR, thanks.

@poplav poplav closed this Sep 27, 2018
@dongjoon-hyun
Copy link
Member

Thank you, @poplav ! :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants