[BACKPORT-2.1][SPARK-19372][SQL] Fix throwing a Java exception at df.fliter() due to 64KB bytecode size limit #18942

poplav · 2017-08-14T23:29:31Z

What changes were proposed in this pull request?

This PR is a backport of #17087 to Spark 2.1.

How was this patch tested?

Added a test to DataFrameSuite.

…o 64KB bytecode size limit

When an expression for `df.filter()` has many nodes (e.g. 400), the Java bytecode for the generated Java code exceeds 64KB. This produces a Java exception, and as a result the execution fails. This PR continues execution by disabling code generation and calling `Expression.eval()` if an exception has been caught.

Add a test suite into `DataFrameSuite`

Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>

Closes apache#17087 from kiszk/SPARK-19372.
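For readers who want the shape of the fallback described above, here is a minimal, Spark-free sketch. It is not the code from #17087: every name in it (`CodegenFallbackSketch`, `compile`, `predicateFor`, `MaxNodes`, `TooLargeException`) is hypothetical, and the 100-node budget is only a stand-in for the JVM's 64KB-per-method limit. It shows the general pattern only: try the "generated" path, and if it throws, evaluate the expression tree directly.

```scala
// Hypothetical sketch: none of these names are Spark APIs.
object CodegenFallbackSketch {
  // A tiny predicate expression tree over a single Int "row".
  sealed trait Expr { def eval(row: Int): Boolean }
  final case class GreaterThan(bound: Int) extends Expr {
    def eval(row: Int): Boolean = row > bound
  }
  final case class And(left: Expr, right: Expr) extends Expr {
    def eval(row: Int): Boolean = left.eval(row) && right.eval(row)
  }

  def nodeCount(e: Expr): Int = e match {
    case GreaterThan(_) => 1
    case And(l, r)      => 1 + nodeCount(l) + nodeCount(r)
  }

  // Stand-in for the exception raised when generated code is too large.
  final class TooLargeException(msg: String) extends RuntimeException(msg)

  // Assumed node budget; it only imitates the 64KB bytecode limit.
  val MaxNodes: Int = 100

  // Stand-in for code generation: rejects expressions over the budget.
  def compile(e: Expr): Int => Boolean = {
    val n = nodeCount(e)
    if (n > MaxNodes) throw new TooLargeException(s"expression has $n nodes")
    row => e.eval(row)
  }

  final case class Predicate(run: Int => Boolean, usedCodegen: Boolean)

  // Try the compiled path first; on failure, fall back to direct eval().
  def predicateFor(e: Expr): Predicate =
    try Predicate(compile(e), usedCodegen = true)
    catch {
      case _: TooLargeException =>
        Predicate(row => e.eval(row), usedCodegen = false)
    }

  def main(args: Array[String]): Unit = {
    // 200 leaves joined by 199 And nodes: 399 nodes, over the assumed budget.
    val big: Expr =
      (1 to 200).map(i => GreaterThan(i): Expr).reduce((a, b) => And(a, b))
    val p = predicateFor(big)
    println(s"usedCodegen=${p.usedCodegen}, result for 201: ${p.run(201)}")
  }
}
```

The point of the design is that the fallback changes only how the predicate is evaluated, not what it returns, so a caller sees the same filter result either way.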

poplav · 2017-08-15T01:38:18Z

Looking back at this: I simply cherry-picked the commit from the branch, but there appears to be more to this backport.

gatorsmile · 2017-08-15T06:14:31Z

This might be too risky to be merged to 2.1.1.

kiszk · 2017-08-15T07:39:59Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogUtils.scala

@@ -123,6 +124,38 @@ object ExternalCatalogUtils {
    }
    escapePathName(col) + "=" + partitionString
  }
+
+  def prunePartitionsByFilter(


Is this method used? I think that this PR includes code that is not related to fixing the 64KB issue.
I will investigate all of the changes later.

poplav · 2017-08-15

It is used by catalyst/catalog/InMemoryCatalog.scala. I picked this up when cherry-picking your commit, which introduced other issues. I may have unnecessarily complicated it. Let me try to remove this and get back to you. Thanks.

kiszk · 2017-08-15

Thanks. I changed only seven files in #17087, so updating 14 files looks like too much.

poplav · 2017-08-15T14:16:44Z

@kiszk, I updated the PR to remove the prunePartitionsByFilter bit. Please let me know what you think now.

kiszk · 2017-08-15T18:07:08Z

@poplav It looks good.
@gatorsmile Do you think it is OK to backport now? The previous commit included unnecessary changes, too.

gatorsmile · 2017-08-15T18:58:36Z

@poplav This is not a regression from 2.0, right?

Since we might not release 2.1.2, this PR might not be merged upstream, following a discussion with @zsxwing. Maybe you can patch it in your private build.

poplav · 2017-08-15T19:28:10Z

Was this working in 2.0 in the first place? I want to get this into 2.1.1

gatorsmile · 2017-08-15T19:32:56Z

You can patch it to your forked version.

prachim-collab · 2017-10-12T06:18:07Z

@poplav Were you able to patch this PR and build successfully on top of 2.1.1?

poplav · 2017-10-13T01:34:02Z

@pmishr1 Yeah, it worked. This PR/branch needs one more commit, which is at poplav@48ea442. I would update this PR with that commit, but it doesn't look like this is going anywhere.

AmplabJenkins · 2018-07-28T05:51:38Z

Can one of the admins verify this patch?

dongjoon-hyun · 2018-09-13T17:21:43Z

Hi, @poplav . Unfortunately, this seems to be too old to be merged. Could you close this PR?

dongjoon-hyun · 2018-09-17T19:28:11Z

Gentle ping, @poplav .

dongjoon-hyun · 2018-09-27T17:41:07Z

Ping, @poplav .

poplav · 2018-09-27T17:49:34Z

Was away on vacation. Closing PR, thanks.

dongjoon-hyun · 2018-09-27T18:33:11Z

Thank you, @poplav ! :D

poplav mentioned this pull request Aug 15, 2017

[SPARK-19372][SQL] Fix throwing a Java exception at df.fliter() due to 64KB bytecode size limit #17087

Closed

kiszk reviewed Aug 15, 2017

Remove unneeded prunePartionsByFilter from backport

436ef43

poplav force-pushed the SPARK-19372-branch21 branch from 77730fb to 436ef43 Compare August 15, 2017 14:15

poplav closed this Sep 27, 2018
