-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-19372][SQL] Fix throwing a Java exception at df.fliter() due to 64KB bytecode size limit #17087
Conversation
Test build #73530 has finished for PR 17087 at commit
|
@davies, could you please review this? |
Is it right to just do this in one code path? why not all similar cases where the 64k limit is exceeded? |
I already identified where 64k limit is exceeded. As a result, I think that it is not easy to fix the issue. For catching exception, you are right. I have to catch only an exception related to compilation errors. I will address it by handling nested exceptions. I will address it. |
Test build #74019 has finished for PR 17087 at commit
|
I agree with the general approach of having a fallback from code generation to interpreted evaluation, but I also agree that this feels too narrowly targeted. In particular, why do this in one operator rather than in Another thing that maybe @davies can comment on, I thought we already had this fallback implemented? So I'm curious why its not already handling this test case. Maybe there is an existing mechanism we just need to make more general. |
} | ||
} catch { | ||
// JaninoRuntimeException is in a nested exception if Java compilation error occurs | ||
case e: Exception if ExceptionUtils.getRootCause(e).isInstanceOf[JaninoRuntimeException] => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than walk the exception tree, should we just make our wrapping more specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I will make this wrapping more specific.
@marmbrus thank you for your comments. For feedback mechanism, I imagine that you are talking about this. When the whole-stage codegen is enabled, this fallback works. I agree that we should prepare more generic approach that can be applicable to more cases. However, it is not easy to implement for now. It would require some refactoring. What do you think? FYI: I have just noticed that this problem may occur at CartesianProductExec and InMemoryTableScanExec. I have to do the same thing in these two places. |
I don't think we need a complex refactoring. Why can't |
I am refactoring |
@marmbrus I have just commit the code of intermediate refactoring. Would it be possible to give comments? |
Test build #74262 has finished for PR 17087 at commit
|
There appears to have been some code drift (as Just fix InterpretedPredicate to actually return a |
Thank you for pointing out |
Test build #74310 has finished for PR 17087 at commit
|
Test build #74311 has finished for PR 17087 at commit
|
Test build #74313 has finished for PR 17087 at commit
|
Test build #74374 has finished for PR 17087 at commit
|
@@ -372,7 +374,7 @@ case class WholeStageCodegenExec(child: SparkPlan) extends UnaryExecNode with Co | |||
try { | |||
CodeGenerator.compile(cleanedSource) | |||
} catch { | |||
case e: Exception if !Utils.isTesting && sqlContext.conf.wholeStageFallback => | |||
case e: JaninoRuntimeException if !Utils.isTesting && sqlContext.conf.wholeStageFallback => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeGenerator.doCompile
catches Exception
and re-throw Exception
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can other exceptions be thrown during compiling?
If other exceptions causing compiling failed happens, I think we still need fallback to non wholestage execution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, reverted
@@ -951,10 +965,10 @@ object CodeGenerator extends Logging { | |||
evaluator.cook("generated.java", code.body) | |||
recordCompilationStats(evaluator) | |||
} catch { | |||
case e: Exception => | |||
case e: JaninoRuntimeException => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like CompileException
can be thrown from janino?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for pointing out . Now, CompileException
can be caught.
@@ -355,7 +357,21 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializ | |||
|
|||
protected def newPredicate( | |||
expression: Expression, inputSchema: Seq[Attribute]): GenPredicate = { | |||
GeneratePredicate.generate(expression, inputSchema) | |||
try { | |||
GeneratePredicate.generate(expression, inputSchema) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we only do this fallback if sqlContext.conf.wholeStageFallback
is turned on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is good to control it using an option. This is not a part of the whole-stage codegen.
Is it better to use sqlContext.conf.wholeStageFallback
or add sqlContext.conf.codegenFallback
?
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if it makes sense that wholeStageFallback
is false and this new option is true, or vice verse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, now look at sqlContext.conf.wholeStageFallback
.
@@ -32,6 +32,7 @@ import org.apache.spark.sql.types.LongType | |||
import org.apache.spark.util.ThreadUtils | |||
import org.apache.spark.util.random.{BernoulliCellSampler, PoissonSampler} | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Extra space line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, done
} | ||
logWarning(s"Codegen disabled for this expression:\n $logMessage") | ||
InterpretedPredicate.create(expression, inputSchema) | ||
case e: Exception => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, done
} | ||
|
||
class InterpretedPredicate(expression: Expression) extends GenPredicate { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change necessary?
Btw, let InterpretedPredicate
extends GenPredicate
looks weird logically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, this change is necessary to return Predicate
for InterpretedPredicate
.
For GenPredicate
, I used the same naming convention in SparkPlan.scala
. If we use Predicate
, we will have a conflict with others. What name would it be better instead of GenPredicate
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BasePredicate
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good. Done
Test build #74428 has started for PR 17087 at commit |
Jenkins, retest this please |
Test build #74435 has finished for PR 17087 at commit
|
Test build #74442 has finished for PR 17087 at commit
|
Test build #75999 has finished for PR 17087 at commit
|
@marmbrus could you please take a look? |
ping @marmbrus |
} | ||
|
||
class InterpretedPredicate(expression: Expression) extends BasePredicate { | ||
def eval(r: InternalRow): Boolean = expression.eval(r).asInstanceOf[Boolean] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: override def eval...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, done for both.
} | ||
|
||
class InterpretedPredicate(expression: Expression) extends BasePredicate { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: case class
// Cache.get() may wrap the original exception. See the following URL | ||
// http://google.github.io/guava/releases/14.0/api/docs/com/google/common/cache/ | ||
// Cache.html#get(K,%20java.util.concurrent.Callable) | ||
case e : UncheckedExecutionException => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use the following simple codes:
case e@(_: UncheckedExecutionException | _: ExecutionError) =>
throw e.getCause
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, done. thanks.
try { | ||
GeneratePredicate.generate(expression, inputSchema) | ||
} catch { | ||
case e: JaninoRuntimeException if sqlContext == null || sqlContext.conf.wholeStageFallback => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It's better to add a debug log to print e
. you can merge two cases like this:
case e@(_: JaninoRuntimeException | _: CompileException)
if sqlContext == null || sqlContext.conf.wholeStageFallback =>
logDebug(e.getMessage, e)
genInterpretedPredicate(expression, inputSchema)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, done. Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kiszk could you fix this as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zsxwing Sorry, I made mistake. Now, I pushed it.
@@ -353,9 +356,28 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializ | |||
GenerateMutableProjection.generate(expressions, inputSchema, useSubexprElimination) | |||
} | |||
|
|||
private def genInterpretedPredicate( | |||
expression: Expression, inputSchema: Seq[Attribute]): InterpretedPredicate = { | |||
val str = expression.toString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Expression.toString will truncate too big expression. Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, Expression.toString
truncates something. However, it does not work for this case. Thus, I did not change here.
case e: CompileException => | ||
val msg = s"failed to compile: $e\n$formatted" | ||
logError(msg, e) | ||
throw new CompileException(msg, e.asInstanceOf[CompileException].getLocation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use throw new CompileException(msg, e.getLocation, e)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made one pass. This looks good. Most of my comments are style issues.
Test build #76862 has finished for PR 17087 at commit
|
case e @ (_: UncheckedExecutionException | _: ExecutionError) => | ||
val excChains = ExceptionUtils.getThrowables(e) | ||
val exc = if (excChains.length == 1) excChains(0) else excChains(excChains.length - 2) | ||
throw exc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use e.getCause
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, done
Test build #76949 has finished for PR 17087 at commit
|
Test build #76972 has finished for PR 17087 at commit
|
LGTM. Thanks! Merging to master. |
…o 64KB bytecode size limit ## What changes were proposed in this pull request? When an expression for `df.filter()` has many nodes (e.g. 400), the size of Java bytecode for the generated Java code is more than 64KB. It produces an Java exception. As a result, the execution fails. This PR continues to execute by calling `Expression.eval()` disabling code generation if an exception has been caught. ## How was this patch tested? Add a test suite into `DataFrameSuite` Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com> Closes apache#17087 from kiszk/SPARK-19372.
…o 64KB bytecode size limit ## What changes were proposed in this pull request? When an expression for `df.filter()` has many nodes (e.g. 400), the size of Java bytecode for the generated Java code is more than 64KB. It produces an Java exception. As a result, the execution fails. This PR continues to execute by calling `Expression.eval()` disabling code generation if an exception has been caught. ## How was this patch tested? Add a test suite into `DataFrameSuite` Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com> Closes apache#17087 from kiszk/SPARK-19372.
Hi, All. |
…o 64KB bytecode size limit ## What changes were proposed in this pull request? When an expression for `df.filter()` has many nodes (e.g. 400), the size of Java bytecode for the generated Java code is more than 64KB. It produces an Java exception. As a result, the execution fails. This PR continues to execute by calling `Expression.eval()` disabling code generation if an exception has been caught. ## How was this patch tested? Add a test suite into `DataFrameSuite` Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com> Closes #17087 from kiszk/SPARK-19372.
…o 64KB bytecode size limit When an expression for `df.filter()` has many nodes (e.g. 400), the size of Java bytecode for the generated Java code is more than 64KB. It produces an Java exception. As a result, the execution fails. This PR continues to execute by calling `Expression.eval()` disabling code generation if an exception has been caught. Add a test suite into `DataFrameSuite` Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com> Closes apache#17087 from kiszk/SPARK-19372.
Hi, All. I am trying to get this included into Spark 2.1.1. I opened a PR #18942. |
…o 64KB bytecode size limit When an expression for `df.filter()` has many nodes (e.g. 400), the size of Java bytecode for the generated Java code is more than 64KB. It produces an Java exception. As a result, the execution fails. This PR continues to execute by calling `Expression.eval()` disabling code generation if an exception has been caught. Add a test suite into `DataFrameSuite` Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com> Closes apache#17087 from kiszk/SPARK-19372.
GeneratePredicate.generate(expression, inputSchema) | ||
} catch { | ||
case e @ (_: JaninoRuntimeException | _: CompileException) | ||
if sqlContext == null || sqlContext.conf.wholeStageFallback => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sqlContext.conf.wholeStageFallback
is almost useless here, because almost all the cases this will be done in executors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty risky if we always fallback. This might hide bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me fix it by another PR.
What changes were proposed in this pull request?
When an expression for
df.filter()
has many nodes (e.g. 400), the size of Java bytecode for the generated Java code is more than 64KB. It produces an Java exception. As a result, the execution fails.This PR continues to execute by calling
Expression.eval()
disabling code generation if an exception has been caught.How was this patch tested?
Add a test suite into
DataFrameSuite