-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-8117] [SQL] Push codegen implementation into each Expression #6690
Conversation
Code gen code review.
cc @davies I have a big followup pr to move a lot of expression test code for better grouping. |
Test build #34388 has finished for PR 6690 at commit
|
@rxin Should we merge this one first? |
Yes -- except it is failing tests :( |
cc @liancheng why is partitioning suites failing? |
Jenkins, retest this please. |
Test build #34394 has finished for PR 6690 at commit
|
@liancheng The failure is due to double type vs float type under the hood. What I don't get is that how come this code was passing before?! Should we parse 1.5 as double type, or float type? |
I filed #6692 to always use DoubleType. It's much simpler to have only one type for floating point numbers, rather than having to reason about float vs double. |
Test build #34395 has finished for PR 6690 at commit
|
@rxin The test failure was because this PR overrides
|
It's a problem of the test case that manifests from this. We shouldn't use a double value in the place where we intend to have a float value. Let's be more careful with that in the future. |
This PR move codegen implementation of expressions into Expression class itself, make it easy to manage. It introduces two APIs in Expression: ``` def gen(ctx: CodeGenContext): GeneratedExpressionCode def genCode(ctx: CodeGenContext, ev: GeneratedExpressionCode): Code ``` gen(ctx) will call genSource(ctx, ev) to generate Java source code for the current expression. A expression needs to override genSource(). Here are the types: ``` type Term String type Code String /** * Java source for evaluating an [[Expression]] given a [[Row]] of input. */ case class GeneratedExpressionCode(var code: Code, nullTerm: Term, primitiveTerm: Term, objectTerm: Term) /** * A context for codegen, which is used to bookkeeping the expressions those are not supported * by codegen, then they are evaluated directly. The unsupported expression is appended at the * end of `references`, the position of it is kept in the code, used to access and evaluate it. */ class CodeGenContext { /** * Holding all the expressions those do not support codegen, will be evaluated directly. */ val references: Seq[Expression] = new mutable.ArrayBuffer[Expression]() } ``` This is basically apache#6660, but fixed style violation and compilation failure. Author: Davies Liu <davies@databricks.com> Author: Reynold Xin <rxin@databricks.com> Closes apache#6690 from rxin/codegen and squashes the following commits: e1368c2 [Reynold Xin] Fixed tests. 73db80e [Reynold Xin] Fixed compilation failure. 19d6435 [Reynold Xin] Fixed style violation. 9adaeaf [Davies Liu] address comments f42c732 [Davies Liu] improve coverage and tests bad6828 [Davies Liu] address comments e03edaa [Davies Liu] consts fold 86fac2c [Davies Liu] fix style 02262c9 [Davies Liu] address comments b5d3617 [Davies Liu] Merge pull request apache#5 from rxin/codegen 48c454f [Reynold Xin] Some code gen update. 2344bc0 [Davies Liu] fix test 12ff88a [Davies Liu] fix build c5fb514 [Davies Liu] rename 8c6d82d [Davies Liu] update docs b145047 [Davies Liu] fix style e57959d [Davies Liu] add type alias 3ff25f8 [Davies Liu] refactor 593d617 [Davies Liu] pushing codegen into Expression
This PR move codegen implementation of expressions into Expression class itself, make it easy to manage.
It introduces two APIs in Expression:
gen(ctx) will call genSource(ctx, ev) to generate Java source code for the current expression. A expression needs to override genSource().
Here are the types:
This is basically #6660, but fixed style violation and compilation failure.