-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-22668][SQL] Ensure no global variables in arguments of method split by CodegenContext.splitExpressions() #19865
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -217,6 +217,18 @@ class CodegenContext { | |
splitExpressions(expressions = initCodes, funcName = "init", arguments = Nil) | ||
} | ||
|
||
/** | ||
* Return true if a given variable has been described as a global variable | ||
*/ | ||
def isDeclaredMutableState(varName: String): Boolean = { | ||
val j = varName.indexOf("[") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we don't need to deal with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see. |
||
val qualifiedName = if (j < 0) varName else varName.substring(0, j) | ||
mutableStates.find { s => | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: find -> exists |
||
val i = s._2.indexOf("[") | ||
qualifiedName == (if (i < 0) s._2 else s._2.substring(0, i)) | ||
}.isDefined | ||
} | ||
|
||
/** | ||
* Code statements to initialize states that depend on the partition index. | ||
* An integer `partitionIndex` will be made available within the scope. | ||
|
@@ -789,7 +801,8 @@ class CodegenContext { | |
* @param expressions the codes to evaluate expressions. | ||
* @param funcName the split function name base. | ||
* @param extraArguments the list of (type, name) of the arguments of the split function, | ||
* except for the current inputs like `ctx.INPUT_ROW`. | ||
* except for the current inputs like `ctx.INPUT_ROW`. Name must not be | ||
* mutable state. | ||
* @param returnType the return type of the split function. | ||
* @param makeSplitFunction makes split function body, e.g. add preparation or cleanup. | ||
* @param foldFunctions folds the split function calls. | ||
|
@@ -823,7 +836,8 @@ class CodegenContext { | |
* | ||
* @param expressions the codes to evaluate expressions. | ||
* @param funcName the split function name base. | ||
* @param arguments the list of (type, name) of the arguments of the split function. | ||
* @param arguments the list of (type, name) of the arguments of the split function. Name must | ||
* not be mutable state | ||
* @param returnType the return type of the split function. | ||
* @param makeSplitFunction makes split function body, e.g. add preparation or cleanup. | ||
* @param foldFunctions folds the split function calls. | ||
|
@@ -842,7 +856,10 @@ class CodegenContext { | |
blocks.head | ||
} else { | ||
val func = freshName(funcName) | ||
val argString = arguments.map { case (t, name) => s"$t $name" }.mkString(", ") | ||
val argString = arguments.map { case (t, name) => | ||
assert(!isDeclaredMutableState(name), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm neutral about this pr though, I feel this is not the best place for this assertion cuz this is not only the place where the suggested case happens, e.g., my pr splits aggregate functions in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it. I did not know #19082 has own split routine. While it is under review, there is no way to add assertion by this PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IMHO we'd better to check arguments in all the registered functions via There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, this problem occurs not only in split method also in normal method. Let us check this in |
||
s"$name in arguments should not be declared as a global variable") | ||
s"$t $name" }.mkString(", ") | ||
val functions = blocks.zipWithIndex.map { case (body, i) => | ||
val name = s"${func}_$i" | ||
val code = s""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's only enable this check in test environment, in case it has bugs and break production jobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, this PR uses the method only in
assert
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, even only in
assert
, it still can fail compilation, doesn't it?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about checking Utils.isTesting and throwing an exception in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it is a good way to do this at caller side since this function is valid at debug and production environments.