[SPARK-26664][SQL] Make DecimalType's minimum adjusted scale configurable #23587
Changes from all commits
```diff
@@ -1347,6 +1347,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)

+  val DECIMAL_OPERATIONS_MINIMUM_ADJUSTED_SCALE =
+    buildConf("spark.sql.decimalOperations.minimumAdjustedScale")
+      .internal()
+      .doc("Decimal operations' minimum adjusted scale when " +
+        "spark.sql.decimalOperations.allowPrecisionLoss is true")
+      .intConf
+      .checkValue(scale => scale >= 0 && scale < 38, "valid scale should be in [0, 38)")
+      .createWithDefault(org.apache.spark.sql.types.DecimalType.DEFAULT_MINIMUM_ADJUSTED_SCALE)
+
   val LITERAL_PICK_MINIMUM_PRECISION =
     buildConf("spark.sql.legacy.literal.pickMinimumPrecision")
       .internal()
```
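The `checkValue` guard above restricts the setting to the range `[0, 38)`. As a standalone illustration (plain Scala, not Spark code — the helper name `isValidMinimumAdjustedScale` is hypothetical), the predicate behaves like this:

```scala
// Re-statement of the checkValue predicate from the config above:
// a minimum adjusted scale is valid only in [0, 38); MAX_SCALE (38)
// itself is excluded. Hypothetical helper, not part of Spark's API.
def isValidMinimumAdjustedScale(scale: Int): Boolean =
  scale >= 0 && scale < 38
```

So the default of 6 passes validation, while values such as 38 or -1 would be rejected when the config is set.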
```diff
@@ -2002,6 +2011,9 @@ class SQLConf extends Serializable with Logging {

   def decimalOperationsAllowPrecisionLoss: Boolean = getConf(DECIMAL_OPERATIONS_ALLOW_PREC_LOSS)

+  def decimalOperationsMinimumAdjustedScale: Int =
+    getConf(DECIMAL_OPERATIONS_MINIMUM_ADJUSTED_SCALE)
+
   def literalPickMinimumPrecision: Boolean = getConf(LITERAL_PICK_MINIMUM_PRECISION)

   def continuousStreamingExecutorQueueSize: Int = getConf(CONTINUOUS_STREAMING_EXECUTOR_QUEUE_SIZE)
```

Review thread on `decimalOperationsMinimumAdjustedScale`:

> I remember a comment by @rxin about such a pattern: it is a bit of overhead/overkill when we use this method only once. So I'd rather remove this and inline it where needed.

> Sure, let me update it.
`sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala`
```diff
@@ -24,6 +24,7 @@ import scala.reflect.runtime.universe.typeTag
 import org.apache.spark.annotation.Stable
 import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.expressions.{Expression, Literal}
+import org.apache.spark.sql.internal.SQLConf

 /**
  * The data type representing `java.math.BigDecimal` values.
```
```diff
@@ -117,7 +118,7 @@ object DecimalType extends AbstractDataType {
   val MAX_SCALE = 38
   val SYSTEM_DEFAULT: DecimalType = DecimalType(MAX_PRECISION, 18)
   val USER_DEFAULT: DecimalType = DecimalType(10, 0)
-  val MINIMUM_ADJUSTED_SCALE = 6
+  val DEFAULT_MINIMUM_ADJUSTED_SCALE = 6

   // The decimal types compatible with other numeric types
   private[sql] val BooleanDecimal = DecimalType(1, 0)
```
```diff
@@ -153,6 +154,10 @@ object DecimalType extends AbstractDataType {
     DecimalType(min(precision, MAX_PRECISION), min(scale, MAX_SCALE))
   }

+  def minimumAdjustedScale: Int = {
+    SQLConf.get.decimalOperationsMinimumAdjustedScale
+  }
+
   /**
    * Scale adjustment implementation is based on Hive's one, which is itself inspired to
    * SQLServer's one. In particular, when a result precision is greater than
```

Review thread on `minimumAdjustedScale`:

> nit: I think we can remove the parentheses...

> Will do

> Hmm, I went in and updated a few things, but all the other functions around this line use braces, and I'd like the code style to be consistent at least locally within a file. Is it okay with you if I keep the braces here?
```diff
@@ -176,9 +181,9 @@ object DecimalType extends AbstractDataType {
     } else {
       // Precision/scale exceed maximum precision. Result must be adjusted to MAX_PRECISION.
       val intDigits = precision - scale
-      // If original scale is less than MINIMUM_ADJUSTED_SCALE, use original scale value; otherwise
-      // preserve at least MINIMUM_ADJUSTED_SCALE fractional digits
-      val minScaleValue = Math.min(scale, MINIMUM_ADJUSTED_SCALE)
+      // If original scale is less than minimumAdjustedScale, use original scale value; otherwise
+      // preserve at least minimumAdjustedScale fractional digits
+      val minScaleValue = Math.min(scale, minimumAdjustedScale)
       // The resulting scale is the maximum between what is available without causing a loss of
       // digits for the integer part of the decimal and the minimum guaranteed scale, which is
       // computed above
```
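The adjustment rule in the hunk above can be sketched as a self-contained function. This is a simplified re-derivation under the constants shown in this diff (`MAX_PRECISION = 38`, default minimum adjusted scale 6), not Spark's actual implementation; `AdjustSketch` and `adjust` are hypothetical names used only for illustration.

```scala
// Standalone sketch of the scale-adjustment rule from the diff above.
// When a result's precision exceeds MAX_PRECISION, integer digits are
// preserved first and the scale absorbs the loss, but the scale never
// drops below min(original scale, minimumAdjustedScale).
object AdjustSketch {
  val MAX_PRECISION = 38
  val DEFAULT_MINIMUM_ADJUSTED_SCALE = 6

  // Returns the adjusted (precision, scale) pair.
  def adjust(precision: Int, scale: Int,
             minimumAdjustedScale: Int = DEFAULT_MINIMUM_ADJUSTED_SCALE): (Int, Int) = {
    if (precision <= MAX_PRECISION) {
      (precision, scale)
    } else {
      val intDigits = precision - scale
      val minScaleValue = math.min(scale, minimumAdjustedScale)
      // Keep whatever scale fits after the integer digits, but guarantee
      // at least minScaleValue fractional digits.
      val adjustedScale = math.max(MAX_PRECISION - intDigits, minScaleValue)
      (MAX_PRECISION, adjustedScale)
    }
  }
}
```

For example, multiplying `decimal(38, 10)` by `decimal(9, 0)` naively requires `decimal(48, 10)` (precision p1 + p2 + 1, scale s1 + s2), which overflows 38 digits; the sketch adjusts it to `(38, 6)` with the default, and to `(38, 10)` if the minimum adjusted scale were raised to 10 — which is exactly the trade-off the new config exposes.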
Review thread on the config's `.doc` description:

> Can we improve this description to explain clearly what it means? A user may be confused if they are not familiar with it.

> Yes. I can probably paraphrase the code comments in
> spark/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala
> Line 156 in 630e25e

> @mgaido91 how about this:

> I think this is still hacky and it does not resolve the root issue. Let us think about it and see whether we have a better solution. I changed the target version to 3.0. We will revisit this before the code freeze of 3.0. If we do not have a solution by then, we can review this again.

> I am not sure which problem this PR was opened for. I think this value can be made a config, as there may be specific use cases where it can be tuned properly, though I can't think of a specific one.
>
> What do you mean by "the root issue"? AFAIK, there is only one main issue with decimal operations right now, which is division when a decimal with a negative scale is involved. For that problem, #22450 is waiting for reviews, after positive feedback on the approach in the mailing-list discussion.