
[SPARK-45905][SQL] Least common type between decimal types should retain integral digits first #43781

Closed
wants to merge 9 commits into master

Conversation

@cloud-fan (Contributor):

What changes were proposed in this pull request?

This is kind of a followup of #20023.

It's simply wrong to cut the decimal precision to 38 when a wider decimal type exceeds the max precision: doing so drops integral digits and makes the decimal value very likely to overflow.

In #20023, we fixed this issue for arithmetic operations, but many other operators suffer from the same issue: Union, binary comparison, IN subquery, create_array, coalesce, etc.

This PR fixes all the remaining operators, without the min scale limitation, which should apply to division and multiplication only, according to the SQL Server doc: https://learn.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-ver15
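
A minimal reproducer sketch in spark-shell (the exact result types here are assumptions based on this description, not verified output from the patch):

// decimal(38, 0) union decimal(38, 38): the exact wider type needs precision 76,
// which exceeds the max of 38. Previously the result was cut to decimal(38, 38),
// dropping all integral digits, so even the value 1 could no longer fit.
// With this change the scale is reduced instead (e.g. to decimal(38, 0)),
// retaining the integral digits.
val df1 = spark.sql("SELECT CAST(1 AS DECIMAL(38, 0)) AS v")
val df2 = spark.sql("SELECT CAST(0.5 AS DECIMAL(38, 38)) AS v")
df1.union(df2).printSchema()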

Why are the changes needed?

To produce a reasonable wider decimal type.

Does this PR introduce any user-facing change?

Yes. The result data type of these operators will change if it is a decimal type whose precision exceeds the max precision and whose scale is not 0.

How was this patch tested?

Updated tests.

Was this patch authored or co-authored using generative AI tooling?

No

The github-actions bot added the SQL label on Nov 13, 2023.
@dongjoon-hyun (Member) left a comment:

Could you re-trigger the failed CI test pipeline, @cloud-fan?

Also, cc @kazuyukitanimura.

@@ -5425,7 +5434,7 @@ class SQLConf extends Serializable with Logging with SqlApiConf {
   }
 
   def legacyRaiseErrorWithoutErrorClass: Boolean =
-      getConf(SQLConf.LEGACY_RAISE_ERROR_WITHOUT_ERROR_CLASS)
+    getConf(SQLConf.LEGACY_RAISE_ERROR_WITHOUT_ERROR_CLASS)
Contributor:

noise? (whitespace changes best made in a non-bugfix PR?)

@cloud-fan (Author):

Since I touched this file, I just fixed the wrong indentation.

| Operation | Result precision                        | Result scale        |
|-----------|-----------------------------------------|---------------------|
| e1 - e2   | max(s1, s2) + max(p1 - s1, p2 - s2) + 1 | max(s1, s2)         |
| e1 * e2   | p1 + p2 + 1                             | s1 + s2             |
| e1 / e2   | p1 - s1 + s2 + max(6, s1 + p2 + 1)      | max(6, s1 + p2 + 1) |
| e1 % e2   | min(p1 - s1, p2 - s2) + max(s1, s2)     | max(s1, s2)         |
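
As a worked check of the division row (operand types chosen for illustration): for decimal(10, 2) / decimal(5, 3), the result precision is p1 - s1 + s2 + max(6, s1 + p2 + 1) = 10 - 2 + 3 + max(6, 8) = 19 and the result scale is max(6, s1 + p2 + 1) = 8, i.e. decimal(19, 8), before any truncation to the max precision of 38.
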
Contributor:

AFAIK, the arithmetic operations did not strictly follow this rule.

@cloud-fan (Author):

Which one does not follow it? The final decimal type can differ, as there is one more truncation step.

Contributor:

* and /. For example:

val a = Decimal(100)  // precision: 10, scale: 0
val b = Decimal(-100) // precision: 10, scale: 0
val c = a * b         // Decimal(-10000), precision: 5, scale: 0

@cloud-fan (Author):

This is not the Spark SQL multiplication. Please take a look at Multiply#resultDecimalType.
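
For reference, a minimal sketch of the * row as the table above defines it (the helper name is hypothetical; the real logic lives in Multiply#resultDecimalType and includes a further truncation step):

// Declared result type of e1 * e2 per the table:
// precision = p1 + p2 + 1, scale = s1 + s2.
// For the example above, decimal(10, 0) * decimal(10, 0) gives decimal(21, 0);
// the runtime value -10000 only needs 5 digits, but the declared type follows
// the formula, not the value.
def multiplyResultType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) =
  (p1 + p2 + 1, s1 + s2)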

    } else {
      // If we have to reduce the precision, we should retain the digits in the integral part first,
      // as they are more significant to the value. Here we reduce the scale as well to drop the
      // digits in the fractional part.
Contributor:

Looks good.

@cloud-fan (Author):

cc @gengliangwang @viirya @wangyum

@wangyum (Member) left a comment:

LGTM.

@dongjoon-hyun (Member) left a comment:

Could you take a look at the following failure too? I'm wondering if it's related or not.

[info] - array contains function *** FAILED *** (321 milliseconds)
[info]   Expected exception org.apache.spark.sql.AnalysisException to be thrown, but no exception was thrown (DataFrameFunctionsSuite.scala:1534)

@@ -64,7 +65,11 @@ object DecimalPrecision extends TypeCoercionRule {
   def widerDecimalType(p1: Int, s1: Int, p2: Int, s2: Int): DecimalType = {
     val scale = max(s1, s2)
     val range = max(p1 - s1, p2 - s2)
-    DecimalType.bounded(range + scale, scale)
+    if (conf.getConf(SQLConf.LEGACY_RETAIN_FRACTION_DIGITS_FIRST)) {
+      DecimalType.bounded(range + scale, scale)
@gengliangwang (Member) commented on Nov 14, 2023:

There are many usages of DecimalType.bounded. Why do we only change the behavior here?

@cloud-fan (Author):

To limit the scope to type coercion only. Some arithmetic operations also call it to determine the result decimal type and I don't want to change that part.
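
For context, a sketch of what the full method could look like with both branches. The non-legacy arm is an assumption reconstructed from the PR description (the diff above is truncated), and it assumes the imports already present in DecimalPrecision.scala (scala.math max/min, SQLConf):

def widerDecimalType(p1: Int, s1: Int, p2: Int, s2: Int): DecimalType = {
  val scale = max(s1, s2)
  val range = max(p1 - s1, p2 - s2)
  if (conf.getConf(SQLConf.LEGACY_RETAIN_FRACTION_DIGITS_FIRST)) {
    // Legacy: cap the precision at 38, silently dropping integral digits.
    DecimalType.bounded(range + scale, scale)
  } else {
    // New: keep all `range` integral digits and shrink the scale (down to 0)
    // until the type fits into the max precision of 38.
    val newScale = max(0, min(scale, DecimalType.MAX_PRECISION - range))
    DecimalType(min(range + newScale, DecimalType.MAX_PRECISION), newScale)
  }
}

With this shape, decimal(38, 0) vs decimal(38, 38) widens to decimal(38, 0) instead of the legacy decimal(38, 38).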

@@ -4541,6 +4541,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val LEGACY_RETAIN_FRACTION_DIGITS_FIRST =
+    buildConf("spark.sql.legacy.decimalLeastCommonType.retainFractionDigitsFirst")
Member:

TBH the conf name is a bit long. How about spark.sql.legacy.decimal.retainFractionDigitsOnTruncate? But the naming really depends on the scope of the behavior change, as I asked in https://github.com/apache/spark/pull/43781/files#r1393331222

| e1 / e2 | p1 - s1 + s2 + max(6, s1 + p2 + 1) | max(6, s1 + p2 + 1) |
| e1 % e2 | min(p1 - s1, p2 - s2) + max(s1, s2) | max(s1, s2) |

The truncation rule is also different for arithmetic operations: they retain at least 6 digits in the fractional part, which means we can only reduce `scale` to 6.
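
For comparison, a sketch of that arithmetic truncation rule, modeled on DecimalType.adjustPrecisionScale in org.apache.spark.sql.types (simplified here, assuming a non-negative scale; MINIMUM_ADJUSTED_SCALE is 6):

// When the precision exceeds MAX_PRECISION (38), keep all integral digits and
// cut the scale, but never below min(scale, 6): arithmetic results retain at
// least 6 fractional digits if the original scale had that many.
def adjustPrecisionScale(precision: Int, scale: Int): DecimalType = {
  if (precision <= DecimalType.MAX_PRECISION) {
    DecimalType(precision, scale)
  } else {
    val intDigits = precision - scale
    val minScale = math.min(scale, 6) // MINIMUM_ADJUSTED_SCALE
    val adjustedScale = math.max(DecimalType.MAX_PRECISION - intDigits, minScale)
    DecimalType(DecimalType.MAX_PRECISION, adjustedScale)
  }
}

Note that if the integral part alone needs more than 38 digits, no scale reduction can help and the value can still overflow at runtime.
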
Member:

Should we mention what happens if we cannot truncate the fractional part enough to make it fit into the maximum precision? Overflow?

Comment on lines 4547 to 4549
.doc("When set to true, we will try to retain the fraction digits first rather than " +
"integral digits, when getting a least common type between decimal types, and the " +
"result decimal precision exceeds the max precision.")
Member:

Suggested change:
- .doc("When set to true, we will try to retain the fraction digits first rather than " +
-   "integral digits, when getting a least common type between decimal types, and the " +
-   "result decimal precision exceeds the max precision.")
+ .doc("When set to true, we will try to retain the fraction digits first rather than " +
+   "integral digits as prior Spark 4.0, when getting a least common type between decimal types, and the " +
+   "result decimal precision exceeds the max precision.")

@viirya (Member) left a comment:

I think this looks reasonable. Is there a reproducer that can be added as a unit test to show the issue in an e2e example? Like any operator mentioned in the description?

cloud-fan and others added 3 commits on November 15, 2023 (Co-authored-by: Liang-Chi Hsieh <viirya@gmail.com>)
@dongjoon-hyun (Member) left a comment:

+1, LGTM (only one minor comment about config version).

@cloud-fan (Author):

"Is there a reproducer that can be added as a unit test to show the issue in an e2e example?"

I think the updated tests show the problem.

@cloud-fan (Author) commented on Nov 15, 2023:

The failure is unrelated:

Extension error:
Could not import extension sphinx_copybutton (exception: No module named 'sphinx_copybutton')
make: *** [Makefile:35: html] Error 2

I'm merging it to master, thanks for the reviews!
