[SPARK-30341][SQL] Overflow check for interval arithmetic operations #26995
Conversation
cc: @cloud-fan @maropu @HyukjinKwon, thanks very much for reviewing
Test build #115685 has finished for PR 26995 at commit
Why did you target interval only in this PR? If we change the overflow behavior for arithmetic operations, I think we need to handle it carefully because it has impacts on the existing behaviors.
Because the numeric types are already handled properly in arithmetic operations in both ANSI and non-ANSI modes, based on my check. This PR fixes intervals to follow what the numeric types do, so the behavior impact is limited to interval operations, as the PR description says.
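For reference, a minimal standalone Scala sketch (not code from this PR) of the numeric behavior referred to above: with ANSI on, arithmetic uses exact operations that throw on overflow, while with ANSI off it keeps Java's wrap-around semantics.

  // Non-ANSI mode: Java-style silent wrap-around.
  val wrapped = Int.MaxValue + 1                // -2147483648, no error
  // ANSI mode: exact arithmetic that raises on overflow.
  try Math.addExact(Int.MaxValue, 1)
  catch { case e: ArithmeticException => println(e.getMessage) }  // "integer overflow"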
Test build #115690 has finished for PR 26995 at commit
We cannot check it outside
Test build #115692 has finished for PR 26995 at commit
@@ -37,6 +37,11 @@ case class UnaryMinus(child: Expression) extends UnaryExpression
  with ExpectsInputTypes with NullIntolerant {

  private val checkOverflow = SQLConf.get.ansiEnabled

  override def nullable: Boolean = dataType match {
    case CalendarIntervalType if !checkOverflow => true
    case _ => super.nullable
nit: child.nullable is clearer here.
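For context on the nullable override in the hunk above, a hedged sketch (helper name invented): when overflow is mapped to null rather than an exception, the result can be null even for a non-null child, so the expression must report nullable = true.

  // Negating Int.MinValue is not representable in an Int; in null-on-overflow
  // mode the result becomes null even though the input was not null.
  def negateOrNull(months: Int): java.lang.Integer =
    try Integer.valueOf(Math.negateExact(months))
    catch { case _: ArithmeticException => null }

  negateOrNull(Int.MinValue)  // null, hence nullable must be true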
@@ -218,6 +252,11 @@ object BinaryArithmetic {
  """)
case class Add(left: Expression, right: Expression) extends BinaryArithmetic {

  override def nullable: Boolean = dataType match {
    case CalendarIntervalType if !checkOverflow => true
Shall we make the overflow behavior consistent? E.g. other numeric types follow the Java overflow behavior while interval returns null on overflow, which is inconsistent.
Yes, the current behavior on master separates a) decimal (which returns null on overflow) from b) the other types. This PR (so far) just adds intervals to group a). We should reach an agreement on which way to follow first.
In Spark 2.4, do we have interval arithmetic operations? The non-ANSI behavior should follow the old behavior.
Yes, we have. We now have +/-/unary - with Java overflow behavior (2.4 or maybe earlier), and we have * and / with null-on-overflow behavior (3.0).
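To make the two styles concrete, a hedged standalone sketch (helper names invented; the months field is used as the example):

  // Java overflow behavior (+ / - / unary - since 2.4): silently wraps.
  def addWrap(a: Int, b: Int): Int = a + b
  // Null-on-overflow behavior (* and / in 3.0): overflow becomes null.
  def multiplyOrNull(a: Int, b: Int): java.lang.Integer =
    try Integer.valueOf(Math.multiplyExact(a, b))
    catch { case _: ArithmeticException => null }

  addWrap(Int.MaxValue, 1)         // -2147483648
  multiplyOrNull(Int.MaxValue, 2)  // null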
Then let's be consistent and follow the Java overflow behavior when ANSI is false.
spark-sql> select cast('128' as tinyint);
NULL
Time taken: 0.029 seconds, Fetched 1 row(s)
spark-sql> select cast(128 as tinyint);
-128
Not quite related to this PR, but the cast logic seems inconsistent too.
Cast from string and cast from int are different and don't need to be consistent. But interval +/- and interval */ should be consistent.
Changed the behavior to: a) ANSI on: exception; b) ANSI off: Java-style overflow.
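A hedged sketch of that switch (simplified; the real change lives in the arithmetic and interval expressions), using the months field as the example:

  // checkOverflow mirrors SQLConf.get.ansiEnabled in the PR.
  def addMonths(a: Int, b: Int, checkOverflow: Boolean): Int =
    if (checkOverflow) Math.addExact(a, b)  // ANSI on: throw ArithmeticException
    else a + b                              // ANSI off: Java-style wrap-around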
Test build #115703 has finished for PR 26995 at commit
Test build #115735 has finished for PR 26995 at commit
Test build #115737 has finished for PR 26995 at commit
Test build #115740 has finished for PR 26995 at commit
Test build #115746 has finished for PR 26995 at commit
Test build #115847 has finished for PR 26995 at commit
retest this please
Test build #116013 has finished for PR 26995 at commit
@@ -81,7 +80,8 @@ case class Average(child: Expression) extends DeclarativeAggregate with Implicit
    case _: DecimalType =>
      DecimalPrecision.decimalAndDecimal(sum / count.cast(DecimalType.LongDecimal)).cast(resultType)
    case CalendarIntervalType =>
-     DivideInterval(sum.cast(resultType), count.cast(DoubleType))
+     val newCount = If(EqualTo(count, Literal(0L)), Literal(null, LongType), count)
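The intent of the hunk above, as a hedged sketch (helper invented; months used as the example field): map count = 0 to null before dividing, so avg(interval) over an empty table yields null instead of failing with a divide-by-zero.

  def avgMonths(sumMonths: Long, count: Long): java.lang.Long =
    if (count == 0L) null  // empty table: avg is null, matching PostgreSQL
    else java.lang.Long.valueOf(sumMonths / count)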
avg(interval) is also new in 3.0, right? We can also fail here if this is the SQL standard.
I checked pgsql: avg on an empty table returns null. So this is correct.
        MultiplyInterval(Literal(stringToInterval(interval)), Literal(num)), expected)
    } else {
      checkEvaluation(MultiplyInterval(Literal(stringToInterval(interval)), Literal(num)),
        if (expected == null) null else stringToInterval(expected))
Can expected be null?
    withSQLConf(SQLConf.ANSI_ENABLED.key -> v) {
      if (checkException) {
        checkExceptionInExpression[ArithmeticException](
          MultiplyInterval(Literal(stringToInterval(interval)), Literal(num)), expected)
We can add a val expr = MultiplyInterval(Literal(stringToInterval(interval)), Literal(num)) at the beginning, to save code duplication.
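A hedged sketch of the suggested refactor (shape assumed, not verbatim from the final test):

  val expr = MultiplyInterval(Literal(stringToInterval(interval)), Literal(num))
  if (checkException) {
    checkExceptionInExpression[ArithmeticException](expr, expected)
  } else {
    checkEvaluation(expr, if (expected == null) null else stringToInterval(expected))
  }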
Yes, we can also use safeStringToInterval(expected) == null to choose which way to check.
    withSQLConf(SQLConf.ANSI_ENABLED.key -> v) {
      if (checkException) {
        checkExceptionInExpression[ArithmeticException](
          DivideInterval(Literal(stringToInterval(interval)), Literal(num)), expected)
ditto
Test build #116021 has finished for PR 26995 at commit
Test build #116023 has finished for PR 26995 at commit
retest this please
Seems like multiple PRs are starting to fail.
@cloud-fan Thanks for pinging me. I'll take a look at these failures.
Test build #116038 has finished for PR 26995 at commit
thanks, merging to master!
Just FYI, I spent a bit of my time dealing with it.

Hi @gaborgsomogyi, welcome back! Would you mind trying to deal with this flaky test? Thanks in advance!
@HeartSaVioR Thanks for pinging. I've had a quick look, and it seems like it may be caused by the new API. My first guess is that the new API behaves somehow differently and/or may contain some bugs.
Going to ask the Kafka guys what they think about this...
Found ticket for zkclient/localhost@EXAMPLE.COM to go to krbtgt/EXAMPLE.COM@EXAMPLE.COM expiring on Fri Jan 03 04:30:09 PST 2020
Service ticket not found in the subject
KrbException: Server not found in Kerberos database (7) - Server not found in Kerberos database
at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:73)
at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:693)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
at org.apache.zookeeper.client.ZooKeeperSaslClient$1.run(ZooKeeperSaslClient.java:323)
at org.apache.zookeeper.client.ZooKeeperSaslClient$1.run(ZooKeeperSaslClient.java:320)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslToken(ZooKeeperSaslClient.java:320)
at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslToken(ZooKeeperSaslClient.java:305)
at org.apache.zookeeper.client.ZooKeeperSaslClient.sendSaslPacket(ZooKeeperSaslClient.java:377)
at org.apache.zookeeper.client.ZooKeeperSaslClient.initialize(ZooKeeperSaslClient.java:415)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1149)
Caused by: KrbException: Identifier doesn't match expected value (906)
at sun.security.krb5.internal.KDCRep.init(KDCRep.java:140)
at sun.security.krb5.internal.TGSRep.init(TGSRep.java:65)
at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:60)
at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:55)
... 18 more

Hi @gaborgsomogyi, you might need to look at this exception, which is thrown right before the AuthFailedException.
@yaooqinn thanks for sharing. Yep, maybe the Kerberos principal is wrong again. Not sure how this can be flaky though, since it's hardcoded... The mentioned
Here is another flaky failure: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116249/console
Maybe it is related to the Jenkins env, e.g. DNS; I'm not sure.
When ANSI is on, interval add/subtract/negative/multiply/divide will overflow if any field overflows. However, when ANSI is off and an overflow happens, why do multiply/divide throw an exception while add/subtract/negative do not? This looks inconsistent.
struct<>
-- !query 117 output
java.lang.ArithmeticException
integer overflow
Can we change it and make them consistent like what you discussed above #26995 (comment)?
I am OK with your suggestion, but first let us also cc @cloud-fan.
Please create a 3.0 blocker JIRA.
IIRC a problem is: the new operators in 3.0 (interval * double and interval / double) do not have a reasonable Java-style overflow behavior. The actual operation is (int * double).toInt, and Double.PositiveInfinity.toInt returns Int.MaxValue, which can be confusing.
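This is easy to verify in a Scala REPL (a standalone check, not PR code): double-to-int conversion saturates instead of wrapping or throwing.

  Double.PositiveInfinity.toInt == Int.MaxValue  // true
  (Int.MaxValue * 2.0).toInt                     // 2147483647: saturates at Int.MaxValue
  Double.NaN.toInt                               // 0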
Since ANSI mode is off by default, we are introducing a new behavior that is non-ANSI, which is weird. There are 2 options:
1. interval * double and interval / double return the max or min int/long value on overflow; interval / 0 returns null.
2. Revert these 2 operators.
@gatorsmile what do you think?
Option 2 sounds good to me.
Compared with the previous release, is there any behavior change made in this PR?
When ANSI is on, interval add/subtract/negative will check overflow, compared with 2.4.
Since ANSI mode is introduced in Spark 3.0 and is off by default, this is not a behavior change.
What changes were proposed in this pull request?
For the interval arithmetic functions, e.g. add/subtract/negative/multiply/divide:
1. Enable overflow checks when ANSI is on.
2. For multiply/divide, throw an exception when an overflow happens, regardless of whether ANSI is on or off.
3. add/subtract/negative stay the same for backward compatibility.
4. divide by 0 throws an ArithmeticException whether ANSI is on or not, the same as for numerics.

These behaviors fully match the numeric type operations when ANSI is on.
These behaviors fully match the numeric type operations when ANSI is off, except for 2 and 4.
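A hedged illustration of the resulting user-facing behavior (interval literal syntax and exact output assumed; the config name is Spark 3.0's spark.sql.ansi.enabled):

  // spark is an active SparkSession in a spark-shell session.
  spark.conf.set("spark.sql.ansi.enabled", "true")
  // A field overflow should now raise, e.g. (output assumed):
  // spark.sql("SELECT INTERVAL 2147483647 MONTH + INTERVAL 1 MONTH").show()
  //   => java.lang.ArithmeticException: integer overflow
  spark.conf.set("spark.sql.ansi.enabled", "false")
  // add/subtract/negative now wrap Java-style; multiply/divide still throw.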
Why are the changes needed?
ANSI support.

Does this PR introduce any user-facing change?
When ANSI is on, interval add/subtract/negative/multiply/divide will overflow if any field overflows.

How was this patch tested?
Added unit tests.