-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for double to varchar coercion in hive tables #18930
Add support for double to varchar coercion in hive tables #18930
Conversation
...in/trino-hive/src/test/java/io/trino/plugin/hive/coercions/TestDoubleToVarcharCoercions.java
Outdated
Show resolved
Hide resolved
@@ -395,7 +398,7 @@ protected void applyCoercedValue(BlockBuilder blockBuilder, Block block, int pos | |||
} | |||
} | |||
|
|||
public record CoercionOption(HiveTimestampPrecision timestampPrecision) | |||
public record CoercionOption(HiveTimestampPrecision timestampPrecision, boolean treatNaNAsNull) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe CoercionOption
should actually be CoercionOptions
since it will contain multiple options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have renamed it to CoercionContext
WDYT ?
026b920
to
bf8732f
Compare
@atanasenko AC and added some additional tests for infinity. |
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HivePageSource.java
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HivePageSourceProvider.java
Outdated
Show resolved
Hide resolved
bf8732f
to
5c4b7d4
Compare
@kokosing AC |
5c4b7d4
to
8bee23b
Compare
This abstraction captures all additional information related to coercing column
8bee23b
to
29faa27
Compare
|
||
Slice converted = Slices.utf8Slice(Double.toString(doubleValue)); | ||
if (!toType.isUnbounded() && countCodePoints(converted) > toType.getBoundedLength()) { | ||
throw new TrinoException(INVALID_ARGUMENTS, format("Varchar representation of %s exceeds %s bounds", doubleValue, toType)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we call truncateToLength(converted, toType)
without calling countCodePoints(converted) > toType.getBoundedLength()
does that not throw any error ?
It would be nicer to avoid the extra bounds check and have the error come out of truncateToLength
directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we call truncateToLength(converted, toType) without calling countCodePoints(converted) > toType.getBoundedLength() does that not throw any error
Yes it doesn't throw any error - scope of truncateToLength
is to trim the length of the Varchar representation - to a narrower precision - so we don't have to fail it explicitly - but here it is an redundant operation as we don't have to truncate it out. Thanks for pointing it out
Double.MIN_VALUE, | ||
Double.MAX_VALUE, | ||
Double.POSITIVE_INFINITY, | ||
Double.parseDouble("123456789.12345678")).collect(toDataProvider()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: collect on next line ?
@@ -191,7 +192,9 @@ public static Optional<ConnectorPageSource> createHivePageSource( | |||
Optional<BucketAdaptation> bucketAdaptation = createBucketAdaptation(bucketConversion, tableBucketNumber, regularAndInterimColumnMappings); | |||
Optional<BucketValidator> bucketValidator = createBucketValidator(path, bucketValidation, tableBucketNumber, regularAndInterimColumnMappings); | |||
|
|||
CoercionContext coercionContext = new CoercionContext(getTimestampPrecision(session)); | |||
// ORC files treats Double.NaN as null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is this an ORC spec behaviour or a Apache Hive behaviour, might want to reword this to clarify
29faa27
to
08c10d9
Compare
08c10d9
to
142e6c9
Compare
Thanks for the review @raunaqmorarka . AC |
Description
Allows coerce a double type to varchar in hive tables.
Additional context and related issues
Overrides #18832
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: