-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support reading INT96 timestamps as LongTimestampWithTimeZoneType #22781
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ship it!
@@ -469,6 +469,32 @@ public void skip(int n) | |||
}; | |||
} | |||
|
|||
public ValueDecoder<int[]> getInt96ToLongTimestampWithTimeZoneDecoder(ParquetEncoding encoding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls add in the description of the PR that the following method can be used as reference while reviewing the PR
trino/lib/trino-parquet/src/main/java/io/trino/parquet/reader/decoders/ValueDecoders.java
Lines 407 to 434 in eda8fdf
public ValueDecoder<int[]> getInt96ToLongTimestampDecoder(ParquetEncoding encoding, DateTimeZone timeZone) | |
{ | |
checkArgument( | |
field.getType() instanceof TimestampType timestampType && !timestampType.isShort(), | |
"Trino type %s is not a long timestamp", | |
field.getType()); | |
int precision = ((TimestampType) field.getType()).getPrecision(); | |
return new InlineTransformDecoder<>( | |
getInt96TimestampDecoder(encoding), | |
(values, offset, length) -> { | |
for (int i = offset; i < offset + length; i++) { | |
long epochSeconds = decodeFixed12First(values, i); | |
long nanosOfSecond = decodeFixed12Second(values, i); | |
if (timeZone != DateTimeZone.UTC) { | |
epochSeconds = timeZone.convertUTCToLocal(epochSeconds * MILLISECONDS_PER_SECOND) / MILLISECONDS_PER_SECOND; | |
} | |
if (precision < 9) { | |
nanosOfSecond = (int) round(nanosOfSecond, 9 - precision); | |
} | |
// epochMicros | |
encodeFixed12( | |
epochSeconds * MICROSECONDS_PER_SECOND + (nanosOfSecond / NANOSECONDS_PER_MICROSECOND), | |
(int) ((nanosOfSecond * PICOSECONDS_PER_NANOSECOND) % PICOSECONDS_PER_MICROSECOND), | |
values, | |
i); | |
} | |
}); | |
} |
lib/trino-parquet/src/main/java/io/trino/parquet/reader/decoders/ValueDecoders.java
Show resolved
Hide resolved
lib/trino-parquet/src/main/java/io/trino/parquet/reader/decoders/ValueDecoders.java
Outdated
Show resolved
Hide resolved
if (precision < 9) { | ||
epochNanos = round(epochNanos, 9 - precision); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code section tested anywhere?
have .123456789
in the storage , but having MICROSECONDS
or MILLISECONDS
the precision being read?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is called when TIMESTAMP_TZ_MICROS is tested, io.trino.parquet.reader.TestingColumnReader#WRITE_INT96
populates nanos which result in rounding to lower precision
plugin/trino-hive/src/test/java/io/trino/plugin/hive/BaseHiveConnectorTest.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/test/java/io/trino/plugin/hive/BaseHiveConnectorTest.java
Show resolved
Hide resolved
…Type Hive tables migrated to iceberg through Spark require the ability to read INT96 timestamps as LongTimestampWithTimeZoneType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hive tables migrated to iceberg through Spark require the ability to read INT96 timestamps as LongTimestampWithTimeZoneType
Description
Fixes #11338
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: