-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert TIMESTAMP_WITH_TIME_ZONE to primitive type #7480
Conversation
✅ Deploy Preview for meta-velox canceled.
|
2c08673
to
7d2c503
Compare
This change will also break the Presto TimestampTz semantics as mentioned here #2511 (comment) |
I don't quite understand this. The implementation of TimestampTz in this PR is represented by bigint, which contains timestamp and time zone, I think it is consistent with this this implementation in Presto. |
@wypb you are right. I misunderstood Java's TimestampTz specification. |
10d44a7
to
17ad5d7
Compare
Hi @kagamiori @aditi-pandit @mbasmanova Could you also help me review this, thank you. |
Hi @kagamiori Could you also help me review this, thank you. |
Hi @kagamiori @aditi-pandit @mbasmanova Any comments on this PR? |
Hi @mbasmanova @kagamiori @aditi-pandit Could you help me review this, thank you. |
@aditi-pandit Aditi, would you help review this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @wypb. Did a quick round of review.
return; | ||
} | ||
|
||
auto timestamp = this->toTimestamp(timestampWithTimezone); | ||
auto dateTime = getDateTime(timestamp, nullptr); | ||
adjustDateTime(dateTime, unit); | ||
auto tzID = unpackZoneKeyId(timestampWithTimezone); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit : Can this local variable be inlined in line 870 ?
@@ -125,4 +124,18 @@ class TimestampWithTimeZoneTypeFactories : public CustomTypeFactories { | |||
|
|||
void registerTimestampWithTimeZoneType(); | |||
|
|||
static constexpr int32_t TIME_ZONE_MASK = 0xFFF; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The velox convention is "Use kPascalCase for static constants and enumerators" So kTimezoneMask kMillisShift is better.
const auto milliseconds = *timestampWithTimezone.template at<0>(); | ||
auto milliseconds = unpackMillisUtc(timestampWithTimezone); | ||
auto tzID = unpackZoneKeyId(timestampWithTimezone); | ||
|
||
Timestamp timestamp = Timestamp::fromMillis(milliseconds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for the local variables. They are used only once and can be inlined.
auto timezone = | ||
util::getTimeZoneName(*timestampWithTimezone.template at<1>()); | ||
auto tzID = unpackZoneKeyId(timestampWithTimezone); | ||
auto timezone = util::getTimeZoneName(tzID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inline tzId variable use.
result.template get_writer_at<0>() = utcTimestamp.getSeconds() * 1000; | ||
result.template get_writer_at<1>() = | ||
*timestampWithTimezone.template at<1>(); | ||
auto milliseconds = unpackMillisUtc(timestampWithTimezone); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit : Inline local variables used only once.
auto timestamps = BaseVector::create(BIGINT(), size, pool); | ||
auto* rawTimestamps = | ||
timestamps->asFlatVector<int64_t>()->mutableRawValues(); | ||
auto timestamps = AlignedBuffer::allocate<int64_t>(size, pool, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice refactor.
flatResult.setNull(row, true); | ||
} else { | ||
Timestamp ts = Timestamp::fromMillis(timestampVector->valueAt(row)); | ||
auto timestampWithTimezone = | ||
input.as<SimpleVector<int64_t>>()->valueAt(row); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
input.as<SimpleVector<int64_t>> can be made a local variable outside of the applyToSelectedNoThrow to avoid the casting per row.
applyHashFunction(rows, *vector_.get(), hashes, [&](auto row) { | ||
auto timestampWithTimeZone = vector_->valueAt<int64_t>(row); | ||
auto timestamp = unpackMillisUtc(timestampWithTimeZone); | ||
return hashInteger(timestamp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timestampWithTimeZone and timestamp can both be inlined.
397af42
to
3eb9510
Compare
Hi @aditi-pandit Thank you for your review, I have modified the corresponding code according to the review comments. If you have time, please help to review it again. Thank you. |
e3796dd
to
c1f7dff
Compare
Hi @kagamiori wei, Thank you for your reply. I'll do it today. |
048e21a
to
1533794
Compare
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
e35564a
to
93819aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
A added a couple of nits and a question.
@@ -135,6 +135,11 @@ FOLLY_ALWAYS_INLINE void PrestoHasher::hash<TypeKind::BIGINT>( | |||
// returns the corresponding value directly. | |||
return vector_->valueAt<int64_t>(row); | |||
}); | |||
} else if (isTimestampWithTimeZoneType(vector_->base()->type())) { | |||
// Hash only timestamp value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that this is how it was before, but why do we ignore TZ part?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @spershin, I believe it was because Presto Java also ignores TZ part and we followed that behavior (https://github.com/prestodb/presto/blob/36575b0ca7d61a0ecf0be8eac7efc0b2b0d3ec85/presto-common/src/main/java/com/facebook/presto/common/type/TimestampWithTimeZoneType.java#L51-L54). I found a relevant discussion here: prestodb/presto#17274.
TimestampWithTimeZoneType() | ||
: RowType({"timestamp", "timezone"}, {BIGINT(), SMALLINT()}) {} | ||
class TimestampWithTimeZoneType : public BigintType { | ||
TimestampWithTimeZoneType() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lint recommends
TimestampWithTimeZoneType() = default;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already fixed.
static constexpr int32_t kTimezoneMask = 0xFFF; | ||
static constexpr int32_t kMillisShift = 12; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
static should not be used for globals in a header.
For these two constexpr would suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already fixed.
31be7da
to
4e02577
Compare
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Hi @wypb, the change to Base64Test.timestampWithTimezone has just been merged internally. Could you rebase this PR onto the latest main one more time? Thanks! There is a conflict with PrestoSerializer.cpp by the way. |
980eb7b
to
02b512b
Compare
7b961af
to
5caa663
Compare
Hi @kagamiori I have synchronized the latest code, and the relevant UT seems to have passed. PTAL, thank you. |
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
I see that this landed yesterday here 6fdaff7 |
Yeah, not sure why this PR was not auto-closed. I'm closing it now. |
Hi @wypb, are you going to make a follow-up PR of prestodb/presto#21974 to remove the unused code for the old TimestampWithTimezone type and re-enable the unit test? |
@kagamiori I will enable it today. |
…r#9042) Summary: TIMESTAMP_WITH_TIME_ZONE has been converted to a primitive type in facebookincubator#7480. Update the documentation to match. Pull Request resolved: facebookincubator#9042 Reviewed By: Yuhta Differential Revision: D54885988 Pulled By: mbasmanova fbshipit-source-id: 76a3a8cafbaf812ff7573816b57b1d1cde099471
TimestampWithTimeZone is a primitive type in Presto. In contrast,
it is implemented as a Row<int64_t, int16_t> in Velox and hence
treated as a complex type implicitly. This PR convert TIMESTAMP_WITH_TIME_ZONE
to primitive type to achive better consistency between Velox and Presto,
details of the discussion can be found here.
CC: @mbasmanova @aditi-pandit @majetideepak @kagamiori