-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TSDB: Add _tsid
field to time_series indices
#80276
Conversation
Pinging @elastic/es-analytics-geo (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked on video a bit but I had already started typing stuff. This makes public what we talked about.
server/src/main/java/org/elasticsearch/index/IndexSettings.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/IpFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/MapperService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/Mapping.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/Mapping.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
Will index sorting settings about dealing _tsid added in this PR? Or create another PR do the feature? |
@weizijun this PR is work in progress. I need to add more tests and complete the index sorting part before it is ready to be merged. |
so that they are added explicitly in the field mapper
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
to IndexMode so that it is only created in time_series mode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized a thing about encoding these even if we're not using them. I think it's rare that a field will be a dimension outside of time series mode but it's possible. We should probably come up with a scheme to skip those fields....
qa/rolling-upgrade/src/test/java/org/elasticsearch/upgrades/IndexingIT.java
Show resolved
Hide resolved
new FieldSortSpec(TimeSeriesIdFieldMapper.NAME), | ||
new FieldSortSpec(DataStreamTimestampFieldMapper.DEFAULT_PATH) }; | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could probably use the same "move it into a method on IndexMode" treatment that the builder used. But that can wait for a follow up I think.
if (dimension) { | ||
// Extract the tsid part of the dimension field | ||
BytesReference bytes = TimeSeriesIdFieldMapper.extractTsidValue(NetworkAddress.format(address)); | ||
context.doc().addDimensionBytes(fieldType().name(), bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I think about.... Can't you declare something a dimension without enabling the tsid field? I wonder if we should avoid encoding it if the field isn't present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ip
and numeric field types, we can skip the encodeTsidValue
part if _tsid
field is not going to be generated.
However, for keyword
fields we still must do the encoding, because must validate the string length.
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java
Outdated
Show resolved
Hide resolved
Good call!
…On Thu, Nov 25, 2021, 10:21 AM Christos Soulios ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In server/src/main/java/org/elasticsearch/index/mapper/IpFieldMapper.java
<#80276 (comment)>
:
> @@ -497,18 +499,14 @@ private static InetAddress value(XContentParser parser, InetAddress nullValue) t
}
private void indexValue(DocumentParserContext context, InetAddress address) {
+ if (dimension) {
+ // Extract the tsid part of the dimension field
+ BytesReference bytes = TimeSeriesIdFieldMapper.extractTsidValue(NetworkAddress.format(address));
+ context.doc().addDimensionBytes(fieldType().name(), bytes);
For ip and numeric field types, we can skip the encodeTsidValue part if
_tsid field is not going to be generated.
However, for keyword fields we still must do the encoding, because must
validate the string length.
—
You are receiving this because you are on a team that was mentioned.
Reply to this email directly, view it on GitHub
<#80276 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABUXIUOG6MA3K3XFT6H7P3UNZIANANCNFSM5HI45LVA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
@elasticmachine run elasticsearch-ci/eql-correctness |
@elasticmachine run elasticsearch-ci/part-2 |
This PR builds on the work added in #80276 that generates the _tsid field for keyword, ip and number dimension fields. It adds support for unsigned_long dimension fields.
Since _tsid cannot be a multi-value field, this PR modifies the TimeSeriesIdFieldMapper so that _tsid is added as a SortedDocValuesField (instead of a SortedSetDocValuesField) Relates to #80276
This PR adds support for a field named
_tsid
that uniquely identifies the time series a document belongs to.When a document is indexed in a time series index (
IndexMode.TIME_SERIES
),_tsid
field is generated from the values of all dimension fields.