Is your feature request related to a problem? Please describe.
Data streams let users store time-series data across multiple indices while exposing a single named resource for requests. They are well suited for logs, events, metrics, and other continuously generated data where documents are seldom updated and searches generally target the most recent documents.
The creation of a data stream requires a matching index template containing the mappings and settings used to configure the data stream’s backing indices. The data_stream field indicates that the template creates a data stream instead of a regular index.
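The template lookup described above can be sketched in a few lines of Python. This is an illustration only, not OpenSearch's implementation: the helper name `find_data_stream_template` is hypothetical, wildcard matching is approximated with `fnmatchcase`, and template priority is ignored.

```python
from fnmatch import fnmatchcase


def find_data_stream_template(name, templates):
    """Return the first template that matches `name` and declares `data_stream`.

    Hypothetical helper: real template resolution also considers priority,
    which this sketch leaves out.
    """
    for template in templates:
        patterns = template.get("index_patterns", [])
        matches = any(fnmatchcase(name, pattern) for pattern in patterns)
        # Only a template carrying the data_stream field creates a data stream;
        # a matching template without it would create a regular index.
        if matches and "data_stream" in template:
            return template
    return None


templates = [
    {"index_patterns": ["logs-*"], "data_stream": {}},
    {"index_patterns": ["metrics-*"]},  # no data_stream field: regular index template
]

print(find_data_stream_template("logs-nginx", templates))
print(find_data_stream_template("metrics-cpu", templates))
```

The second lookup finds no data stream template because the matching template lacks the `data_stream` field.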
Though OpenSearch already has the APIs to interact with data streams (create/read/delete/get stats), the creation of a new data stream is still not possible due to the lack of a metadata field mapper. This mapper is necessary to parse the data_stream field in the index template, and to perform timestamp field validation on the ingested documents.
Without this metadata field mapper, the creation of an index template fails with this error:
Describe the solution you'd like
We need to create a MetadataFieldMapper to parse the _data_stream_timestamp metadata field mapping used to create data streams. This mapper also overrides the postParse method to ensure that each indexed document has the timestamp field present.
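To make the intended check concrete, here is a minimal Python sketch of the validation the mapper would perform after parsing a document. It is not the Java MetadataFieldMapper itself: `check_timestamp_present` and `MissingTimestampError` are hypothetical names, and the document is modeled as a flat dictionary.

```python
class MissingTimestampError(ValueError):
    """Raised when a document lacks the data stream's timestamp field (hypothetical)."""


def check_timestamp_present(document, timestamp_field="@timestamp"):
    """Reject a document that has no value for the timestamp field.

    Assumption: a missing key and an explicit null are both treated as absent.
    """
    value = document.get(timestamp_field)
    if value is None:
        raise MissingTimestampError(
            f"data stream timestamp field [{timestamp_field}] is missing"
        )
    return value


# A document with the field passes; one without it is rejected.
check_timestamp_present({"@timestamp": "2021-03-01T12:00:00Z", "message": "ok"})
try:
    check_timestamp_present({"message": "no timestamp"})
except MissingTimestampError as err:
    print(err)
```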
Question: Should the timestamp field name be standardized or made configurable?
A data stream currently only allows "@timestamp" as the timestamp field name for each ingested document. We can remove this restriction and allow users to change the default timestamp field name as required using an index template.
# "@timestamp" will be used as the default timestamp field name.
PUT /_index_template/my-data-stream-template
{
  "index_patterns": [ "logs-haproxy", "logs-nginx", "logs-redis" ],
  "data_stream": { }
}

# Users can also manually configure the timestamp field name.
PUT /_index_template/my-data-stream-template
{
  "index_patterns": [ "logs-haproxy", "logs-nginx", "logs-redis" ],
  "data_stream": { "timestamp_field": { "name": "created_at" } }
}
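The two templates above imply a simple fallback rule for the timestamp field name. A minimal Python sketch of that rule follows, assuming the template body is available as a dictionary; `resolve_timestamp_field` is a hypothetical helper, not an OpenSearch API.

```python
DEFAULT_TIMESTAMP_FIELD = "@timestamp"


def resolve_timestamp_field(template):
    """Return the configured timestamp field name, or the default if none is set."""
    data_stream = template.get("data_stream")
    if data_stream is None:
        # Without the data_stream field, the template creates a regular index.
        raise ValueError("template does not declare a data stream")
    return data_stream.get("timestamp_field", {}).get("name", DEFAULT_TIMESTAMP_FIELD)


default_template = {
    "index_patterns": ["logs-haproxy", "logs-nginx", "logs-redis"],
    "data_stream": {},
}
custom_template = {
    "index_patterns": ["logs-haproxy", "logs-nginx", "logs-redis"],
    "data_stream": {"timestamp_field": {"name": "created_at"}},
}

print(resolve_timestamp_field(default_template))
print(resolve_timestamp_field(custom_template))
```

Keeping the default in one place means existing templates with an empty `data_stream` object would continue to behave as they do today if the field name becomes configurable.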
Additional context
As data streams can be queried just like regular indices/aliases, plugins like SQL, PPL, and Asynchronous Search will work seamlessly with data streams. Integration with other OpenSearch plugins such as the following can be improved to further extend the functionality of data streams. These will be tracked as separate issues.
Index Management plugin – An ISM policy can be associated with a data stream to manage the underlying backing indices. Once rolled over, these backing indices can be moved to a different state, deleted after some time, or rolled up into a summarized index.
Index Management Dashboards plugin – We will update the Index Management user interface to include the ability to view data streams and their underlying backing indices, and assign or edit a policy. Creating index patterns can also be made simpler as the timestamp field is known for a data stream.
Security plugin – As with regular indices, user access can be limited to the entire data stream, to a subset of its backing indices, or at the document or field level.