PoC: store well-defined metrics as time-series data streams #9649

Open
axw opened this issue Nov 23, 2022 · 7 comments

@axw
Member

axw commented Nov 23, 2022

In recent versions, Elasticsearch has introduced time-series data streams (TSDS), a type of data stream that is well suited to storing and querying metrics. TSDS reduces disk space usage, and it is expected to provide improved metric aggregation functionality in the future. TSDS also enables downsampling (rollup) of metrics, which would let our users trade fidelity for cost and keep metrics for a longer period at a reasonable cost.

Let's investigate changing the internal metrics data stream to use index_mode: time_series. Metrics will be identified and marked with the time_series_metric attribute. Metric dimensions (e.g. service.name) will be identified and marked with the time_series_dimension attribute.
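
As a rough illustration of the mapping changes described above, here is a minimal Python sketch that builds an index template body with time-series mode enabled. It assumes the raw Elasticsearch setting names (`index.mode`, `index.routing_path`) rather than the integration package manifest. The index pattern, the metric field names, and the `build_tsds_template` helper are hypothetical; only `service.name` comes from this issue.

```python
import json

# Hypothetical names for illustration; the real APM field list is not given here.
DIMENSIONS = ["service.name", "service.environment"]
METRICS = {"example.request.count": "counter", "example.memory.usage": "gauge"}


def build_tsds_template(index_pattern, dimensions, metrics):
    """Return an index template body that enables time-series mode."""
    properties = {"@timestamp": {"type": "date"}}
    # Dimensions identify a time series; keyword fields are marked as such.
    for field in dimensions:
        properties[field] = {"type": "keyword", "time_series_dimension": True}
    # Metrics carry the measured values; kind is "gauge" or "counter".
    for field, kind in metrics.items():
        properties[field] = {"type": "double", "time_series_metric": kind}
    return {
        "index_patterns": [index_pattern],
        "data_stream": {},
        "template": {
            "settings": {
                "index.mode": "time_series",
                # Assumption: route on all dimensions.
                "index.routing_path": list(dimensions),
            },
            "mappings": {"properties": properties},
        },
    }


template = build_tsds_template("metrics-example-*", DIMENSIONS, METRICS)
print(json.dumps(template, indent=2))
```

The printed JSON could be reviewed by hand before being applied to a test cluster. The actual set of dimensions and metric types would have to come from the investigation proposed here.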

We should investigate whether we can switch over to TSDS without affecting the UI, or if additional changes are required.

We should use Rally to identify any storage savings (or unexpected costs), ingest throughput degradation, and ideally query performance improvements.

@kruskall
Member

This is currently blocked by elastic/kibana#146804

@simitt
Contributor

simitt commented Dec 20, 2022

@kruskall and I agreed yesterday to manually update the ES index template so that we can continue testing the performance and UI implications, and to look further into the relevant metric dimensions.

@simitt
Contributor

simitt commented Jan 16, 2023

@kruskall could you investigate and add a summary for the following points:

We should investigate whether we can switch over to TSDS without affecting the UI, or if additional changes are required.

We should use Rally to identify any storage savings (or unexpected costs), ingest throughput degradation, and ideally query performance improvements.

We can then decide how to move forward with PR #9730.

@simitt
Contributor

simitt commented Feb 8, 2023

Related: elastic/elasticsearch#93564

@kruskall
Member

kruskall commented Feb 8, 2023

Adding more information, as most of the conversation happened in other channels:

I've opened a separate issue for the Rally issue: #10206

All the Kibana blockers have been resolved, and the PR was updated to use most of the transaction metrics fields as dimensions. That came to around 30 dimensions in total, and we ran into a problem: there is a hard limit of 16 dimensions.

16 is quite limiting, and even after accounting for fields that provide redundant information we had to make some sacrifices (e3f691b). I don't think we can move to time-series with that number of dimensions.
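
To make the size of the gap concrete, here is a minimal Python sketch that checks a candidate dimension list against the limit of 16 mentioned above. The field names are placeholders standing in for the roughly 30 candidates, and `check_dimension_budget` is a hypothetical helper, not part of the PR.

```python
MAX_DIMENSIONS = 16  # hard limit reported in this thread


def check_dimension_budget(candidates, limit=MAX_DIMENSIONS):
    """Report how far a candidate dimension list is over the limit."""
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    excess = max(0, len(unique) - limit)
    return {"count": len(unique), "excess": excess, "fits": excess == 0}


# Placeholder names standing in for the ~30 candidate fields.
candidates = [f"example.field_{i}" for i in range(30)]
report = check_dimension_budget(candidates)
print(report)  # {'count': 30, 'excess': 14, 'fits': False}
```

With 30 candidates, 14 fields would need to be dropped or merged to fit, which matches the scale of the sacrifices described above.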

@simitt
Contributor

simitt commented Feb 8, 2023

Moving this task to backlog and removing the milestone. We can re-investigate when the ES issue with the dimension limit (elastic/elasticsearch#93564) is solved.

@simitt simitt removed this from the 8.7 milestone Feb 8, 2023
@simitt simitt removed the v8.7.0 label Feb 8, 2023
@StephanErb

elastic/elasticsearch#93564 has been solved. Is this issue ready to be tackled now, or are there other remaining blockers?


4 participants