Add Prometheus Dashboard #13126

Merged (6 commits) on Aug 8, 2019

sorantis
Contributor

In an effort to increase our dashboard coverage for all modules (#10594) I've created a Prometheus dashboard that shows the following metrics:

  • prometheus_http_requests_total: Counter of HTTP requests.
  • prometheus_engine_query_duration_seconds: Query timings (99th percentile).
  • prometheus_tsdb_wal_truncations_failed_total: Total number of WAL truncations that failed.
  • prometheus_tsdb_wal_corruptions_total: Total number of WAL corruptions.
  • prometheus_tsdb_reloads_total: Number of times the database reloaded block data from disk.
  • prometheus_tsdb_reloads_failures_total: Number of times the database failed to reload block data from disk.
  • prometheus_sd_discovered_targets: Current number of discovered targets.
  • prometheus_tsdb_head_chunks: Total number of chunks in the head block.
  • prometheus_api_remote_read_queries: The current number of remote read queries being executed or waiting.
  • prometheus_notifications_queue_capacity: The capacity of the alert notifications queue.
  • prometheus_notifications_queue_length: The number of alert notifications in the queue.

The dashboard looks like this:
[Screenshot: Screen Shot 2019-07-30 at 22 36 08]
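
For readers unfamiliar with how these metrics are exposed, here is a minimal sketch (not part of this PR) that reads the metric types from Prometheus text exposition output. The sample payload and its values are made up for illustration, and `metric_types` is a hypothetical helper, not a Metricbeat or Prometheus API.

```python
# Hypothetical payload in Prometheus text exposition format.
# The values are invented for illustration only.
SAMPLE = """\
# TYPE prometheus_http_requests_total counter
prometheus_http_requests_total{code="200",handler="/metrics"} 42
# TYPE prometheus_tsdb_head_chunks gauge
prometheus_tsdb_head_chunks 1500
# TYPE prometheus_notifications_queue_length gauge
prometheus_notifications_queue_length 0
"""


def metric_types(text):
    """Return {metric_name: metric_type} taken from the '# TYPE' lines."""
    types = {}
    for line in text.splitlines():
        parts = line.split()
        # A type declaration looks like: "# TYPE <name> <type>"
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            types[parts[2]] = parts[3]
    return types


for name, kind in sorted(metric_types(SAMPLE).items()):
    print(f"{name}: {kind}")
```

The declared type (counter or gauge) is what decides whether a panel should plot the raw value or a derived rate.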

@sorantis sorantis requested review from a team as code owners July 31, 2019 16:40
@sorantis sorantis self-assigned this Jul 31, 2019
@sorantis sorantis added the Team:Integrations, enhancement, and Metricbeat labels Jul 31, 2019
@exekias
Contributor

exekias commented Aug 1, 2019

Thank you for opening this! I'm curious: it looks like you are graphing totals for HTTP requests and head chunks. Would it make sense to calculate the rates instead?

Can you add a changelog entry to CHANGELOG.next.asciidoc?

Tests are failing because of docs; that should be fixed by running make update in the metricbeat folder.

@sorantis
Contributor Author

sorantis commented Aug 1, 2019

The prometheus_tsdb_head_chunks metric used for showing head chunks is a gauge. AFAIK rates are not used on gauges. We could show rates for the total HTTP requests though.

Thanks for the tip, I'll update the changelog and build docs.
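
To illustrate the counter-versus-gauge point above, here is a minimal sketch of deriving a per-second rate from successive counter samples. It is a simplification under stated assumptions, not how the dashboard or PromQL's rate() is implemented: there is no extrapolation to the window edges, `counter_rate` is a hypothetical helper, and the sample values are made up.

```python
def counter_rate(samples):
    """Average per-second increase of a counter.

    `samples` is a list of (timestamp_seconds, value) pairs in time order.
    A drop in value is treated as a counter reset (for example, a restart).
    Returns None when fewer than two samples or no elapsed time is available.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # After a reset the counter restarts from zero, so the whole
        # current value counts as increase since the previous sample.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical samples of prometheus_http_requests_total; the drop from
# 130 to 10 stands in for a counter reset.
requests = [(0, 100), (15, 130), (30, 10), (45, 40)]
print(counter_rate(requests))
```

A gauge such as prometheus_tsdb_head_chunks can go up and down freely, so its current value is plotted directly rather than passed through this kind of calculation.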

@exekias
Contributor

exekias commented Aug 1, 2019

> The prometheus_tsdb_head_chunks metric used for showing head chunks is a gauge. AFAIK rates are not used on gauges. We could show rates for the total HTTP requests though.

Indeed, sorry I didn't check prometheus_tsdb_head_chunks, that's fine then 👍

@exekias
Contributor

exekias commented Aug 1, 2019

Docs were removed in this commit: 48cd11f. Could you add them back?

@kaiyan-sheng
Contributor

travis-ci finally passed after a lot of retries 😂

@kaiyan-sheng
Contributor

jenkins, test this please

Contributor

@kaiyan-sheng kaiyan-sheng left a comment

LGTM, I manually loaded this dashboard and it works.

@sorantis sorantis merged commit 0dd689e into elastic:master Aug 8, 2019
@andresrc andresrc added the v7.4.0, test-plan, and needs testing notes labels Aug 12, 2019
Labels: enhancement, Metricbeat, needs testing notes, review, Team:Integrations, test-plan, v7.4.0
4 participants