added telemetry with most common error from agent logs #146107
Merged: juliaElastic merged 4 commits into elastic:main from juliaElastic:telemetry/agent-logs-errors on Nov 29, 2022
@elasticmachine merge upstream
Pinging @elastic/fleet (Team:Fleet)
@elasticmachine merge upstream
💚 Build Succeeded
kpollich approved these changes on Nov 28, 2022
LGTM! Awesome change.
juliaElastic added a commit to juliaElastic/kibana that referenced this pull request on Nov 29, 2022
## Summary

Closes elastic/ingest-dev#1261

Merged: [elasticsearch change](elastic/elasticsearch#91701) to give kibana_system the missing privilege to read logs-elastic_agent* indices.

## Top 3 most common errors in the Elastic Agent logs

Added most common elastic-agent and fleet-server logs to telemetry.

Using a query of message field using sampler and categorize text aggregation. This is a workaround as we can't directly do aggregation on `message` field.

```
GET logs-elastic_agent*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "term": { "log.level": "error" } },
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ]
    }
  },
  "aggregations": {
    "message_sample": {
      "sampler": { "shard_size": 200 },
      "aggs": {
        "categories": {
          "categorize_text": { "field": "message", "size": 10 }
        }
      }
    }
  }
}
```

Tested with latest Elasticsearch snapshot, and verified that the logs are added to telemetry:

```
{
  "agent_logs_top_errors": [
    "failed to dispatch actions error failed reloading q q q nil nil config failed reloading artifact config for composed snapshot.downloader failed to generate snapshot config failed to detect remote snapshot repo proceeding with configured not an agent uri",
    "fleet-server stderr level info time message No applicable limit for agents using default \\n level info time message No applicable limit for agents using default \\n",
    "stderr panic close of closed channel n ngoroutine running Stop"
  ],
  "fleet_server_logs_top_errors": [
    "Dispatch abort response",
    "error while closing",
    "failed to take ownership"
  ]
}
```

Did some measurements locally, and the query took a few ms only. I'll try to check with larger datasets in elastic agent logs too.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
juliaElastic added a commit that referenced this pull request on Nov 29, 2022
For reference, this actually made it into 8.6.0.
Labels
backport:skip
This commit does not require backporting
release_note:skip
Skip the PR/issue when compiling release notes
Team:Fleet
Team label for Observability Data Collection Fleet team
v8.6.1
v8.7.0
Summary
Closes https://github.com/elastic/ingest-dev/issues/1261
Merged: elasticsearch change to give kibana_system the missing privilege to read logs-elastic_agent* indices.
Top 3 most common errors in the Elastic Agent logs
Added the most common elastic-agent and fleet-server error logs to telemetry.
The query runs against the `message` field using a sampler aggregation with a nested `categorize_text` aggregation. This is a workaround, as we can't aggregate directly on the `message` field.
Tested with the latest Elasticsearch snapshot and verified that the logs are added to telemetry.
Did some measurements locally, and the query took only a few ms. I'll also try to check with larger datasets in the Elastic Agent logs.
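To make the sampler plus `categorize_text` approach concrete, here is a minimal TypeScript sketch of how the request body could be built and how the top categories could be read from the response. The names `buildTopErrorsQuery`, `extractTopErrors`, and the response interfaces are hypothetical and are not taken from the Kibana source; the bucket shape (`key`, `doc_count`) is assumed from how Elasticsearch aggregations usually report buckets.

```typescript
// Hypothetical sketch: helper names and types are illustrative only,
// not the code merged in this PR.

interface CategoryBucket {
  key: string; // the categorized message text
  doc_count: number; // number of sampled log lines in this category
}

interface TopErrorsResponse {
  aggregations?: {
    message_sample?: {
      categories?: { buckets: CategoryBucket[] };
    };
  };
}

// Builds the search body: error-level logs from the last hour,
// sampled per shard, then grouped by categorize_text on `message`.
function buildTopErrorsQuery(shardSize = 200, categoryCount = 10) {
  return {
    size: 0,
    query: {
      bool: {
        must: [
          { term: { 'log.level': 'error' } },
          { range: { '@timestamp': { gte: 'now-1h' } } },
        ],
      },
    },
    aggregations: {
      message_sample: {
        sampler: { shard_size: shardSize },
        aggs: {
          categories: {
            categorize_text: { field: 'message', size: categoryCount },
          },
        },
      },
    },
  };
}

// Returns the keys of the `top` most frequent categories.
// Sorting here is defensive; it does not rely on the order of the response.
function extractTopErrors(response: TopErrorsResponse, top = 3): string[] {
  const buckets = response.aggregations?.message_sample?.categories?.buckets ?? [];
  return [...buckets]
    .sort((a, b) => b.doc_count - a.doc_count)
    .slice(0, top)
    .map((bucket) => bucket.key);
}
```

The body returned by `buildTopErrorsQuery()` would be sent to `logs-elastic_agent*/_search`, and `extractTopErrors` would then reduce the response to the three strings reported in telemetry. A missing or empty aggregation yields an empty array, so nothing is reported rather than the collector failing.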
Checklist
- Unit or functional tests were updated or added to match the most common scenarios