Add a metric to provide missed events per type #1674
I left some comments :) The most important one being that it looks like we want to expose two different metric names here.
Signed-off-by: Anastasios Papagiannis <tasos.papagiannnis@gmail.com>
Thanks for the review! I have made the changes that you proposed.
looks good to me
Nice! I think this will help with debugging.
I have some small comments, PTAL.
Example:

$ curl localhost:2112/metrics 2> /dev/null | grep 'sent_events_total\|missed_events_total\|ringbuf_perf_event_lost_total\|ringbuf_queue_lost_total\|msg_op_total\|ringbuf_queue_received_total'
tetragon_missed_events_total{msg_op="13"} 73300
tetragon_missed_events_total{msg_op="23"} 28
tetragon_missed_events_total{msg_op="24"} 606
tetragon_missed_events_total{msg_op="5"} 20
tetragon_missed_events_total{msg_op="7"} 22
tetragon_msg_op_total{msg_op="13"} 4.268532e+06
tetragon_msg_op_total{msg_op="23"} 12444
tetragon_msg_op_total{msg_op="24"} 2110
tetragon_msg_op_total{msg_op="5"} 11908
tetragon_msg_op_total{msg_op="7"} 12447
tetragon_ringbuf_perf_event_lost_total 73976
tetragon_ringbuf_queue_lost_total 0
tetragon_ringbuf_queue_received_total 4.307441e+06

This PR adds an eBPF map collector for getting metrics directly from a map. The map records the return values of all perf_event_output calls (i.e. whether each call failed), which lets us determine missed events per type. The tetragon_missed_events_total metric exposes this information.

In the example above, user space reports 73976 lost events (tetragon_ringbuf_perf_event_lost_total). This is the same as the sum of all tetragon_missed_events_total metrics gathered from the kernel (73300 + 28 + 606 + 20 + 22 = 73976).

Signed-off-by: Anastasios Papagiannis <tasos.papagiannnis@gmail.com>
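The consistency check described in the commit message can be reproduced offline. The following is a minimal sketch, not part of the PR: it parses the metric lines quoted in the example, sums tetragon_missed_events_total across msg_op values, and compares the result with tetragon_ringbuf_perf_event_lost_total. The helper names (parse_samples, missed_by_op, metric_value) are hypothetical, and the parser assumes the simple `name{labels} value` form shown above, with no timestamps or escaped label values.

```python
import re

# Metric lines copied from the example output in the commit message.
SAMPLE = """\
tetragon_missed_events_total{msg_op="13"} 73300
tetragon_missed_events_total{msg_op="23"} 28
tetragon_missed_events_total{msg_op="24"} 606
tetragon_missed_events_total{msg_op="5"} 20
tetragon_missed_events_total{msg_op="7"} 22
tetragon_ringbuf_perf_event_lost_total 73976
tetragon_ringbuf_queue_lost_total 0
"""

MSG_OP_RE = re.compile(r'msg_op="([^"]*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples for each metric line.

    Assumes the plain `name{labels} value` layout from the example;
    comment lines starting with '#' and blank lines are skipped.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        name, _, labels = series.partition("{")
        samples.append((name, labels.rstrip("}"), float(value)))
    return samples


def missed_by_op(samples):
    """Map each msg_op label value to its tetragon_missed_events_total count."""
    result = {}
    for name, labels, value in samples:
        match = MSG_OP_RE.search(labels)
        if name == "tetragon_missed_events_total" and match:
            result[match.group(1)] = value
    return result


def metric_value(samples, wanted):
    """Return the value of the first sample named `wanted`, or None."""
    for name, _, value in samples:
        if name == wanted:
            return value
    return None


if __name__ == "__main__":
    samples = parse_samples(SAMPLE)
    kernel_missed = sum(missed_by_op(samples).values())
    user_lost = metric_value(samples, "tetragon_ringbuf_perf_event_lost_total")
    print("kernel-side missed:", kernel_missed)
    print("user-space lost:   ", user_lost)
    print("match:", kernel_missed == user_lost)
```

Against a live agent, the SAMPLE string could be replaced with the body returned by the metrics endpoint from the example; the two counters are read at slightly different moments there, so an exact match is not guaranteed at every scrape.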
Thanks! I think this will be very useful to have, which is why I added a backport label for 1.0.
Backport PR: #1702