Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add pending connections count to metrics #2713

Merged
merged 2 commits into from
Sep 24, 2024

Conversation

2color
Copy link
Contributor

@2color 2color commented Sep 22, 2024

Title

Description

Notes & open questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if necessary (this includes comments as well)
  • I have added tests that prove my fix is effective or that my feature works

@2color 2color requested a review from a team as a code owner September 22, 2024 19:29
@achingbrain
Copy link
Member

achingbrain commented Sep 23, 2024

Given that this value can change rapidly between scrapes, do you think this might be better off as a counter rather than a gauge?

It should probably also have the _total suffix - see Metric and Label naming.

@2color
Copy link
Contributor Author

2color commented Sep 23, 2024

Counters can't go down. So we can't use them here since pending requests can go both up and down.

https://prometheus.io/docs/tutorials/understanding_metric_types/#counter

@achingbrain
Copy link
Member

achingbrain commented Sep 23, 2024

Counters can't go down

This is true - but when graphing them, we can use a rate() function to show their change over time. This way we don't miss spikes between scrapes.

In Graphana you add a Range Function and select Rate:

image

@2color
Copy link
Contributor Author

2color commented Sep 23, 2024

To pick between counter and gauge, there is a simple rule of thumb: if the value can go down, it is a gauge.
Counters can only go up (and reset, such as when a process restarts). They are useful for accumulating the number of events, or the amount of something at each event. For example, the total number of HTTP requests, or the total number of bytes sent in HTTP requests. Raw counters are rarely useful. Use the rate() function to get the per-second rate at which they are increasing. https://prometheus.io/docs/practices/instrumentation/#counter-vs-gauge-summary-vs-histogram

If at t(0) we have 100 pending connections and at t(1) we have 80 pending connections, how can capture that with a counter?

This way we don't miss spikes between scrapes.

I think catching spikes is a matter of the scrape interval. We could add a counter to track the total incoming connections (prior to upgrade unlike libp2p_connection_manager_connections which only counts upgraded connections), but that would still not necessarily help catch spikes between scrapes.

@wemeetagain
Copy link
Member

Are pending connections only inbound?

We could use a histogram or summary for the # of pending connections observed when a new inbound connection is triggered.

We can also have a histogram/summary to track time from pending to fully connected.

@achingbrain
Copy link
Member

How about adding the pending connections count to the existing inbound/outbound connections metric as a label?

Then it'd graph with those automatically which might make things easier to observe? This is similar to the "inbound unnegotiated" and "outbound unnegotiated" stream metrics.

@wemeetagain
Copy link
Member

How about adding the pending connections count to the existing inbound/outbound connections metric as a label?

Sounds good to me from a consistency perspective.

Copy link
Contributor Author

@2color 2color left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the latest changes

@achingbrain achingbrain changed the title feat: add gauge for pending connections feat: add pending connections count to metrics Sep 24, 2024
@achingbrain achingbrain merged commit b3272cf into main Sep 24, 2024
21 checks passed
@achingbrain achingbrain deleted the feat/add-pending-conns-metric branch September 24, 2024 06:15
@achingbrain achingbrain mentioned this pull request Sep 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants