Only metrics #2557
Changes LGTM, thanks @lnhanks!
	)
	eniIPsInUse = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "awscni_eni_util",
I think this metric should be named awscni_assigned_ip_per_eni, since it serves a similar purpose to the awscni_assigned_ip_per_cidr metric (except that the label value is eni instead of cidr).
Also, util is not a good abbreviation for "utilization" (I assume you mean utilization here :D).
@@ -122,6 +122,19 @@ var (
		},
		[]string{"cidr"},
	)
	noAvailableIPAddrs = prometheus.NewCounter(
		prometheus.CounterOpts{
			Name: "awscni_err_no_avail_addrs",
Maybe change to awscni_no_available_ip_addresses, which:
- aligns with the other metrics' use of "ip_addresses".
- drops "err" from the metric name, since this is expected behavior rather than an error.
- removes a nonstandard abbreviation (avail is not a standard abbreviation for available, IIRC).
* restore node update permission to master until image tag can be updated (#2513)
* Merge branch 'release-1.14' (#2517)
* network policies update to readme (#2478)
* init draft of network policy desc
* add security note
* fixup
* fixup
* fix placeholder link
* Update manifest for cni 1.14 (#2526)
* Mimic VPC-RC limit struture (#2516)
* limits api pkg (#2528)
* Update kops tests for 1.28 and fix generate-cni-yaml script (#2536)
* skip IPAMD events test (#2537)
* chore: remove refs to deprecated io/ioutil (#2541)
* Change default Node Agent ports for health and metrics (#2545)
* remove self-managed node group from pod-eni test suite (#2547)
* bump controller runtime to 0.16.1 (#2548)
  Co-authored-by: Joseph Chen <chenjoez@amazon.com>
* update agent image (#2554)
* fix(chart): Switch base64 encoded cniConfig.fileContents to the binaryData (#2552)
* Update the use of privileged flag in aws-vpc-cni manifest (#2555)
* increment default Calico version for helm compatibility (#2560)
* update nginx image (#2561)
* Only metrics (#2557)
  Prometheus metrics for capturing ENI IP usage and no available IP address errors
  Co-authored-by: Lindsay Hanks <lnhanks@dev-dsk-lnhanks-2a-167bac85.us-west-2.amazon.com>
* CHANGELOG, chart, and manifest updates for VPC CNI v1.15.0 release (#2563)
* remove calico test suite from weekly integration tests (#2559)
* remove addon-tests integration suite as it is no longer needed (#2564)
* Only metrics (#2569)
* rename warm pool metrics
  ---------
  Co-authored-by: Lindsay Hanks <lnhanks@dev-dsk-lnhanks-2a-167bac85.us-west-2.amazon.com>
* Fix unused version variable (#2566)
* Update example table 'Pod per Prefixes' value (#2573)
* Bandwidth plugin with NP is currently unsupported (#2572)
* Bandwidth plugin with NP
* Messaging review
* pass CNINode scheme to client only (#2570)
* reduce api calls (#2575)
* Add region flag to describe-addon command (#2576)
* add ENABLE_V4_EGRESS (#2577)
* Add test registry parameter for ipv6 and CNI full tests (#2585)
* update golang image (#2586)
* increase time for service readiness (#2587)
* do not patch CNINode for custom networking unless podENI is enabled (#2591)
* Remove self-managed node group from custom-networking suite (#2590)
* remove self-managed node group from custom-networking suite
* Select CNI manifest based on regions (#2593)
* Update metrics helper image url based on region (#2604)
* dependabot updates (#2605)
* Graceful termination for service connectivity tests (#2611)
* update CHANGELOG, charts, and manifests in master following v1.15.1 release (#2614)
* go module updates and golang builder image update (#2615)
* update Golang to 1.21.3 (#2616)
* Stricter dependency/security review (#2617)
* Stricter dependency/security review
  Signed-off-by: Davanum Srinivas <davanum@gmail.com>
* move common things to a separate file
  Signed-off-by: Davanum Srinivas <davanum@gmail.com>
  ---------
  Signed-off-by: Davanum Srinivas <davanum@gmail.com>
* update actions for go 1.21 and fix deps action warnings (#2618)
---------
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Co-authored-by: Jay Deokar <23660509+jaydeokar@users.noreply.github.com>
Co-authored-by: Geoffrey Cline <geoffreyc@outlook.com>
Co-authored-by: Joseph Chen <76720045+jchen6585@users.noreply.github.com>
Co-authored-by: guangwu <guoguangwu@magic-shield.com>
Co-authored-by: Joseph Chen <chenjoez@amazon.com>
Co-authored-by: Valentin Zayash <VLZZZ@users.noreply.github.com>
Co-authored-by: lnhanks <67074258+lnhanks@users.noreply.github.com>
Co-authored-by: Lindsay Hanks <lnhanks@dev-dsk-lnhanks-2a-167bac85.us-west-2.amazon.com>
Co-authored-by: 김은빈 <rlaisqls@gmail.com>
Co-authored-by: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>
Co-authored-by: Davanum Srinivas <davanum@gmail.com>
What type of PR is this?
Which issue does this PR fix:
n/a
What does this PR do / Why do we need it:
This PR adds two additional metrics to better visualize IP allocation. The first is a counter for "no available addresses" errors: this condition is currently only logged as an error, and turning it into a metric will help visualize how often an IP allocation fails because no addresses are available. The second is a gauge for ENI utilization, which expands on the existing ENIs allocated metric by partitioning the data by ENI ID and counting how many IP addresses are in use on each ENI. The PR also adds two log statements, emitted when an IP address is allocated or deallocated.
If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:
Testing done on this change:
Automation added to e2e:
No
Will this PR introduce any new dependencies?:
No
Will this break upgrades or downgrades. Has updating a running cluster been tested?:
Yes
Does this change require updates to the CNI daemonset config files to work?:
No
Does this PR introduce any user-facing change?:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.