Add ServiceMonitor for scraping gitops-controller in rancher-monitoring chart #4101

Merged

Contributor

@Tommy12789 Tommy12789 commented Jun 18, 2024

Issue:

The direct issue is rancher/fleet#2450

The PR that introduced metrics into fleet: rancher/fleet#2172

Problem

To enable further monitoring additions based on fleet metrics, Prometheus needs to scrape the gitops-controller. This requires an additional ServiceMonitor that points to the Kubernetes services created by the fleet chart, which in turn expose the gitops-controller metrics.

Solution

An additional ServiceMonitor needs to be created when the rancher-monitoring chart is installed, so that the Prometheus instance installed by the chart is automatically configured to scrape the gitops-controllers.

This enables further monitoring capabilities to be added to Rancher through the rancher-monitoring chart, for instance Prometheus alerts or Grafana dashboards. Such dashboards may be embedded into Rancher, just as the existing Grafana dashboards are embedded and displayed in the Rancher UI when the rancher-monitoring chart is installed.
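
For illustration only, a ServiceMonitor of the kind described above might look roughly like the sketch below. It is not the manifest merged in this PR: the resource name, namespace, label selector, and port name are all assumptions, and the real values depend on the services the fleet chart creates.

```yaml
# Hypothetical sketch of a ServiceMonitor for the gitops-controller.
# All names, labels, and ports are assumptions, not taken from this PR.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: monitoring-gitops-controller    # assumed resource name
  namespace: cattle-fleet-system        # assumed namespace
spec:
  # Restrict service discovery to the namespace the fleet chart installs into.
  namespaceSelector:
    matchNames:
      - cattle-fleet-system             # assumed namespace
  # Select the metrics Service(s) created by the fleet chart.
  selector:
    matchLabels:
      app: gitjob                       # assumed label on the metrics Service
  endpoints:
    - port: metrics                     # assumed name of the Service port
      path: /metrics
```

With sharding enabled in fleet, a label-based selector like this would presumably match one metrics Service per shard, provided those Services carry the same label; that is also an assumption.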

Testing

Engineering Testing

Manual Testing

Performed as described in Testing, including testing with sharding enabled in fleet.

Automated Testing

N/A

QA Testing Considerations

N/A

Regressions Considerations

The probability of this change introducing regressions is low, as it only extends existing functionality with a simple resource, which is part of the rancher-monitoring-crd chart.

Backporting considerations

This change does not need to be backported to other versions. This is a new feature in fleet and no plans exist to backport it.

@Tommy12789 Tommy12789 requested a review from a team as a code owner June 18, 2024 12:07
Validation steps

  • Ensure all container images have repository and tag on the same level, so that every container image is included in rancher-images.txt, which is used by airgap customers.
  Example:
    longhorn-controller:
      repository: rancher/hardened-sriov-cni
      tag: v2.6.3-build20230913
  
  • Add a 👍 (thumbs up) reaction to this comment once done. CI won't pass without this reaction to the github-action bot's latest validation comment.
  • Approve the PR to run the CI check.

@Tommy12789 Tommy12789 changed the title Gitops controller metric service monitor Add ServiceMonitor for scraping gitops-controller in rancher-monitoring chart Jun 19, 2024
@Tommy12789 Tommy12789 force-pushed the gitops-controller-metric-serviceMonitor branch from aaa07b2 to 594f1b6 Compare June 19, 2024 10:04
Contributor

p-se commented Jun 19, 2024

This PR enables scraping of the gitops-controller metrics that the performance dashboard will visualize. Metrics related to the fleet controller (including performance metrics) are already scraped, as a ServiceMonitor for the fleet controller has already been merged.

@p-se p-se force-pushed the gitops-controller-metric-serviceMonitor branch from 0a065ad to 9a714e1 Compare June 21, 2024 13:27
@p-se p-se force-pushed the gitops-controller-metric-serviceMonitor branch from 332d6d7 to 0a07b0b Compare June 21, 2024 14:10
@thehejik thehejik left a comment

Tested on v2.9-dcf32c221741b5b2d595d2c7ec2c44f5dd0c8fd2-head with rancher-monitoring:104.0.0-rc1+up45.31.1 and fleet:104.0.0+up0.10.0-rc.17.

A new Prometheus target, monitoring-gitops-controller, is present:
[screenshot: pasted_image]

The target can also be selected in the job dropdown of the Fleet/Controller-Runtime dashboard, and it shows graphs:
[screenshot: image]

Contributor

p-se commented Jun 21, 2024

Relates to rancher/fleet#2460

@thehejik thehejik merged commit bc88dd8 into rancher:dev-v2.9 Jun 25, 2024
6 checks passed
krunalhinguu pushed a commit to krunalhinguu/charts that referenced this pull request Jul 15, 2024
…ng chart (rancher#4101)

Co-authored-by: Patrick Seidensal <pseidensal@suse.com>
4 participants