[Feat] Node Server Metrics Implementation #278

Merged
merged 35 commits into from
Oct 16, 2024

Conversation

@prajwalvathreya prajwalvathreya commented Oct 2, 2024

This PR adds metrics for the functions performed by the node server. Metrics are implemented for the following functions:

NodePublishVolume
NodeUnpublishVolume
NodeStageVolume
NodeUnstageVolume
NodeExpandVolume

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Pull Request Guidelines:

  1. Does your submission pass tests?
  2. Have you added tests?
  3. Are you addressing a single feature in this PR?
  4. Are your commits atomic, addressing one change per commit?
  5. Are you following the conventions of the language?
  6. Have you saved your large formatting changes for a different PR, so we can focus on your work?
  7. Have you explained your rationale for why this feature is needed?
  8. Have you linked your PR to an open issue?

codecov bot commented Oct 2, 2024

Codecov Report

Attention: Patch coverage is 55.05618% with 40 lines in your changes missing coverage. Please review.

Project coverage is 75.31%. Comparing base (0b18622) to head (9f86018).
Report is 1 commit behind head on main.

Files with missing lines        Patch %   Lines
internal/driver/nodeserver.go   29.72%    26 Missing ⚠️
internal/driver/server.go       67.74%    8 Missing and 2 partials ⚠️
main.go                         0.00%     4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #278      +/-   ##
==========================================
- Coverage   76.33%   75.31%   -1.03%     
==========================================
  Files          21       22       +1     
  Lines        1644     1730      +86     
==========================================
+ Hits         1255     1303      +48     
- Misses        289      325      +36     
- Partials      100      102       +2     


@prajwalvathreya prajwalvathreya marked this pull request as ready for review October 14, 2024 18:24
@prajwalvathreya prajwalvathreya requested review from a team as code owners October 14, 2024 18:24
@prajwalvathreya prajwalvathreya changed the title from "- added prometheus scraping to one function in nodeserver.go to test …" to "- Node Server Metrics endpoint" Oct 14, 2024
- fixed `ineffassign` error for variable `success`
@prajwalvathreya prajwalvathreya changed the title from "- Node Server Metrics endpoint" to "- [Feat] Node Server Metrics Implementation" Oct 14, 2024
@komer3 komer3 changed the title from "- [Feat] Node Server Metrics Implementation" to "[Feat] Node Server Metrics Implementation" Oct 14, 2024
volumeID := req.GetVolumeId()
log.V(2).Info("Processing request", "volumeID", volumeID)

ns.mux.Lock()
defer ns.mux.Unlock()

success := metrics.SuccessTrue

Contributor

What does this `success` variable indicate? That the controller was successful in publishing the volume (basically, whatever task it was attempting)?

Contributor Author

Yes, the variable indicates whether the function completed. `true` means it completed with no errors. `false` indicates that there was an error and that the function is being executed again until it completes.

Contributor

The function completes on line 156 though. Or am I missing something?
Also, can we rename `start` and `success` in all the methods to something more descriptive?

Contributor Author

Yes, it does. Initially the `success` variable is set to true, but on each failure it is set to false and recorded in Prometheus's registry. If execution reaches line 156 without any errors, `success` is still true, meaning the function completed.

Sure, I'll rename the variables.
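
The status-flag pattern discussed in this thread can be sketched as follows. This is a minimal illustration using only the standard library, not the code from the PR: `recorder`, `observe`, `publishVolume`, and the `statusTrue`/`statusFalse` constants are hypothetical stand-ins for the driver's metrics package and a Prometheus collector. The flag starts as "true", each error path flips it to "false", and a deferred closure records the final value together with the elapsed time.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Status label values. The PR refers to metrics.SuccessTrue; the names
// used here are assumptions made for this sketch.
const (
	statusTrue  = "true"
	statusFalse = "false"
)

// recorder is a hypothetical stand-in for a Prometheus counter and
// duration metric, keyed by "function/status".
type recorder struct {
	mu     sync.Mutex
	counts map[string]int
	last   map[string]time.Duration
}

func newRecorder() *recorder {
	return &recorder{counts: map[string]int{}, last: map[string]time.Duration{}}
}

// observe records one call of fn with the given status and duration.
func (r *recorder) observe(fn, status string, d time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	key := fn + "/" + status
	r.counts[key]++
	r.last[key] = d
}

// publishVolume mimics the shape of a node server method: the status flag
// starts as "true" and is flipped to "false" on every error path.
func publishVolume(r *recorder, volumeID string) error {
	functionStartTime := time.Now()
	functionStatus := statusTrue

	// The closure reads functionStatus when the function returns, so it
	// sees the final value rather than the initial one.
	defer func() {
		r.observe("NodePublishVolume", functionStatus, time.Since(functionStartTime))
	}()

	if volumeID == "" {
		functionStatus = statusFalse
		return errors.New("volume ID must not be empty")
	}
	return nil
}

func main() {
	r := newRecorder()
	_ = publishVolume(r, "vol-1") // recorded with status "true"
	_ = publishVolume(r, "")      // recorded with status "false"
	fmt.Println(r.counts["NodePublishVolume/true"], r.counts["NodePublishVolume/false"])
}
```

The closure matters here: writing `defer r.observe("NodePublishVolume", functionStatus, ...)` directly would evaluate `functionStatus` at the point of the `defer` statement, so every call would be recorded as "true".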

- refactored `start` to `functionStartTime`
- refactored `success` to `functionStatus`
- added service to expose the metrics to prometheus
- defaulted the node-server port to 10251
- updated documentation with the new helm command
- updated daemonset to pick up the environment variables to enable the metrics
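
As a rough illustration of the commits above, the sketch below shows one way a process could enable a metrics endpoint from environment variables and fall back to port 10251. The variable names `ENABLE_METRICS` and `METRICS_PORT`, the `metricsAddr` helper, and the placeholder `/metrics` handler are assumptions made for this example; the names actually used by the driver and its Helm chart are not shown in this conversation.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"strconv"
)

// defaultMetricsPort matches the default node-server port named in the commit list.
const defaultMetricsPort = "10251"

// metricsAddr returns the listen address and whether metrics are enabled.
// ENABLE_METRICS and METRICS_PORT are hypothetical variable names.
func metricsAddr(getenv func(string) string) (string, bool) {
	enabled, err := strconv.ParseBool(getenv("ENABLE_METRICS"))
	if err != nil || !enabled {
		// Unset or unparsable values are treated as "disabled".
		return "", false
	}
	port := getenv("METRICS_PORT")
	if port == "" {
		port = defaultMetricsPort
	}
	return ":" + port, true
}

func main() {
	addr, ok := metricsAddr(os.Getenv)
	if !ok {
		fmt.Println("metrics disabled")
		return
	}

	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, _ *http.Request) {
		// Placeholder body; a real driver would serve its Prometheus
		// registry here instead.
		fmt.Fprintln(w, "# metrics placeholder")
	})

	fmt.Println("serving metrics on", addr)
	if err := http.ListenAndServe(addr, mux); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Passing the environment lookup in as a function keeps the port-resolution logic testable without touching the process environment.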

@komer3 komer3 left a comment

LGTM! Great work on this!

@prajwalvathreya prajwalvathreya merged commit 7052bd1 into main Oct 16, 2024
8 of 10 checks passed
3 participants