[Bug Fix] AttachVolume error where volume region is empty #301

Merged
merged 5 commits into from
Nov 12, 2024

Contributor

@komer3 komer3 commented Nov 6, 2024

volumeContext[VolumeTopologyRegion] was not a reliable source for the volume's region. We saw multiple instances where it returned an empty string, which caused the region mismatch failure. Validating against the volume object returned by the API is a more robust approach, since the LinodeVolume object returned by the API will always have the correct region field.

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Pull Request Guidelines:

  1. Does your submission pass tests?
  2. Have you added tests?
  3. Are you addressing a single feature in this PR?
  4. Are your commits atomic, addressing one change per commit?
  5. Are you following the conventions of the language?
  6. Have you saved your large formatting changes for a different PR, so we can focus on your work?
  7. Have you explained your rationale for why this feature is needed?
  8. Have you linked your PR to an open issue?

…e to get volumes region. Switching to just use volume obj returned by API to validate is a better and more robust approach
@komer3 komer3 requested review from a team as code owners November 6, 2024 15:36

codecov bot commented Nov 6, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 74.83%. Comparing base (12e5659) to head (b40379a).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
internal/driver/controllerserver_helper.go 33.33% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #301   +/-   ##
=======================================
  Coverage   74.83%   74.83%           
=======================================
  Files          22       22           
  Lines        2356     2356           
=======================================
  Hits         1763     1763           
  Misses        491      491           
  Partials      102      102           


Contributor

@AshleyDumaine AshleyDumaine left a comment

LGTM

Contributor

@srust srust left a comment

What if the instance region and the volume region are actually different? How does a user recover from this situation, or get themselves into this situation?

Contributor Author

komer3 commented Nov 8, 2024

@srust The region mismatch error serves as a critical validation check (effectively a double check). It typically surfaces only in specific multi-region scenarios where the cluster configuration is incomplete.

Primary Scenario

This issue occurs when:

  • The multi-region feature is enabled.
  • Cluster nodes lack proper region labels, so the CSI driver cannot accurately determine the target region for volume creation.

Example Case

Consider this setup:

  • Control plane with CSI driver in ORD
  • Remote worker node pools in MIA and IAD
  • Nodes in IAD missing region labels
  • User creates a deployment with PVCs targeted for IAD

In this case, the CSI driver defaults to creating the volume in ORD (fallback region) instead of the intended IAD region, triggering the mismatch error.
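
The fallback behaviour in this example can be sketched roughly as follows. This is an illustration under assumptions, not the driver's implementation: `targetRegion` is a hypothetical helper, the region values are illustrative, and whether the driver reads the well-known Kubernetes label `topology.kubernetes.io/region` or a different key is also an assumption.

```go
package main

import "fmt"

// regionLabel is the well-known Kubernetes topology label. That the driver
// reads this particular key is an assumption made for this sketch.
const regionLabel = "topology.kubernetes.io/region"

// targetRegion is a hypothetical helper: it returns the node's labelled region,
// or the fallback region when the label is missing or empty.
func targetRegion(nodeLabels map[string]string, fallback string) string {
	if region := nodeLabels[regionLabel]; region != "" {
		return region
	}
	return fallback
}

func main() {
	const controlPlaneRegion = "us-ord" // fallback: where the CSI driver runs

	labelledNode := map[string]string{regionLabel: "us-mia"}
	unlabelledNode := map[string]string{} // e.g. an IAD node missing its region label

	fmt.Println(targetRegion(labelledNode, controlPlaneRegion))   // prints "us-mia"
	fmt.Println(targetRegion(unlabelledNode, controlPlaneRegion)) // prints "us-ord"
}
```

With the unlabelled node, the volume would be created in the fallback region while the instance lives elsewhere, which is exactly the condition the mismatch check is meant to catch.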

Note: This error should only manifest when utilizing the CSI driver's multi-region capabilities.

This error should not be triggered by how we currently use and deploy the CSI driver. Replacing the use of VolumeContext should make this check more reliable, so it should no longer trigger except in very specific cases such as the one described above.

Hope that answers your question! :)

@komer3 komer3 merged commit d6ec6a8 into main Nov 12, 2024
7 of 8 checks passed
@komer3 komer3 deleted the volume-attach-fix branch November 12, 2024 18:57

3 participants