fix: limit should always be the total #310

Closed
guilhem wants to merge 1 commit into main from pvc_limit

Conversation

Contributor

@guilhem guilhem commented Nov 20, 2024

When a node already has the maximum number of PVCs attached, it reports 0, which means "no limit".

https://github.com/container-storage-interface/spec/blob/master/spec.md#nodegetinfo

  // Maximum number of volumes that controller can publish to the node.
  // If value is not set or zero CO SHALL decide how many volumes of
  // this type can be published by the controller to the node. The
  // plugin MUST NOT set negative values here.
  // This field is OPTIONAL.
  int64 max_volumes_per_node = 2;
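
For illustration, a minimal sketch (hypothetical constant and node ID, not this driver's values) of where this field surfaces in the Go CSI bindings. NodeGetInfoResponse carries a single total, so pre-subtracting attached volumes makes it collapse to 0 on a full node:

package main

import (
	"fmt"

	csi "github.com/container-storage-interface/spec/lib/go/csi"
)

// Hypothetical per-node attachment limit; the real value is platform-specific.
const maxVolumeAttachments = 8

// nodeGetInfo reports the node's total volume capacity, not the capacity
// remaining after current attachments.
func nodeGetInfo(nodeID string) *csi.NodeGetInfoResponse {
	return &csi.NodeGetInfoResponse{
		NodeId:            nodeID,
		MaxVolumesPerNode: maxVolumeAttachments,
	}
}

func main() {
	fmt.Println(nodeGetInfo("node-1").GetMaxVolumesPerNode()) // prints 8
}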

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Pull Request Guidelines:

  1. Does your submission pass tests?
  2. Have you added tests?
  3. Are you addressing a single feature in this PR?
  4. Are your commits atomic, addressing one change per commit?
  5. Are you following the conventions of the language?
  6. Have you saved your large formatting changes for a different PR, so we can focus on your work?
  7. Have you explained your rationale for why this feature is needed?
  8. Have you linked your PR to an open issue?

@guilhem guilhem requested review from a team as code owners November 20, 2024 17:46
Contributor

@komer3 komer3 left a comment

Hi, could you provide more context behind this change and the reasoning for it?

@guilhem
Contributor Author

guilhem commented Nov 20, 2024

Hi @komer3,
I added the CSI specification to the PR description :)

Basically, when your node has a limit of 8 and already has 8 PVs mounted, it reports max_volumes_per_node = 0, which means "no limit".

Here the CSI driver seems to be trying to replace the scheduler's decision :)

I'm also adding the GCP CSI driver code for comparison:
https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/a28f8d39b21ca8439d64445da33ed3d54a4dc67c/pkg/gce-pd-csi-driver/node.go#L664

https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/blob/a28f8d39b21ca8439d64445da33ed3d54a4dc67c/pkg/gce-pd-csi-driver/node.go#L507
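
To make the failure mode concrete, a minimal sketch with hypothetical names (not the driver's actual code): subtracting the already-attached volumes from the limit drives the reported value to 0 on a full node, which the spec treats as "no limit".

package main

import "fmt"

const attachmentLimit = 8 // hypothetical per-node limit

// reportedToday mirrors the current behaviour: limit minus attached volumes.
func reportedToday(attachedVolumes int) int64 {
	return int64(attachmentLimit - attachedVolumes)
}

// specIntended mirrors the proposed behaviour: always report the total limit.
func specIntended() int64 {
	return attachmentLimit
}

func main() {
	fmt.Println(reportedToday(8)) // 0 -> read by the CO as "unlimited"
	fmt.Println(specIntended())   // 8 -> the scheduler subtracts attachments itself
}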

@dthorsen

Is it possible that a Linode could have block storage attachments outside of those managed by the CSI driver? I don't think this is possible via CAPL-provisioned machines, for example.

@guilhem
Contributor Author

guilhem commented Nov 20, 2024

Yes, I see in the tests that it expects 7 on empty nodes, because we already have 1 PV mounted.

@komer3
Contributor

komer3 commented Nov 20, 2024

Ah I see. Thanks for providing the context!

We don't want to completely remove the listInstanceDisk() call. The reason is that 8 GB nodes can only have 7 additional PVCs attached, given the 1 boot disk that comes with them (listInstanceDisk will return more than 1 if we have some PVCs attached to that node). We want to provide the exact number of allowed volume attachments at that given time. If we remove that logic, we would be returning the theoretical maximum volume attachment number, which would be an incorrect representation of how many volumes can be attached. I believe this would result in more scheduling errors.

For more context on why we have this logic please check this issue: #182

And check the corresponding PR here: #184

@komer3
Contributor

komer3 commented Nov 20, 2024

Actually, let me correct myself here: the swap disk also counts towards the total allowed disk attachment limit.

From akamai tech docs (https://techdocs.akamai.com/cloud-computing/docs/block-storage#limits-and-considerations):
A combined total of 8 storage devices can be attached to a Compute Instance at the same time, including local disks and Block Storage volumes. For example, if your Compute Instance has two main disks, root and swap, you can attach no more than 6 additional volumes to this Compute Instance.

Anyway, I think the problem we need to solve here is that when we have no more room for disk attachments, we see the value being set to 0, which may be incorrect.

Do we know an alternative value (other than 0) that indicates to the CO that we can no longer schedule workloads with PVCs for that particular node (no more room for attachments)? I haven't been able to find one.

@guilhem guilhem force-pushed the pvc_limit branch 2 times, most recently from af014ae to f596131 Compare November 20, 2024 18:38
@guilhem
Contributor Author

guilhem commented Nov 20, 2024

@komer3 I updated the change myself: it now uses the label prefix to filter out PVs created by our driver :)

Comment on lines 391 to 402
// Filter out disks that are created by the driver
// to avoid counting them towards the max volume attachments
// that can be attached to the instance.
filterdDisks := slices.DeleteFunc(disks, func(disk linodego.InstanceDisk) bool {
// Check if the disk label starts with the volume label prefix
return strings.HasPrefix(disk.Label, ns.driver.volumeLabelPrefix)
})
Contributor

I'm not sure we want to filter out the volumes that are attached to the node here. Wouldn't this result in an incorrect number of volumes that can be further attached?

Collaborator

The CSI spec is unfortunately not as explicit as it could be. However, it seems intended that you return the maximum # of volumes that can be attached (irrespective of how many volumes are currently attached).

Example of the AWS EBS CSI driver calculating the volume limit

Collaborator

You can see here that the kube-scheduler calculates how many volumes are currently attached and compares that to the limit (which we return here) of how many total volumes can be attached.
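
A rough sketch of the accounting described here, with hypothetical names rather than the actual kube-scheduler code: the scheduler counts attached volumes itself and compares against the driver-reported limit, and a limit of 0 switches the check off entirely.

package main

import "fmt"

// fitsOnNode reports whether a pod requesting newVolumes additional volumes
// fits on a node that already has attachedVolumes, given the driver-reported
// limit from NodeGetInfo.
func fitsOnNode(attachedVolumes, newVolumes int, reportedLimit int64) bool {
	if reportedLimit == 0 {
		// Per the CSI spec, 0 means the limit is unset and the CO decides.
		return true
	}
	return int64(attachedVolumes+newVolumes) <= reportedLimit
}

func main() {
	fmt.Println(fitsOnNode(7, 1, 8)) // true: this pod fills the node
	fmt.Println(fitsOnNode(8, 1, 8)) // false: no room left
	fmt.Println(fitsOnNode(8, 1, 0)) // true: a full node reporting 0 looks unlimited
}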

Contributor

Ah, good catch. Thanks for sharing that. Looking into it further, I agree with you. We should make sure that we do not count the attached volumes (only the disks: boot and swap, or just boot in LKE).

But I believe we are already doing that correctly. ListInstanceDisk() only returns the disks and does not include the volumes attached to the node, so we don't really need to filter for them.
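
Putting the two points together, a minimal sketch of the value the discussion converges on (the constant and function names are assumptions, not the PR's actual diff): report the per-instance attachment limit minus the local disks returned by ListInstanceDisks, and leave attached Block Storage volumes for the scheduler to count.

package main

import (
	"fmt"

	"github.com/linode/linodego"
)

// maxAttachments is the combined per-instance limit on local disks plus
// Block Storage volumes (8, per the Akamai docs quoted earlier).
const maxAttachments = 8

// maxVolumesPerNode computes NodeGetInfoResponse.MaxVolumesPerNode. disks is
// assumed to be the result of ListInstanceDisks, which returns only local
// disks (boot, and swap if present), not attached volumes.
func maxVolumesPerNode(disks []linodego.InstanceDisk) int64 {
	return int64(maxAttachments - len(disks))
}

func main() {
	// LKE node with just a boot disk: room for 7 volumes, constant over time.
	fmt.Println(maxVolumesPerNode([]linodego.InstanceDisk{{Label: "boot"}}))
	// Node with boot + swap disks: 6, matching the Akamai docs example.
	fmt.Println(maxVolumesPerNode([]linodego.InstanceDisk{{Label: "boot"}, {Label: "swap"}}))
}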

codecov bot commented Nov 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.79%. Comparing base (4ecf4c4) to head (65e681c).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #310   +/-   ##
=======================================
  Coverage   74.79%   74.79%           
=======================================
  Files          22       22           
  Lines        2396     2396           
=======================================
  Hits         1792     1792           
  Misses        499      499           
  Partials      105      105           

@guilhem guilhem marked this pull request as draft November 20, 2024 20:48
Collaborator

@wbh1 wbh1 left a comment

Good catch recognizing that we were implementing the CSI spec incorrectly and not returning a consistent maximum number of volumes that can be attached!

Example of AWS's EBS CSI driver calculating the volume limit, for the curious

// Filter out disks that are created by the driver
// to avoid counting them towards the max volume attachments
// that can be attached to the instance.
filterdDisks := slices.DeleteFunc(disks, func(disk linodego.InstanceDisk) bool {
Collaborator

Suggested change:
- filterdDisks := slices.DeleteFunc(disks, func(disk linodego.InstanceDisk) bool {
+ filteredDisks := slices.DeleteFunc(disks, func(disk linodego.InstanceDisk) bool {

Typo here (and other locations where the variable is referenced)

@guilhem guilhem closed this Nov 20, 2024
@linode linode deleted a comment from j-zimnowoda Nov 20, 2024