-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recover expansion failure #106154
Recover expansion failure #106154
Conversation
efc63c8
to
199ea31
Compare
This PR may require API review. If so, when the changes are ready, complete the pre-review checklist and request an API review. Status of requested reviews is tracked in the API Review project. |
cc @kubernetes/sig-storage-api-reviews |
/remove-sig api-machinery |
/triage accepted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API changes look good overall, just small picks. When this is all LGTM I can approve.
@@ -472,6 +472,9 @@ type PersistentVolumeClaimSpec struct { | |||
// +optional | |||
Selector *metav1.LabelSelector `json:"selector,omitempty" protobuf:"bytes,4,opt,name=selector"` | |||
// Resources represents the minimum resources the volume should have. | |||
// If RecoverVolumeExpansionFailure is enabled users are allowed to specify resource requirements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unclear what If RecoverVolumeExpansionFailure is enabled
means - can you say something like "If the RecoverVolumeExpansionFailure feature gate is enabled..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
@@ -526,6 +529,25 @@ const ( | |||
PersistentVolumeClaimFileSystemResizePending PersistentVolumeClaimConditionType = "FileSystemResizePending" | |||
) | |||
|
|||
type PersistentVolumeClaimResizeStatus string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added +enum
here. I wonder if I should have waited for other feature, we will see.
|
||
const ( | ||
// When expansion is complete, the empty string is set by resize controller or kubelet. | ||
PersistentVolumeClaimNoExpansionInProgress PersistentVolumeClaimResizeStatus = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why empty string and not nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So my original idea was, I thought I needed an invariant which I can use as default (without risking dereferencing a nil
). This would also serve as a terminal condition when resizing is complete (for informing user). But I see how this can be problematic, I have marked this as a follow up item - #106429 before the feature goes beta.
// lowered if there are no expansion operations in progress and if the actual volume capacity | ||
// is equal or lower than the requested capacity. | ||
// This is an alpha field and requires enabling RecoverVolumeExpansionFailure feature. | ||
// +optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+featureGate=RecoverVolumeExpansionFailure
here and below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
// +optional | ||
AllocatedResources ResourceList `json:"allocatedResources,omitempty" protobuf:"bytes,5,rep,name=allocatedResources,casttype=ResourceList,castkey=ResourceName"` | ||
// ResizeStatus stores status of resize operation. | ||
// Defaults to nil but when expansion is complete resizeStatus is set to empty string by resize controller |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"nil" is not meaningful to API clients. How about "Is not set by default" or "Has no value by default"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
017b897
to
393451e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API changes LGTM
default: | ||
newSize = pvcSpecSize | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if pvSize.Cmp(pvcSpecSize) == 0, I am wondering the following logic should be the same? Or it could be skipped and do nothing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment below covers why the code is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the following logic for cases of pvSize.Cmp(pvcSpecSize) == 0 and pvSize.Cmp(pvcSpecSize) > 0 might be different. Checking the code below, it should be ok, just in case of pvSize.Cmp(pvcSpecSize) == 0, expansion should be already finished successfully there are some calls could be skipped (such as MarkControllerReisizeInProgress, ExpandVolumeDevice, and UpdatePVSize)?
I am ok to optimize this later..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing is - equality comparison of resource.Quantity
is sometimes problematic, esp. because size stored in PV is size reported by volume plugin after expansion. A plugin may align size request to GB/GiB/MB boundaries and return updated size which is greater than what user request by just a fraction and hence I am not special casing equality of pvSize
and pvcSpecSize
. But I agree we could revisit this later. I will update GH issue I have created to track feature going beta.
}, | ||
{ | ||
name: "pvc.spec.size = pv.spec.size, recover_expansion=on", | ||
pvc: getTestPVC("test-vol0", "1G", "1G", "", v1.PersistentVolumeClaimNoExpansionInProgress), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this case, I feel resize should not be triggered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is indeed odd. If there is no expansion in progress, should it just return?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this falls under - we should have not ended up in this state ever
What happened is - pvc.status.cap < pvc.spec.cap
and pvc.spec.cap
== pv.spec.cap
and yet resizeStatus
is empty. The test and condition exists for completeness purpose, because we still need to decide what to do when this happens.In this case IMO - it is best to let whole resize operation run and set appropriate resizeStatus
on pvc and let node adjust pvc.status.cap
if necessary. Since we don't shrink volumes ever and resize operations are idempotent, we should be fine and k8s should correct this weird state.
Edited - The reference to line# was incorrect, fixed it. But my main reasoning is still the case.
recoverFeatureGate: true, | ||
expectedResizeStatus: v1.PersistentVolumeClaimNodeExpansionPending, | ||
expectedAllocatedSize: resource.MustParse("2G"), | ||
expectResizeCall: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could also check expected pv.spec.size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have filed this as a follow up in - #106429 and I will make sure we test pv size in external-resizer.
case v1.PersistentVolumeClaimControllerExpansionInProgress: | ||
case v1.PersistentVolumeClaimControllerExpansionFailed: | ||
case v1.PersistentVolumeClaimNoExpansionInProgress: | ||
// This is case#2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This can be case #4 if user has shrunk the volume while controller expansion was in progress
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
almost LGTM from storage POV |
/lgtm |
393451e
to
1ddd598
Compare
/lgtm |
/retest |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gnufied, jsafrane, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/milestone v1.23 |
if volumeToMount.VolumeSpec.ReadOnly { | ||
simpleMsg, detailedMsg := volumeToMount.GenerateMsg("MountVolume.NodeExpandVolume failed", "requested read-only file system") | ||
klog.Warningf(detailedMsg) | ||
og.recorder.Eventf(volumeToMount.Pod, v1.EventTypeWarning, kevents.FileSystemResizeFailed, simpleMsg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the change here seems to generate warning event for all Pods which are mounting an PVC in read-only mode, regardless of whether there was an actual PVC/filesystem resize request or not. Didn't used to be the case, before v1.23.
Before this change, pvcStatusCap.Cmp(pvSpecCap) < 0
check used to consider resize only for volumes for which it was actually requested.
we wouldn't have noticed this, if this was just logging, but events are consumed by higher level tools, and is confusing our users, even though it is an non-issue.
Implement changes for recovering from volume expansion failure.
xref kubernetes/enhancements#1790
Note to reviewers - there is a detailed flow chart diagram attached in the KEP - https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1790-recover-resize-failure/expansion_flow.pdf which describes the full control flow.