PVC reconciliation error causes deletion and data loss #730
Comments
Hi @mcgrawia, the operator supports PVC resize if your storage class is expandable. This is a basic use case. Managing PVCs and other CHI-related objects outside of the operator is strongly discouraged. Since 0.14.0 the operator reconciles the CHI definition with the actual state in k8s, so if any mismatch in a PVC is detected it tries to 'fix' it. There is a guard against PVC deletion that you can enable in the current version: use 'reclaimPolicy: Retain' in the PVC definition in the CHI. This is not a standard k8s attribute for PVCs; it only works inside the operator. It protects volumes from being deleted. Note that in this case volumes will not be deleted even when dropping replicas or the whole cluster. That said, the upcoming version 0.15.0 addresses your issue better: the operator will not delete a PVC if it fails to modify it for whatever reason.
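For illustration, the 'reclaimPolicy: Retain' guard could look roughly like the fragment below. This is a minimal sketch, not the reporter's actual manifest: the CHI name, the cluster name, the access mode, and the 1Gi size are assumptions made for the example; only the template name `data-storage-vc-template` and the `reclaimPolicy: Retain` setting come from this thread.

```yaml
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "example"                      # hypothetical CHI name
spec:
  defaults:
    templates:
      # Point the data volume at the template defined below.
      dataVolumeClaimTemplate: data-storage-vc-template
  configuration:
    clusters:
      - name: "example-cluster"        # hypothetical cluster name
  templates:
    volumeClaimTemplates:
      - name: data-storage-vc-template
        # Operator-level setting, not a standard k8s PVC attribute:
        # asks the operator to keep the PVC instead of deleting it.
        reclaimPolicy: Retain
        spec:
          accessModes:
            - ReadWriteOnce            # assumed access mode
          resources:
            requests:
              storage: 1Gi             # assumed size
```

As noted in the comment above, with this setting the volumes also remain when replicas or the whole cluster are dropped, so they have to be cleaned up manually.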
Hi @alex-zaitsev, thanks for the response. It makes sense that the operator now reconciles with the K8s state. Do you have any suggestions on how we can upgrade our prod CHI to prevent the PVC deletion? We first deployed the cluster with operator 0.11 or 0.12, when it did not support PVC resizing, and we followed this issue to manually increase our PVCs: #155 (comment).
Hi @mcgrawia, storage re-scaling has been supported since operator 0.10 (check the release notes). To protect your PVC from deletion, you have two options, as I suggested earlier:
Apologies @alex-zaitsev! I completely misread that prior message. The
Hi operator team,
My team and I just came across an issue with the CHOP: if there is a PVC reconciliation error, the CHOP deletes the volume, causing data loss.
How to reproduce:
Create a CHI with the following using PVCs:
This should stand up as expected. Once it's available, edit the above definition so that the `data-storage-vc-template` requests `storage: 2Gi` and reapply. Since the Docker for Mac PVC provisioner does not support dynamic resizing, this should show the following error:
Shortly after that, the operator will attempt to delete the PVC:
When the operator is finished, you can see the PVC is "Terminating" in kubectl:
This is a significant issue because the next time the operator attempts to update the statefulset, it will delete the pod and wipe the data. This is especially significant for our production cluster: the operator did not previously support PVC resizing, so we increased the PVCs manually. This means the next time we deploy to prod, the operator is going to delete all of our prod volumes because they do not match the CHI spec.
It seems in previous versions of the operator, this was not the behavior. We had deployed many times with mismatched PVC sizes with no issue. Is there anything we can do to prevent our prod cluster from being wiped or are we forced to downgrade the operator to a previous version?
Digging through the commits, it seems like the issue could have been introduced in da43217 when the `return` was added in `reconcilePVCs()`. The `return` prevents the PVC from being registered and causes it to not show up in the `need` list of resources here.
Thanks!