[BUG] - Upgrading an existing Nebari AWS environment to 2023.7.1 causes the cluster to be destroyed/recreated #1884
Comments
Thanks @sblair-metrostar for reporting this! For now, we've updated the release notes to reflect this breaking change.
@sblair-metrostar regarding "Option 2: A check is made prior to deploying changes which would block deployment in destructive scenarios such as this." I think regardless of what we end up with, this would be helpful to have, since, as you've said, a single change can have a huge effect. We should probably also establish things that cannot be deleted. E.g.:
These checks would be easy enough to include within the
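On the "things that cannot be deleted" point, Terraform already ships a per-resource guard that could back such a list. A minimal sketch, assuming a hypothetical EKS cluster resource (the resource name, cluster name, and variables here are illustrative, not Nebari's actual configuration):

```hcl
# Sketch only: mark a critical resource so Terraform refuses to plan its destruction.
# Resource and variable names are placeholders, not Nebari's actual configuration.
resource "aws_eks_cluster" "this" {
  name     = "nebari-cluster"
  role_arn = var.cluster_role_arn

  vpc_config {
    subnet_ids = var.subnet_ids
  }

  lifecycle {
    # Any plan that would destroy this resource fails with an error instead.
    prevent_destroy = true
  }
}
```

Note that `prevent_destroy` stops `terraform apply` with an error rather than silently skipping the change, so it behaves like the hard block described above.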
@costrouc Agreed, I think a destruction safety check would be tremendously valuable. I only referred to it as an option because, at least in my mind, it's the much harder one to implement in the general case; it may not be so bad if it's just a one-off check for known destructive PRs like this one. Unfortunately, just checking for it still leaves the user having to manually remediate whatever the situation is, or to brace for a backup/restore in order to complete the upgrade. Actually handling the state migration for a seamless upgrade, with copious warnings displayed to the user about what's happening, would be ideal. Being able to generate a plan through the nebari CLI would also be nice, so we could inspect the pending changes as part of a PR or something.
Two additional actions to move ahead:
Issue addressed by adding to the release notes. I opened a new issue about critical resource protections - #2829
Describe the bug
Deploying the upgrade of an existing AWS 2023.5.1 Nebari to 2023.7.1 resulted in Terraform destroying and recreating the network resources (VPC/subnets), along with everything attached to them including the EKS cluster. This appears to have been due to changes made in a recent enhancement intended to permit use of existing subnets.
module.network was renamed to module.network[0] without any deliberate attempt to move the state; lacking any automated support from Terraform for handling this case, the module was simply replaced without warning.
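For context on why Terraform plans a replace here, a simplified before/after of the module call is sketched below; the module source path and the variable gating the count are placeholders, not necessarily the exact names used in Nebari's 02-infrastructure stage:

```hcl
# Before (2023.5.1-style): a plain module call, tracked in state as module.network.
# module "network" {
#   source = "./modules/network"
# }

# After (2023.7.1-style, simplified): the module is gated with count so existing
# subnets can be supplied instead. The state address becomes module.network[0],
# and without a state move Terraform 1.0.x treats module.network as deleted and
# module.network[0] as brand new, i.e. destroy + recreate.
module "network" {
  count  = var.existing_subnet_ids == null ? 1 : 0
  source = "./modules/network"
}
```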
Expected behavior
Option 1 (Preferred): module.network state records are migrated to the new address prior to applying Terraform changes. The `moved` option was added in Terraform 1.1, but Nebari is currently pinned to 1.0.5, so an upgrade would be required to utilize this.

Option 2: A check is made prior to deploying changes which would block deployment in destructive scenarios such as this.
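For reference, Option 1 maps directly onto Terraform's `moved` block (available from 1.1); a minimal sketch of what the root module could declare:

```hcl
# Sketch for Option 1 on Terraform >= 1.1: record the rename so apply migrates
# the existing state entries instead of destroying and recreating the module.
moved {
  from = module.network
  to   = module.network[0]
}
```

On the currently pinned 1.0.5, the equivalent is the manual `terraform state mv` shown in the workaround below.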
OS and architecture in which you are running Nebari
Ubuntu Linux, x64
How to Reproduce the problem?
nebari upgrade -c nebari-config.yaml
nebari render -c nebari-config.yaml
nebari deploy -c nebari-config.yaml
Command output
No response
Versions and dependencies used.
Nebari: 2023.5.1 -> 2023.7.1
Kubectl: 1.25
Conda: 23.5.0
Compute environment
AWS
Integrations
No response
Anything else?
Workaround:
1. `nebari render ...` to render the 2023.7.1 update changes
2. `pushd stages/02-infrastructure/aws`
3. `terraform state mv module.network module.network[0]`
4. `popd`
5. `nebari deploy ...` to deploy the 2023.7.1 upgrade