Bug 1873288: server: Target the spec configuration if we have at least one node #2035

Merged 1 commit on Nov 6, 2020
server: Target the spec configuration if we have at least one node
The CI cluster hit an issue where a pull secret was broken, and
then we hit a deadlock: the MCO failed to drain nodes on the old
config because other nodes on the old config couldn't schedule
the pod.

It generally makes sense for new nodes to use the new config, so
do that as long as at least one node has successfully joined the
cluster at that config. This way we still avoid breaking the
cluster (and scaleup) with a bad config.
cgwalters committed Aug 27, 2020
commit 4bd204d3e851eb2fa6cb80460fcb5b82ab0c96dc
11 changes: 10 additions & 1 deletion pkg/server/cluster_server.go
@@ -65,7 +65,16 @@ func (cs *clusterServer) GetConfig(cr poolRequest) (*runtime.RawExtension, error
 		return nil, fmt.Errorf("could not fetch pool. err: %v", err)
 	}
 
-	currConf := mp.Status.Configuration.Name
+	// For new nodes, we roll out the latest if at least one node has successfully updated.
+	// This avoids deadlocks in situations where the old configuration broke somehow
+	// (e.g. pull secret expired)
+	// and also avoids provisioning a new node, only to update it not long thereafter.
+	var currConf string
+	if mp.Status.UpdatedMachineCount > 0 {
+		currConf = mp.Spec.Configuration.Name
+	} else {
+		currConf = mp.Status.Configuration.Name
+	}
 
 	mc, err := cs.machineClient.MachineConfigs().Get(context.TODO(), currConf, metav1.GetOptions{})
 	if err != nil {
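The selection rule in this change can be tried in isolation. The following is a minimal sketch, not the code from the PR: `poolState` and `targetConfigName` are hypothetical names, and the struct is a trimmed-down stand-in for the three pool fields the diff reads (`mp.Spec.Configuration.Name`, `mp.Status.Configuration.Name`, `mp.Status.UpdatedMachineCount`). The config names in `main` are made up for illustration.

```go
package main

import "fmt"

// poolState is a hypothetical stand-in for the pool fields read by GetConfig.
// It is not the real MachineConfigPool type.
type poolState struct {
	SpecConfigName      string // mirrors mp.Spec.Configuration.Name (desired config)
	StatusConfigName    string // mirrors mp.Status.Configuration.Name (last fully rolled-out config)
	UpdatedMachineCount int32  // mirrors mp.Status.UpdatedMachineCount
}

// targetConfigName returns the config a newly joining node should be served.
// If at least one machine already runs the spec config, new nodes get it too;
// otherwise they get the status config, which is known to have worked.
func targetConfigName(p poolState) string {
	if p.UpdatedMachineCount > 0 {
		return p.SpecConfigName
	}
	return p.StatusConfigName
}

func main() {
	// Illustrative names only; real rendered config names differ.
	unproven := poolState{SpecConfigName: "rendered-new", StatusConfigName: "rendered-old"}
	proven := poolState{SpecConfigName: "rendered-new", StatusConfigName: "rendered-old", UpdatedMachineCount: 1}

	fmt.Println("no node updated yet:", targetConfigName(unproven))
	fmt.Println("one node updated:   ", targetConfigName(proven))
}
```

Keeping the fallback to the status config when no machine has updated is what preserves the original safety property: a spec config that no node has booted successfully is never handed to a new node.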