Sync odh-2.12 (#1011)
* Update version to v2.12.0 (#1007)

* upgrade: retry if default DSCI creation fails (#1008)

After leader election was removed, the operator fails to start if it
is instructed to create the default DSCI. It looks like the webhook is
not ready by that time:

```
create default DSCI CR.
{"level":"error","ts":"2024-05-13T09:25:58Z","logger":"setup","msg":"unable to create initial setup for the operator","error":"Internal error occurred: failed calling webhook \"operator.opendatahub.io\": failed to call webhook: Post \"https://opendatahub-operator-controller-manager-service.oo-2ts9m.svc:443/validate-opendatahub-io-v1?timeout=10s\": no endpoints available for service \"opendatahub-operator-controller-manager-service\"","stacktrace":"main.main.func1\n\t/workspace/main.go:200\nsigs.k8s.io/controller-runtime/pkg/manager.RunnableFunc.Start\n\t/remote-source/operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/manager/manager.go:336\nsigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1\n\t/remote-source/operator/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/manager/runnable_group.go:219"}
```
Leader election used to add some delay, which masked the problem.

The problem does not happen in the default configuration, since the
manifests explicitly disable DSCI creation:

```
       containers:
       - command:
         - /manager
         env:
           - name: DISABLE_DSC_CONFIG
             value: 'true'
         args:
         - --operator-name=opendatahub
         image: controller:latest
```

Add a wrapper function, cluster.CreateWithRetry, that retries
client.Object creation until a timeout. The polling interval is
hardcoded to 5s, which seems reasonable, and the timeout is passed
in minutes as a parameter.

This requires disabling the nilerr linter for the polling function:
a creation error is a valid condition there, one the function waits
to disappear.

Fixes: 3610b0b ("feat: remove leader election for operator (#1000)")

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

---------

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
Co-authored-by: Yauheni Kaliuta <ykaliuta@redhat.com>
VaishnaviHire and ykaliuta authored May 14, 2024
1 parent 1abe316 commit c4ffebb
Showing 5 changed files with 23 additions and 10 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -3,7 +3,7 @@
# To re-generate a bundle for another specific version without changing the standard setup, you can:
# - use the VERSION as arg of the bundle target (e.g make bundle VERSION=0.0.2)
# - use environment variables to overwrite this value (e.g export VERSION=0.0.2)
-VERSION ?= 2.10.1
+VERSION ?= 2.12.0
# IMAGE_TAG_BASE defines the opendatahub.io namespace and part of the image name for remote images.
# This variable is used to construct full image tags for bundle and catalog images.
#
@@ -101,14 +101,14 @@ metadata:
capabilities: Full Lifecycle
categories: AI/Machine Learning, Big Data
certified: "False"
-containerImage: quay.io/opendatahub/opendatahub-operator:v2.10.0
+containerImage: quay.io/opendatahub/opendatahub-operator:v2.12.0
createdAt: "2024-4-22T00:00:00Z"
-olm.skipRange: '>=1.0.0 <2.11.0'
+olm.skipRange: '>=1.0.0 <2.12.0'
operators.operatorframework.io/builder: operator-sdk-v1.24.1
operators.operatorframework.io/internal-objects: '[dscinitialization.opendatahub.io]'
operators.operatorframework.io/project_layout: go.kubebuilder.io/v3
repository: https://github.com/opendatahub-io/opendatahub-operator
-name: opendatahub-operator.v2.10.1
+name: opendatahub-operator.v2.12.0
namespace: placeholder
spec:
apiservicedefinitions: {}
@@ -1763,7 +1763,7 @@ spec:
selector:
matchLabels:
component: opendatahub-operator
-version: 2.10.1
+version: 2.12.0
webhookdefinitions:
- admissionReviewVersions:
- v1
@@ -6,12 +6,12 @@ metadata:
capabilities: Full Lifecycle
categories: AI/Machine Learning, Big Data
certified: "False"
-containerImage: quay.io/opendatahub/opendatahub-operator:v2.10.0
+containerImage: quay.io/opendatahub/opendatahub-operator:v2.12.0
createdAt: "2024-4-22T00:00:00Z"
-olm.skipRange: '>=1.0.0 <2.11.0'
+olm.skipRange: '>=1.0.0 <2.12.0'
operators.operatorframework.io/internal-objects: '[dscinitialization.opendatahub.io]'
repository: https://github.com/opendatahub-io/opendatahub-operator
-name: opendatahub-operator.v2.11.0
+name: opendatahub-operator.v2.12.0
namespace: placeholder
spec:
apiservicedefinitions: {}
@@ -105,4 +105,4 @@ spec:
selector:
matchLabels:
component: opendatahub-operator
-version: 2.11.0
+version: 2.12.0
13 changes: 13 additions & 0 deletions pkg/cluster/resources.go
@@ -174,3 +174,16 @@ func WaitForDeploymentAvailable(ctx context.Context, c client.Client, componentN
return true, nil
})
}
+
+func CreateWithRetry(ctx context.Context, cli client.Client, obj client.Object, timeoutMin int) error {
+	interval := time.Second * 5 // arbitrary value
+	timeout := time.Duration(timeoutMin) * time.Minute
+
+	return wait.PollUntilContextTimeout(ctx, interval, timeout, true, func(ctx context.Context) (bool, error) {
+		err := cli.Create(ctx, obj)
+		if err != nil {
+			return false, nil //nolint:nilerr
+		}
+		return true, nil
+	})
+}
2 changes: 1 addition & 1 deletion pkg/upgrade/upgrade.go
@@ -166,7 +166,7 @@ func CreateDefaultDSCI(ctx context.Context, cli client.Client, _ cluster.Platfor
return nil
case len(instances.Items) == 0:
fmt.Println("create default DSCI CR.")
-err := cli.Create(ctx, defaultDsci)
+err := cluster.CreateWithRetry(ctx, cli, defaultDsci, 1) // 1 min timeout
if err != nil {
return err
}