Skip to content

Commit

Permalink
Merge incubation branch commits (#1422)
Browse files Browse the repository at this point in the history
* cluster_config: init cluster variables on startup (#1059)

* cluster_config: move type definitions to the beginning of the file

Just a bit more clarity

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* cluster_config: init cluster variables on startup

Cluster configuration is supposed to be the same during operator pod
lifetime. There is no point to detect it on every reconcile cycle
keeping in mind that it can causes many api requests.

Do the initialization on startup. Keep GetOperatorNamespace()
returning error since it defines some logic in the DSCInitialization
reconciler.

Automatically called init() won't work here due to need to check
error of the initialization.

Wrap logger into context and use it in the Init() like
controller-runtime does [1][2].

Save root context without the logger for mgr.Start since "setup"
logger does not fit normal manager's work.

Leave GetDomain() as is since it uses OpenshiftIngress configuration
which is created when DSCInitialization instantiated.

Log cluster configuration on success from Init, so remove platform
logging from main.

[1] https://github.com/kubernetes-sigs/controller-runtime/blob/38546806f2faf5973e3321a7bd5bb3afdbb5767d/pkg/internal/controller/controller.go#L297
[2] https://github.com/kubernetes-sigs/controller-runtime/blob/38546806f2faf5973e3321a7bd5bb3afdbb5767d/pkg/internal/controller/controller.go#L111

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* cluster: do not return error from GetRelease

GetRelease return values are defined at startup, the error checked
in main, so no point to return error anymore.

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

---------

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* components: move params.env image updating to Init stage (#1191)

* components, main: add Component Init method

Add Init() method to the Component interface and call it from main on
startup.

Will be used to move startup-time code from ReconcileComponent
(like adjusting params.env).

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* components: move params.env image updating to Init stage

Jira: https://issues.redhat.com/browse/RHOAIENG-11592

Image names in environment are not supposed to be changed during
runtime of the operator, so it makes sense to update them only on
startup.

If manifests are overriden by DevFlags, the DevFlags' version will
be used.

The change is straight forward for most of the components where only
images are updated and params.env is located in the kustomize root
directory, but some components (dashboard, ray, codeflare,
modelregistry) also update some extra parameters. For them image
part only is moved to Init since other updates require runtime DSCI
information.

The patch also changes logic for ray, codeflare, and modelregistry
in this regard to update non-image parameters regardless of DevFlags
like it was changed in dashboard recently.

The DevFlags functionality brings some concerns:

- For most components the code is written such a way that as soon as
DevFlags supplied the global path variables are changed and never
reverted back to the defaults. For some (dashboard, trustyai) there
is (still global) OverridePath/entryPath pair and manifests reverted
to the default, BUT there is no logic of transition.

- codeflare: when manifests are overridden namespace param is
updated in the hardcoded (stock) path;

This logic is preserved.

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

---------

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* update: remove webhook service from bundle (#1275)

- we do not need it in bundle, CSV auto generate one during installation
- if we install operator via OLM, webhook service still get used from config/webhook/service.yaml

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update: add validation on application and monitoring namespace in DSCI (#1263)

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* logger: add zap command line switches (#1280)

Allow to tune preconfigured by --log-mode zap backend with standard
zap command line switches from controller-runtime (zap-devel,
zap-encoder, zap-log-level, zap-stacktrace-level,
zap-time-encoding)[1].

This brings flexibility in logger setup for development environments
first of all.

The patch does not change default logger setup and does not change
DSCI override functionality.

[1] https://sdk.operatorframework.io/docs/building-operators/golang/references/logging

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* Modify Unit Tests GitHub Actions workflow to get code coverage test reports (#1279)

* Create codecov.yml

* Added to run test coverage also on PRs

* Removing trailing ]

* Update .github/workflows/codecov.yml

Co-authored-by: Wen Zhou <wenzhou@redhat.com>

* Removed go mod install dependency

* Consolidated codecov workflow into unit tests workflow

---------

Co-authored-by: Wen Zhou <wenzhou@redhat.com>

* webhook: move initialization inside of the module (#1284)

Add webhook.Init() function and hide webhook setup inside of the
module. It will make it possible to replace Init with a NOP (no
operation) with conditional compilation for no-webhook build.

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* feat: pass platform from env variable or fall back to use old logic (#1252)

* feat: pass platform from env variables or fall back to use old logic

- introduce new env var ODH_PLATFORM_TYPE, value set during build time
  - if value not match, fall back to detect managed then self-managed
- introduce new env var OPERATOR_RELEASE_VERSION, value also set during build time
  - if value is empty, fall back to use old way from CSV to read version or use 0.0.0
- add support from makefile
  - use envstubst to replace version

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update: remove release version in the change

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

---------

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* fix: update release version in DSCI and DSC .status for upgrade case (#1287)

- DSCI: if current version is not matching, update it
- DSC: in both reconcile pass and fail case, update it

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* Update version to 2.19.0 (#1291)

Co-authored-by: VaishnaviHire <17230536+VaishnaviHire@users.noreply.github.com>

* Makefile, webhook: add run-nowebhook (#1286)

Make it possible to compile operator without webhook enabled (with
-tags nowebhook). Create a stub webhook.Init() function for that.

Add run-nowebhook target to run webhook locally. It requires
`make install` to be executed on the cluster beforehand.

Since it repeats `make run`, move the command to a variable.
Also use variable RUN_ARGS for operator arguments. It makes it
possible to override them from the command line.

In VSCode it is possible to debug it with the following example
launch.json configuration:

```
        {
            "name": "Debug Operator No Webhook",
            "type": "go",
            "request": "launch",
            "mode": "debug",
            "program": "main.go",
            "buildFlags": [
                "-tags", "nowebhook"
            ],
            "env": {
                "OPERATOR_NAMESPACE": "opendatahub-operator-system",
                "DEFAULT_MANIFESTS_PATH": "./opt/manifests"
            },
            "args": [
                "--log-mode=devel"
            ],
            "cwd": "${workspaceFolder}",
        }
```

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* update: use namespace dyniamically from operator env than hardcode value (#1298)

- thses are only needed when it is downstream speicific cases

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* webhook: switch to contextual logger (#1295)

Use k8s contextual logger instead of own. Comparing to other parts
of the operator webhook is a bit special since it serves own
endpoints and has context independent from controllers.

Include more info (operation), just to get more details. Do not add
kind since it's clear from "resource" field.

Since controller-framework adds "admission" to the name, use own
LogConstructor with own base logger for the framework's
DefaultLogConstructor.

Add name locally to distinguish from framework's messages.

Add Name field to the structures to avoid copying string literal for
both WithName() and WithValues().

The output changes and it looks like

```
{"level":"info","ts":"2024-10-11T05:17:20Z","logger":"ValidatingWebhook","msg":"Validation request","object":{"name":"default-dsci"},"namespace":"","name":"default-dsci","resource":{"group":"dscinitialization.opendatahub.io","version":"v1","resource":"dscinitializations"},"user":"kube:admin","requestID":"e5bf3768-6faa-4e14-9004-e54ee84ad8b7","webhook":"ValidatingWebhook","operation":"CREATE"}
```

or for the defaulter:

```
{"level":"info","ts":"2024-10-11T04:50:48Z","logger":"DefaultingWebhook","msg":"Defaulting DSC","object":{"name":"default-dsc"},"namespace":"","name":"default-dsc","resource":{"group":"datasciencecluster.opendatahub.io","version":"v1","resource":"datascienceclusters"},"user":"kube:admin","requestID":"c9213ff3-80ee-40c0-9f15-12188dece68e","webhook":"DefaultingWebhook"}
```

(the messages are not from the current codebase, was added for demo only)

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* chore: use client.OjectKeyFromObject in client.Get() (#1301)

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* (fix): Change on trigger event to `pull_request_target` in the "Check config and readme updates / Ensure generated files are included (pull_request)" action to fix "Resource not accessible by integration" error while running the action (#1296)

Signed-off-by: AJAY JAGANATHAN <ajaganat@redhat.com>

* feat: Operator disable create usergroup if detect user enabled external auth (#1297)

* feat: Operator disable create usergroup if detect users enabled external auth

- use internal Authentication CR Type "" or IntegratedOAuth to indicate if Operator should create usergroup  or not
  CRD has validation to only allow "IntegratedOAuth", "", "None" or "OIDC"
- only grant "get, watch , list" as least permission
- remove duplicated rbac for "ingress", has been defined in other places
- add object into cache
- add CRD into unit-test
- add unit-test: since we dont have auth CR, it should not create usergroup

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update: review comments

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update: review comments use const

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

---------

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* components, logger: use contextual logging approach (#1253)

Switch ReconcileComponent from passing logger explicitly to wrapping
it into context[1][2]

Makes one parameter less to pass and will allow called utilities to
report component context where they are called from.

No user or logging format impact until utilities takes contextual
logging in use.

[1] https://www.kubernetes.dev/blog/2022/05/25/contextual-logging/
[2] https://github.com/kubernetes/enhancements/blob/master/keps/sig-instrumentation/3077-contextual-logging/README.md

Credits-To: Bartosz Majsak bartosz.majsak@gmail.com

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* logger, controllers: use common logging level, INFO by default (#1289)

Remove increasing logging level for controllers (it was also passed
to components if not overridden from DSCI) since:

- it made logging inconsistent. The base contoller runtime logger is
set up with INFO level for all log modes, so when controllers are
configured for Error level, user sees INFO messages from both
controller-runtime and other parts of operator which use
controller-runtime's logger directly;

- since the base logger is configured for INFO, there is no
difference in levels between "default" and "devel". Having levels 1
and 2 there is misleading.

Update documentation.

This patch changes default logging, former filtered Info messages
are displayed now.

There is no _big_ difference in practice since currently the log is
anyway full of info messages from parts which do not use
reconciler's logger, like:

{"level":"info","ts":"2024-10-09T13:23:11Z","msg":"waiting for 1 deployment to be ready for dashboard"}

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* removed lint and typo fixes (#1302)

* fix: to make the unset env variable in CSV work with fallback (#1306)

- previous code, will run into opendathaub case if env is not set

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* controllers: switch to k8s contextual logger (#1308)

Jira: https://issues.redhat.com/browse/RHOAIENG-14096

This is a squashed commit of the following patches:

d012b67 ("controllers: switch to k8s contextual logger")
9880530 ("logger: blindly convert ctrl.Log users to contextual")

- controllers: switch to k8s contextual logger

Remove own logger from controllers' reconcilers and switch to k8s
contextual logger instead [1].

Use contextual logger for SecretGeneratorReconciler and
CertConfigmapGeneratorReconciler setups as well.

Add name to the logger coming from the framework. It will contains
"controller" field already, and like in webhook with the name it's
easy to distinguish framework and operator messages.

- logger: blindly convert ctrl.Log users to contextual

All the users should have proper context now. The log level changes
will affect it as well.

[1] https://www.kubernetes.dev/blog/2022/05/25/contextual-logging/

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* Tidy up run-nowebhook recipe & make clean PHONY (#1310)

* chore: Tidy run-nowebhook recipe

The suggestion to tidy up the run-nowebhook recipe comes from this
conversation:
https://github.com/opendatahub-io/opendatahub-operator/pull/1304/files#r1806731373

* chore: Make `clean` a PHONY target

I believe `clean` should be a PHONY target, since it doesn't create a
file called `clean`

* refactor the secret generator controller (#1311)

The commit is meant to:
- make the code easier to understand, reducing complexity caused by
  nested if/else and error conditions (align happy path to the left)
- remove shadowed error vars to avoid reporting misleading errors
- add some basic unit test for the reconcile loop

* update: rename variables rhods to rhoai (#1313)

Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Co-authored-by: ajaypratap003 <ajay.pratap233@gmail.com>

* logger: add logLevel to DSCI devFlags (#1307)

Jira: https://issues.redhat.com/browse/RHOAIENG-14713

This is a squashed commit of the patchset (original ids):

959f84d ("logger: import controller-runtime zap as ctrlzap")
b8d9cde ("logger: add logLevel to DSCI devFlags")
2d3efe2 ("components, logger: always use controller's logger")
1ef9266 ("components, logger: use one function for ctx creation")

- logger: import controller-runtime zap as ctrlzap

To avoid patch polluting with the next changes where uber's zap is
imported as `zap`.

- logger: add logLevel to DSCI devFlags

Allow to override global zap log level from DSCI devFlags. Accepts
the same values as to `--zap-log-level` command line switch.

Use zap.AtomicLevel type to store loglevel. Locally use atomic to
store its value. We must store the structure itsel since command
line flags parser (ctrlzap.(*levelFlag).Set()) stores the structure
there. In its order ctrlzap.addDefault() stores pointers,
newOptions() always sets the level to avoid setting it by
addDefaults(). Otherwise SetLevel should handle both pointer and the
structure as logLevel atomic.Value.

It is ok to store structure of zap.AtomicLevel since it itself
contains a pointer to the atomic.Int32 and common level value is
changed then.

The parsing code is modified version of the one from
controller-runtime/pkg/log/zap/flag.go.

Deprecate DSCI.DevFlags.logmode.

- components, logger: always use controller's logger

Since the log level is overridable with its own field of devFlags,
do not use logmode anymore. It was used to create own logger with
own zap backend in case if devFlags exist.

Just add name and value to the existing logger instead.

Rename NewLoggerWithOptions back to NewLogger since former NewLogger
is removed.

Change component logger name. "DSC.Component" is redundant. It was
usuful when component's logger was created from scratch, but now
when it is always based on the reconciler's one, it's clear that it
is a part of DSC.

- components, logger: use one function for ctx creation

Move both logger and component creation to one function.

Just a cleanup.

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* secret-generator controller: avoid reporting 'Secret not found' error in reconcile (#1312)

* update: remove dsp with v1(tekton)backend related code (#1281)

* update: remove dsp with v1(tekton)backend related code

- images
- tekton rbac
- descriptions

Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Co-authored-by: Humair Khan <HumairAK@users.noreply.github.com>

---------

Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Co-authored-by: Humair Khan <HumairAK@users.noreply.github.com>

* update: remove two SA which does not seem valid (#1254)

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* Update version to 2.20.0 (#1325)

Co-authored-by: VaishnaviHire <17230536+VaishnaviHire@users.noreply.github.com>

* fix: wrong type (#1322)

-  we should watch IngressController from operator.openshift.io, not the k8s ingress
-  this matched the cache we added in main

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update golangci-lint to v1.61.0 (#1327)

* chore(linter): update golangci-lint to v1.61.0

* chore(linter): fix findings

* chore: remove duplicated image in modelmesh (#1336)

- this only apply for deplenet odh-model-controller
- no need to set it in the default map, a waste

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* fix: ensure input CAbundle always end with a newline (#1339)

* fix: ensure input CAbundle always end with a newline

- missing newline will cause problem when inject data into configmap
- this happens when user use kustomize with their own CA for DSCI, it set to use |- not |
- fix lint

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* update: better way to add newline

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

---------

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* fix: add check to remove old kube-rbac-proxy container in modelregistry operator, fixes RHOAIENG-15328 (#1341)

* fix: add check to remove old kube-rbac-proxy container in modelregistry operator, fixes RHOAIENG-15328

Signed-off-by: Dhiraj Bokde <dhirajsb@users.noreply.github.com>

* update: remove check in ModelReg code, add it to upgrade logic

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

---------

Signed-off-by: Dhiraj Bokde <dhirajsb@users.noreply.github.com>
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Co-authored-by: Wen Zhou <wenzhou@redhat.com>

* fix: wrong name of variable and func lead to always return self-managed (#1362)

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* fix: add extra check on servicemesh CRD and svc before proceeding create SMCP CR (#1359)

* feat: add extra check on servicemesh CRD and service before proceeding create SMCP CR

- when we install operator we might have a race condition:
    - ossm is not ready to server
    - dsci already created smcp CR
    - since we do not know the version of smcp to use
    - ossm webhook can run into conversion problem since no version specify in smcp CR
   -  test: add more negative test
       - when CRD does not exist
       - when service does not exist

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

---------

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* logger: set loglevel from ZAP_LOG_LEVEL environment variable (#1344)

Add possibility to set loglevel with the environment. The name is
used used already.

Command line switch has priority over environment.

[1] https://github.com/opendatahub-io/data-science-pipelines-operator/blob/main/config/base/params.env#L13

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* fix: added a filter to evicted pods when checking for ready status (#1334)

* fix: added a filter to evicted pods

Signed-off-by: Alessio Pragliola <seth.pro@gmail.com>

---------

Signed-off-by: Alessio Pragliola <seth.pro@gmail.com>

* Task/NVPE-32: Cleanup NIM integration tech preview resources (#1369)

* chore: cleanup nim integration tech preview resources

Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>

* chore: apply suggestions from code review

Co-authored-by: Wen Zhou <wenzhou@redhat.com>

* chore: addressed pr reviews, better logging

Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>

* chore: set nim cleanup for minors 14 and 15

Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>

* chore: nim tp cleanup should remove api key secret

Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>

* chore: addresed pr reviews, rewrite cleanup func

Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>

---------

Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>
Co-authored-by: Wen Zhou <wenzhou@redhat.com>

* update: owners (#1376)

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

* Updated codecov version to 4.6.0 (#1378)

* Update version to 2.21.0 (#1379)

Co-authored-by: VaishnaviHire <17230536+VaishnaviHire@users.noreply.github.com>

* Dockerfile, Makefile: make CGO_ENABLED configurable (#1382)

The commit
a107703 ("feat(fips): enable GO_ENABLED in build (#1001)")

enabled CGO which makes problems for builders on non-x86 platforms.

Make it as an in the Dockerfile keeping default the same (enabled),
but make it possible to override with either environment
(`export CGO_ENABLED=0`) or make (`make CGO_ENABLED=0 image-build`)

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* Makefile: make USE_LOCAL overridable (#1384)

Get USE_LOCAL image build flag from makefile variable to make it
overridable with `make USE_LOCAL=true image`

Do not allow to get its value for environment be default due to
pretty generic name.

This is shorter than the old recommendation of overriding
IMAGE_BUILD_FLAGS. And since now CGO_ENABLED is also a flag, does
not mess up with it.

* USE_LOCAL=true uses existing manifests from opt/manifests for the
produced image without downloading them with get_all_manifests.sh
making it possible to both save time and make local amendments.

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* Dockerfile: merges manifests builder stages to one (#1381)

We can combine two build stages into one, as there is no need to
always build both images (not done by podman) to only then decide
from which one we want to copy manifests to the target
image. Instead manifests stage will either copy local manifests or
fetches using the script based on USE_LOCAL argument.

Move USE_LOCAL and OVERWIRTE_MANIFESTS args under FROM since args
have scope of the FROM they are declared in.

It requires opt/manifests directory to exist, but since it's a part
of git repo, it's fine.

Original patch from: Bartosz Majsak <bartosz.majsak@gmail.com> [1]

[1] #773

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* Use incubating for odh-model-controller manifests (#1393)

* Use incubating for odh-model-controller manifests

* Use kserve v0.14 branch for manifests

* add enablement flag for nim (#1330)

* add enablement flag for nim

* add generated files

* add api generated file

* revert image for csv

* update NIM with its own spec

* create nim struct remove from serverless

* updated generated files

* update nim to default removed

* update image to correct var

* update to managed as default

* add nim-state to params.env

* fix NVIDIA typo

* resolve conflict

* add NIM flag check to model mesh

* fix condtional for standard practice nim flag

* fix to conditional for standards

* update typo for error output nim flag kserve

* Fix deploy to dependent path

* update doc/manifests references for nim flag

* fix image typo csv

* update if on nim-state

* Remove adelton from the platform owners list. (#1402)

* Typo fixes "manifests" (#1409)

* doc: remove -e from make invocation examples (#1408)

The intention in the examples is to override make variables which is
done with setting the variable in the command line, while -e switch
does not get argument and changes make to a mode when environment
takes precedence over makefile settings[1][2]

Short examples:

```
 % cat Makefile
VAR = var
DEFVAR ?= defvar
all:
        @echo "VAR $(VAR), origin $(origin VAR)"
        @echo "DEFVAR $(DEFVAR), origin $(origin DEFVAR)"

% make
VAR var, origin file
DEFVAR defvar, origin file

% make VAR=cmd DEFVAR=cmd
VAR cmd, origin command line
DEFVAR cmd, origin command line

% VAR=env DEFVAR=env make
VAR var, origin file
DEFVAR env, origin environment

% VAR=env DEFVAR=env make -e
VAR env, origin environment override
DEFVAR env, origin environment

% VAR=env DEFVAR=env make -e VAR=cmd DEFVAR=cmd
VAR cmd, origin command line
DEFVAR cmd, origin command line
```

[1] https://www.gnu.org/software/make/manual/html_node/Values.html
[2] https://www.gnu.org/software/make/manual/html_node/Environment.html

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>

* (fix): Avoid using pull_request_target event trigger. (#1417)

Using pull_request_target event trigger results in security vulnerabilities explained here: https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/

Signed-off-by: AJAY JAGANATHAN <ajaganat@redhat.com>

* fix:

- missing authentications rbca
- merge problem
- duplicated API definition
- lint

Signed-off-by: Wen Zhou <wenzhou@redhat.com>

---------

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Signed-off-by: AJAY JAGANATHAN <ajaganat@redhat.com>
Signed-off-by: Dhiraj Bokde <dhirajsb@users.noreply.github.com>
Signed-off-by: Alessio Pragliola <seth.pro@gmail.com>
Signed-off-by: Tomer Figenblat <tfigenbl@redhat.com>
Co-authored-by: Yauheni Kaliuta <ykaliuta@redhat.com>
Co-authored-by: Wen Zhou <wenzhou@redhat.com>
Co-authored-by: Adrián Sanz Gómiz <100594859+asanzgom@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: VaishnaviHire <17230536+VaishnaviHire@users.noreply.github.com>
Co-authored-by: Ajay Jaganathan <36824134+AjayJagan@users.noreply.github.com>
Co-authored-by: Gerard Ryan <git@grdryn.xyz>
Co-authored-by: Luca Burgazzoli <lburgazzoli@users.noreply.github.com>
Co-authored-by: ajaypratap003 <ajay.pratap233@gmail.com>
Co-authored-by: Marek Laššák <73712209+mlassak@users.noreply.github.com>
Co-authored-by: Humair Khan <HumairAK@users.noreply.github.com>
Co-authored-by: Dhiraj Bokde <dhirajsb@users.noreply.github.com>
Co-authored-by: Alessio Pragliola <83355398+Al-Pragliola@users.noreply.github.com>
Co-authored-by: Tomer Figenblat <tomer.figenblat@gmail.com>
Co-authored-by: Hannah DeFazio <h2defazio@gmail.com>
Co-authored-by: Marcus Trujillo <42344046+trujillm@users.noreply.github.com>
Co-authored-by: Jan Pazdziora <jan.pazdziora@code.adelton.com>
Co-authored-by: Christian Zaccaria <73656840+ChristianZaccaria@users.noreply.github.com>
  • Loading branch information
19 people authored Dec 6, 2024
1 parent 6f59682 commit b27759b
Show file tree
Hide file tree
Showing 57 changed files with 1,034 additions and 483 deletions.
44 changes: 18 additions & 26 deletions .github/workflows/check-file-updates.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3,42 +3,34 @@ on:
pull_request:
jobs:
file-updates:
permissions:
pull-requests: write
name: Ensure generated files are included
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Generate files
id: generate-files
- uses: actions/checkout@v4.2.2
with:
ref: ${{github.event.pull_request.head.ref}}
repository: ${{github.event.pull_request.head.repo.full_name}}
- name: Save the PR number for artifact upload
run: |
CMD="make generate manifests api-docs"
$CMD
echo "CMD=$CMD" >> $GITHUB_OUTPUT
echo ${{ github.event.number }} > pr_number.txt
- name: Upload the PR number as artifact
id: artifact-upload
uses: actions/upload-artifact@v4
with:
name: pr_number
path: ./pr_number.txt
retention-days: 1 # This will delete the generated artifacts every day.
- name: Generate files
run: make generate manifests api-docs
- name: Ensure generated files are up-to-date
id: check_generated_files
run : |
rm ./pr_number.txt # remove the pr_number.txt before checking "git status", to have correct assessment of the changed files.
if [[ -n $(git status -s) ]]
then
echo "Generated files have been missed in the PR"
git diff
echo "missing_generated_files=true" >> $GITHUB_OUTPUT
exit 1
else
echo "No new files to commit"
echo "missing_generated_files=false" >> $GITHUB_OUTPUT
fi
- name: Report issue in PR
if: ${{ steps.check_generated_files.outputs.missing_generated_files == 'true' }}
uses: thollander/actions-comment-pull-request@v2
with:
message: |
## This PR can't be merged just yet 😢
Please run `${{ steps.generate-files.outputs.CMD }}` and commit the changes.
For more info: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
- name: Print git status and fail pr
if: ${{ steps.check_generated_files.outputs.missing_generated_files == 'true' }}
run: |
git status
exit 1
fi
59 changes: 59 additions & 0 deletions .github/workflows/comment-on-pr.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
name: Comment on pr
on:
workflow_run:
workflows: ["Check config and readme updates"]
types:
- completed
jobs:
download-artifact-data:
if: ${{ github.event.workflow_run.conclusion == 'failure' }}
runs-on: ubuntu-latest
outputs:
pr_number: ${{ steps.artifact-data.outputs.pr_number }}
steps:
- name: Download artifact
id: artifact-download
uses: actions/github-script@v7
with:
script: |
let allArtifacts = await github.rest.actions.listWorkflowRunArtifacts({
owner: context.repo.owner,
repo: context.repo.repo,
run_id: context.payload.workflow_run.id,
});
let matchArtifact = allArtifacts.data.artifacts.filter((artifact) => {
return artifact.name == "pr_number"
})[0];
let download = await github.rest.actions.downloadArtifact({
owner: context.repo.owner,
repo: context.repo.repo,
artifact_id: matchArtifact.id,
archive_format: 'zip',
});
let fs = require('fs');
fs.writeFileSync(`${process.env.GITHUB_WORKSPACE}/pr_number.zip`, Buffer.from(download.data));
- name: Unzip artifact
run: unzip pr_number.zip
- name: Extract data
id: artifact-data
run: |
echo "pr_number=$(head -n 1 pr_number.txt)" >> $GITHUB_OUTPUT
comment-on-pr:
needs:
- download-artifact-data
runs-on: ubuntu-latest
permissions:
pull-requests: write
steps:
- name: Report issue in PR
uses: thollander/actions-comment-pull-request@v3.0.1
with:
message: |
## This PR can't be merged just yet 😢
Please run `make generate manifests api-docs` and commit the changes.
For more info: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.event.workflow_run.id }}
pr-number: ${{ needs.download-artifact-data.outputs.pr_number }}
2 changes: 1 addition & 1 deletion .github/workflows/linter.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -22,5 +22,5 @@ jobs:
- name: golangci-lint
uses: golangci/golangci-lint-action@v6
with:
version: v1.60.2
version: v1.61.0
args: --timeout 5m0s
2 changes: 1 addition & 1 deletion .github/workflows/unit-tests.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,6 @@ jobs:
run: make unit-test

- name: Upload results to Codecov
uses: codecov/codecov-action@v4
uses: codecov/codecov-action@v4.6.0
with:
token: ${{ secrets.CODECOV_TOKEN }}
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ Issues are tracked using [Jira](https://issues.redhat.com/secure/RapidBoard.jspa
1. **Fork the Repository:** Create your own fork of the repository to work on your changes.
2. **Create a Branch:** Create your own branch to include changes for the feature or a bug fix off of `incubation` branch.
3. **Work on Your Changes:** Commit often, and ensure your code adheres to these [Code Style Guidelines](#code-style-guidelines) and passes all the [quality gates](#quality-gates) for the operator.
4. **Testing:** Make sure your code passes all the tests, including any new tests you've added.
4. **Testing:** Make sure your code passes all the tests, including any new tests you've added. And that your changes do not decrease the test coverage as shown on report. Every new feature should come with unit tests that cover that new part of the code.

### Open a Pull Request:

Expand Down
29 changes: 12 additions & 17 deletions Dockerfiles/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,27 +1,22 @@
# Build the manager binary
ARG GOLANG_VERSION=1.21
ARG USE_LOCAL=false
ARG OVERWRITE_MANIFESTS=""

################################################################################
FROM registry.access.redhat.com/ubi8/go-toolset:$GOLANG_VERSION as builder_local_false
ARG OVERWRITE_MANIFESTS
# Get all manifests from remote git repo to builder_local_false by script
FROM registry.access.redhat.com/ubi8/toolbox as manifests
ARG USE_LOCAL=false
ARG OVERWRITE_MANIFESTS=""
USER root
WORKDIR /
COPY get_all_manifests.sh get_all_manifests.sh
RUN ./get_all_manifests.sh ${OVERWRITE_MANIFESTS}

################################################################################
FROM registry.access.redhat.com/ubi8/go-toolset:$GOLANG_VERSION as builder_local_true
# Get all manifests from local to builder_local_true
USER root
WORKDIR /opt
# copy local manifests to build
COPY opt/manifests/ /opt/manifests/
COPY get_all_manifests.sh get_all_manifests.sh
RUN if [ "${USE_LOCAL}" != "true" ]; then \
rm -rf /opt/manifests/*; \
./get_all_manifests.sh ${OVERWRITE_MANIFESTS}; \
fi

################################################################################
FROM builder_local_${USE_LOCAL} as builder
FROM registry.access.redhat.com/ubi8/go-toolset:$GOLANG_VERSION as builder
ARG CGO_ENABLED=1
USER root
WORKDIR /workspace
# Copy the Go Modules manifests
Expand All @@ -39,13 +34,13 @@ COPY main.go main.go
COPY pkg/ pkg/

# Build
RUN CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build -a -o manager main.go
RUN CGO_ENABLED=${CGO_ENABLED} GOOS=linux GOARCH=amd64 go build -a -o manager main.go

################################################################################
FROM registry.access.redhat.com/ubi8/ubi-minimal:latest
WORKDIR /
COPY --from=builder /workspace/manager .
COPY --chown=1001:0 --from=builder /opt/manifests /opt/manifests
COPY --chown=1001:0 --from=manifests /opt/manifests /opt/manifests
# Recursive change all files
RUN chown -R 1001:0 /opt/manifests &&\
chmod -R g=u /opt/manifests
Expand Down
13 changes: 9 additions & 4 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
# To re-generate a bundle for another specific version without changing the standard setup, you can:
# - use the VERSION as arg of the bundle target (e.g make bundle VERSION=0.0.2)
# - use environment variables to overwrite this value (e.g export VERSION=0.0.2)
VERSION ?= 2.19.0
VERSION ?= 2.21.0
# IMAGE_TAG_BASE defines the opendatahub.io namespace and part of the image name for remote images.
# This variable is used to construct full image tags for bundle and catalog images.
#
Expand All @@ -22,7 +22,8 @@ BUNDLE_IMG ?= $(IMAGE_TAG_BASE)-bundle:v$(VERSION)
IMAGE_BUILDER ?= podman
OPERATOR_NAMESPACE ?= opendatahub-operator-system
DEFAULT_MANIFESTS_PATH ?= opt/manifests

CGO_ENABLED ?= 1
USE_LOCAL = false

CHANNELS="fast"
# CHANNELS define the bundle channels used in the bundle.
Expand Down Expand Up @@ -68,7 +69,7 @@ YQ ?= $(LOCALBIN)/yq
KUSTOMIZE_VERSION ?= v5.0.2
CONTROLLER_GEN_VERSION ?= v0.16.1
OPERATOR_SDK_VERSION ?= v1.31.0
GOLANGCI_LINT_VERSION ?= v1.60.2
GOLANGCI_LINT_VERSION ?= v1.61.0
YQ_VERSION ?= v4.12.2
# ENVTEST_K8S_VERSION refers to the version of kubebuilder assets to be downloaded by envtest binary.
ENVTEST_K8S_VERSION = 1.31.0
Expand All @@ -94,7 +95,8 @@ E2E_TEST_FLAGS = -timeout 25m
# Default image-build is to not use local odh-manifests folder
# set to "true" to use local instead
# see target "image-build"
IMAGE_BUILD_FLAGS ?= --build-arg USE_LOCAL=false
IMAGE_BUILD_FLAGS ?= --build-arg USE_LOCAL=$(USE_LOCAL)
IMAGE_BUILD_FLAGS += --build-arg CGO_ENABLED=$(CGO_ENABLED)

# Read any custom variables overrides from a local.mk file. This will only be read if it exists in the
# same directory as this Makefile. Variables can be specified in the standard format supported by
Expand Down Expand Up @@ -189,9 +191,11 @@ run: manifests generate fmt vet ## Run a controller from your host.

.PHONY: run-nowebhook
run-nowebhook: GO_RUN_ARGS += -tags nowebhook

run-nowebhook: manifests generate fmt vet ## Run a controller from your host without webhook enabled
$(GO_RUN_MAIN)


.PHONY: image-build
image-build: # unit-test ## Build image with the manager.
$(IMAGE_BUILDER) build --no-cache -f Dockerfiles/Dockerfile ${IMAGE_BUILD_FLAGS} -t $(IMG) .
Expand Down Expand Up @@ -381,6 +385,7 @@ CLEANFILES += cover.out
e2e-test: ## Run e2e tests for the controller
go test ./tests/e2e/ -run ^TestOdhOperator -v --operator-namespace=${OPERATOR_NAMESPACE} ${E2E_TEST_FLAGS}

.PHONY: clean
clean: $(GOLANGCI_LINT)
$(GOLANGCI_LINT) cache clean
chmod u+w -R $(LOCALBIN) # envtest makes its dir RO
Expand Down
4 changes: 2 additions & 2 deletions OWNERS_ALIASES
Original file line number Diff line number Diff line change
@@ -1,8 +1,6 @@
aliases:
platform:
- adelton
- AjayJagan
- ajaypratap003
- asanzgom
- biswassri
- CFSNM
Expand All @@ -15,8 +13,10 @@ aliases:
- lphiri
- MarianMacik
- mattmahoneyrh
- robotmaxtron
- Sara4994
- StevenTobin
- ugiordan
- VaishnaviHire
- ykaliuta
- zdtsw
60 changes: 25 additions & 35 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ and configure these applications.
- [Deployment](#deployment)
- [Test with customized manifests](#test-with-customized-manifests)
- [Update API docs](#update-api-docs)
- [Enabled logging](#enabled-logging)
- [Example DSCInitialization](#example-dscinitialization)
- [Example DataScienceCluster](#example-datasciencecluster)
- [Run functional Tests](#run-functional-tests)
Expand Down Expand Up @@ -134,21 +135,21 @@ make image-build
By default, building an image without any local changes(as a clean build)
This is what the production build system is doing.

In order to build an image with local `opt/manifests` folder, to set `IMAGE_BUILD_FLAGS ="--build-arg USE_LOCAL=true"` in make.
e.g `make image-build -e IMAGE_BUILD_FLAGS="--build-arg USE_LOCAL=true"`
In order to build an image with local `opt/manifests` folder set `USE_LOCAL` make variable to `true`
e.g `make image-build USE_LOCAL=true"`

#### Build Image

- Custom operator image can be built using your local repository

```commandline
make image -e IMG=quay.io/<username>/opendatahub-operator:<custom-tag>
make image IMG=quay.io/<username>/opendatahub-operator:<custom-tag>
```

or (for example to user <username>)

```commandline
make image -e IMAGE_OWNER=<username>
make image IMAGE_OWNER=<username>
```

The default image used is `quay.io/opendatahub/opendatahub-operator:dev-0.0.1` when not supply argument for `make image`
Expand All @@ -173,7 +174,7 @@ e.g `make image-build -e IMAGE_BUILD_FLAGS="--build-arg USE_LOCAL=true"`
- Deploy the created image in your cluster using following command:

```commandline
make deploy -e IMG=quay.io/<username>/opendatahub-operator:<custom-tag> -e OPERATOR_NAMESPACE=<namespace-to-install-operator>
make deploy IMG=quay.io/<username>/opendatahub-operator:<custom-tag> OPERATOR_NAMESPACE=<namespace-to-install-operator>
```

- To remove resources created during installation use:
Expand Down Expand Up @@ -222,26 +223,18 @@ This will ensure that the doc for the apis are updated accordingly.

### Enabled logging

#### Controller level
Global logger configuration can be changed with an environemnt variable `ZAP_LOG_LEVEL`
or a command line switch `--log-mode <mode>` for example from CSV.
Command line switch has higher priority.
Valid values for `<mode>`: "" (as default) || prod || production || devel || development.

Logger on all controllers can only be changed from CSV with parameters: --log-mode devel
valid value: "" (as default) || prod || production || devel || development
Verbosity level is INFO.
To fine tune zap backend [standard operator sdk zap switches](https://sdk.operatorframework.io/docs/building-operators/golang/references/logging/)
can be used.

This mainly impacts logging for operator pod startup, generating common resource, monitoring deployment.

| --log-mode value | mapping Log level | Comments |
| ---------------- | ------------------- | -------------- |
| devel | debug / 0 | lowest level |
| "" | info / 1 | default option |
| default | info / 1 | default option |
| prod | error / 2 | highest level |

#### Component level

Logger on components can be changed by DSCI devFlags during runtime.
By default, if not set .spec.devFlags.logmode, it uses INFO level
Modification applies to all components, not only these "Managed" ones.
Update DSCI CR with .spec.devFlags.logmode, see example :
Log level can be changed by DSCI devFlags during runtime by setting
.spec.devFlags.logLevel. It accepts the same values as `--zap-log-level`
command line switch. See example :

```console
apiVersion: dscinitialization.opendatahub.io/v1
Expand All @@ -250,20 +243,17 @@ metadata:
name: default-dsci
spec:
devFlags:
logmode: development
logLevel: debug
...
```

Avaiable value for logmode is "devel", "development", "prod", "production".
The first two work the same set to DEBUG level; the later two work the same as using ERROR level.

| .spec.devFlags.logmode | stacktrace level | verbosity | Output | Comments |
| ---------------------- | ---------------- | --------- | -------- | -------------- |
| devel | WARN | INFO | Console | lowest level, using epoch time |
| development | WARN | INFO | Console | same as devel |
| "" | ERROR | INFO | JSON | default option |
| prod | ERROR | INFO | JSON | highest level, using human readable timestamp |
| production | ERROR | INFO | JSON | same as prod |
| logmode | stacktrace level | verbosity | Output | Comments |
|-------------|------------------|-----------|---------|-----------------------------------------------|
| devel | WARN | INFO | Console | lowest level, using epoch time |
| development | WARN | INFO | Console | same as devel |
| "" | ERROR | INFO | JSON | default option |
| prod | ERROR | INFO | JSON | highest level, using human readable timestamp |
| production | ERROR | INFO | JSON | same as prod |

### Example DSCInitialization

Expand Down Expand Up @@ -405,7 +395,7 @@ variable. Following table lists all the available flags to run the tests:
Example command to run full test suite skipping the test for DataScienceCluster deletion.

```shell
make e2e-test -e OPERATOR_NAMESPACE=<namespace> -e E2E_TEST_FLAGS="--skip-deletion=true"
make e2e-test OPERATOR_NAMESPACE=<namespace> E2E_TEST_FLAGS="--skip-deletion=true"
```

Example commands to run test suite for the dashboard `component` only, with the operator running out of the cluster.
Expand Down
Loading

0 comments on commit b27759b

Please sign in to comment.