New Manifests for required CRs installation #105

Closed
Jooho opened this issue Oct 13, 2023 · 14 comments · Fixed by opendatahub-io/opendatahub-operator#691

@Jooho

Jooho commented Oct 13, 2023

/kind feature

Describe the solution you'd like
To install the prerequisites for KServe, users currently have to create several manifests manually.
To improve the installation experience, we should create new kustomize manifests. The opendatahub operator will deploy these manifests using custom manifests.

Target manifests

  • Service Mesh
    • istio-proxies-monitor.yaml
    • istiod-monitor.yaml
    • smcp.yaml
    • smmr-X.yaml
  • Serverless
    • knativeserving-istio.yaml
    • gateways.yaml
    • certificate.yaml

custom-manifests
This task needs to be done after this issue


@israel-hdez

Would we assume a clean cluster? Or do we need to take into account existing installations?

@israel-hdez israel-hdez moved this from New/Backlog to In Progress in ODH Model Serving Planning Oct 20, 2023
@israel-hdez

It's been decided to implement this in the operator, using DSCI for dependency installation.
Some code is going to be extracted from opendatahub-io/opendatahub-operator#605 to support it.

Ongoing/done tasks:

@Jooho

Jooho commented Oct 31, 2023

@israel-hdez we need to create monitoring-related manifests too.

oc apply -f custom-manifests/service-mesh/istiod-monitor.yaml 
oc apply -f custom-manifests/service-mesh/istio-proxies-monitor.yaml
oc apply -f custom-manifests/metrics/kserve-prometheus-k8s.yaml

If you have any questions, please reach out to @VedantMahabaleshwarkar.

@heyselbi

heyselbi commented Nov 1, 2023

@israel-hdez

Prerequisites configuration in operator: opendatahub-io/opendatahub-operator#691

@israel-hdez

israel-hdez commented Nov 7, 2023

Posting here because the operator team has its own testing process, and I didn't want to add noise there with these instructions.

Operator PR testing instructions

PR link: opendatahub-io/opendatahub-operator#691

Smoke testing can be done by deploying a development build of the operator (rather than installing it from OpenShift OperatorHub). Given a container build of the operator, the deployment can be done via kustomize.

Installing pre-requisites

The odh-operator will install the operands, but it won't install the operators. If you are using a clean OpenShift cluster, follow the official instructions to install Service Mesh and Serverless operators. If you already have these operators installed, go directly to deploying and testing the PR.

Note: install only the operators, and do not create any instances of the operands.
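
Before deploying, it may help to confirm that no operands exist yet. The following is a minimal sketch, not part of the PR: the function name check_no_operands is hypothetical, and it assumes the operands are exposed under the resource names servicemeshcontrolplanes.maistra.io and knativeservings.operator.knative.dev, which may differ with the installed operator versions.

```shell
# Sketch: report pre-existing Service Mesh / Serverless operands.
# Assumption: the resource names below match the installed operator versions.
# Returns 0 when no operand is found, 1 otherwise.
check_no_operands() {
  rc=0
  for operand_kind in servicemeshcontrolplanes.maistra.io knativeservings.operator.knative.dev; do
    # "-o name" prints one "<resource>/<name>" line per instance.
    # Errors (for example, the CRD is not installed) are treated as "none found".
    instances="$(oc get "$operand_kind" --all-namespaces -o name 2>/dev/null || true)"
    if [ -n "$instances" ]; then
      echo "Found existing $operand_kind:" >&2
      echo "$instances" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `check_no_operands && echo "cluster is clean"` prints the message only when neither kind of operand is present.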

Deploying and testing the PR

Assuming testing is done in an OpenShift cluster, you can use OpenShift Builds to create the container image, which simplifies the process a little. You can find a BuildConfig in this gist. Download it to an empty directory and name it kserve-build-configs.yaml:

cd $(mktemp -d)
wget https://gist.githubusercontent.com/israel-hdez/0cdbc7c7e4ee77633e3cd70aa2a0667f/raw/dccd477aafe613a47f95acdd22994db7f17ea26e/kserve-build-configs.yaml

Create the opendatahub namespace in advance and create the BuildConfig:

oc new-project opendatahub
oc apply -f kserve-build-configs.yaml -l odh-component=operator

Create an ImageStream to store the image of the operator and issue a build of the pull request:

oc create imagestream -n opendatahub odh-operator
oc start-build -n opendatahub  odh-operator-bc --commit pull/691/head --wait

The terminal will block until the build finishes. Once done, deploy the operator. You will need the following kustomization.yaml file, which you should save in your (no longer) empty directory:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- github.com/opendatahub-io/opendatahub-operator/config/default?ref=pull/691/head

namespace: opendatahub
images:
- name: REPLACE_IMAGE
  newName: image-registry.openshift-image-registry.svc:5000/opendatahub/odh-operator
  newTag: latest

Create and apply the manifests (i.e. deploy the operator) using this file:

kustomize build . | oc apply -f -

Once the operator pod is ready, proceed to edit the DSCInitialization resource that the operator should create automatically:

oc edit dsci default

In your editor, find the fields spec.serviceMesh.managementState and spec.serverless.managementState, set both to Managed, and save the changes.
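
As a non-interactive alternative to the editor, the same change could be sketched with a merge patch. This is an assumption-laden sketch: the function name enable_dsci_dependencies is hypothetical, and it assumes the resource is reachable as "dsci default" (as in the oc edit command above) with the two field paths exactly as named in these instructions.

```shell
# Sketch: set both managementState fields to Managed without opening an editor.
# Assumption: field paths are spec.serviceMesh.managementState and
# spec.serverless.managementState, as described in the instructions.
enable_dsci_dependencies() {
  oc patch dsci default --type merge \
    -p '{"spec":{"serviceMesh":{"managementState":"Managed"},"serverless":{"managementState":"Managed"}}}'
}
```

Calling `enable_dsci_dependencies` once should be enough; a merge patch leaves the other fields of the resource untouched.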

Right after saving, the odh-operator should reconcile the changes and install Service Mesh and Knative Serving, configuring both to work together.

After the Service Mesh and Knative Serving installations are finished, you should be able to install KServe by creating/editing a DataScienceCluster resource as usual. Then, you should be able to deploy an InferenceService, which should reply to requests without any further configuration of the cluster.
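
For reference, a minimal DataScienceCluster that enables only KServe might look like the sketch below. It is an illustration rather than the manifest used in the PR: the function name print_kserve_dsc and the resource name example-dsc are hypothetical, and the apiVersion and field layout are assumed to match the operator build under test.

```shell
# Sketch: print a minimal DataScienceCluster manifest that enables KServe.
# Assumptions: apiVersion datasciencecluster.opendatahub.io/v1 and the
# spec.components.kserve.managementState field; "example-dsc" is a made-up name.
print_kserve_dsc() {
  cat <<'EOF'
apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: example-dsc
spec:
  components:
    kserve:
      managementState: Managed
EOF
}
```

The manifest can then be applied with `print_kserve_dsc | oc apply -f -`.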

@bdattoma

bdattoma commented Nov 8, 2023

> The odh-operator will install the operands, but it won't install the operators. If you are using a clean OpenShift cluster, follow the official instructions to install Service Mesh and Serverless operators. If you already have these operators installed

What is the reason behind not installing the operators? @israel-hdez

@Jooho

Jooho commented Nov 8, 2023

@bdattoma Because users will install the operators according to the official docs.

@bdattoma

bdattoma commented Nov 8, 2023

@Jooho hm okay..

One question @israel-hdez @Jooho: what is expected to happen if ServiceMesh and Serverless CRs (operands) are already deployed for other purposes?

@israel-hdez

> @Jooho hm okay..
>
> One question @israel-hdez @Jooho: what is expected to happen if ServiceMesh and Serverless CRs (operands) are already deployed for other purposes?

Since, at the moment, we do not support re-configuring an existing instance, the operator should just stop and fail with an error.

Integrating with an existing instance would be a more complex case, and I don't think we have enough time to review the impact in depth. It could be an enhancement for a later iteration.

@israel-hdez

> @bdattoma Because users will install the operators according to the official docs.

Adding to this, installing the operators is another "layer" where the same technical difficulty repeats: what to do if the operators are (or are not) already installed, and what to do if the installed operator is not the version we expect/support (since operator CRDs can also change in structure). This is not clearly defined, whereas we can easily state in the docs which operator versions are supported and need to be installed by users.

@bdattoma

bdattoma commented Nov 9, 2023

> the operator should just stop and fail with error.

So the RHODS operator detects that the operands are already present and fails? Or do you mean that the ServiceMesh and Serverless operators will fail?

@israel-hdez

> the operator should just stop and fail with error.
>
> So the RHODS operator detects that the operands are already present and fails? Or do you mean that the ServiceMesh and Serverless operators will fail?

The former :-)
The ODH/RHODS operator should check whether the operands are present, and fail if any of them are.

@bdattoma

bdattoma commented Nov 9, 2023

Okay, thanks! This should go in the documentation then.

@github-project-automation github-project-automation bot moved this from In Progress to Done in ODH Model Serving Planning Nov 14, 2023
Jooho pushed a commit to Jooho/kserve that referenced this issue Jan 11, 2024
Jooho pushed a commit to Jooho/kserve that referenced this issue Feb 28, 2024