Skip to content

Commit

Permalink
fix: Fix namespace-scoped install
Browse files Browse the repository at this point in the history
Motivation

There are some problems when installing in namespace-scoped mode:
- Though the docs mentioned that the RBAC should be changed, the install script currently applies the controller ClusterRole and ClusterRoleBinding (including namespace resource permission) regardless of whether the --namespace-scope-mode option is provided
- The combination of the NAMESPACE_SCOPE=true env var and namespace resource permissions results in a broken/conflicted state where the controller will "mostly" operate in cluster scoped mode and watch / delete resources in other namespaces

Modifications
- Split RBAC manifests into cluster-scope and namespace-scope variants, the latter including Role/RoleBinding instead of Cluster level equivalents
- Have install script apply the appropriate RBAC based on the namespace-scope-mode cli setting (in addition to setting the controller env var as is already done)
- Change the controller init logic to always use namespace mode if the env var is set
- Update documentation accordingly

Result

Namespace scoped install works properly, won't touch any other namespaces

Signed-off-by: Nick Hill <nickhill@us.ibm.com>
  • Loading branch information
njhill committed Aug 19, 2022
1 parent d879588 commit c670458
Show file tree
Hide file tree
Showing 30 changed files with 285 additions and 36 deletions.
1 change: 0 additions & 1 deletion config/default/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -92,5 +92,4 @@ apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../crd
- ../rbac
- ../manager
17 changes: 17 additions & 0 deletions config/rbac/cluster-scope/kustomization.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# Copyright 2022 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
resources:
- ../common
- role.yaml
- role_binding.yaml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Copyright 2021 IBM Corporation
# Copyright 2022 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Copyright 2021 IBM Corporation
# Copyright 2022 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
Expand Down
File renamed without changes.
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,6 @@
# limitations under the License.
resources:
- service-account.yaml
- role.yaml
- role_binding.yaml
- leader_election_role.yaml
- leader_election_role_binding.yaml
- restricted_scc_role.yaml
Expand Down
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
17 changes: 17 additions & 0 deletions config/rbac/namespace-scope/kustomization.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# Copyright 2022 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
resources:
- ../common
- role.yaml
- role_binding.yaml
166 changes: 166 additions & 0 deletions config/rbac/namespace-scope/role.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,166 @@
# Copyright 2022 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: modelmesh-controller-role
rules:
- apiGroups:
- ""
resources:
- configmaps
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- ""
resources:
- endpoints
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- secrets
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- ""
resources:
- services
- services/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- apps
resources:
- deployments
- deployments/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- monitoring.coreos.com
resources:
- servicemonitors
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- serving.kserve.io
resources:
- inferenceservices
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- serving.kserve.io
resources:
- inferenceservices/finalizers
verbs:
- get
- patch
- update
- apiGroups:
- serving.kserve.io
resources:
- inferenceservices/status
verbs:
- get
- patch
- update
- apiGroups:
- serving.kserve.io
resources:
- predictors
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- serving.kserve.io
resources:
- predictors/finalizers
verbs:
- get
- patch
- update
- apiGroups:
- serving.kserve.io
resources:
- predictors/status
verbs:
- get
- patch
- update
- apiGroups:
- serving.kserve.io
resources:
- servingruntimes
- servingruntimes/finalizers
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- serving.kserve.io
resources:
- servingruntimes/status
verbs:
- get
- patch
- update
24 changes: 24 additions & 0 deletions config/rbac/namespace-scope/role_binding.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
# Copyright 2022 IBM Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: modelmesh-controller-rolebinding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: modelmesh-controller-role
subjects:
- kind: ServiceAccount
name: modelmesh-controller
4 changes: 2 additions & 2 deletions docs/install/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@

- **S3-compatible object storage** - Before models can be deployed, a remote S3-compatible datastore is needed from which to pull the model data. This could be for example an [IBM Cloud Object Storage](https://www.ibm.com/cloud/object-storage) instance, or a locally running [MinIO](https://github.com/minio/minio) deployment. Note that this is not required to be in place prior to the initial installation.

We provide an install script to quickly run ModelMesh Serving with a provisioned etcd server. This may be useful for experimentation or development but should not be used in production.
We provide an install script `--quickstart` option to quickly run ModelMesh Serving with a provisioned etcd server. This may be useful for experimentation or development but should not be used in production.

## Cluster Scope or Namespace Scope

Expand All @@ -19,7 +19,7 @@ ModelMesh Serving can be used in either cluster scope or namespace mode.
- **Cluster scope mode** - Its components can exist in multiple user namespaces which are controlled by one instance of ModelMesh Serving Controller in the control plane namespace. Only one ModelMesh Serving instance can be installed within a Kubernetes cluster. A namespace label `modelmesh-enabled` needs to be "true" to enable a user namespace for ModelMesh Serving.
- **Namespace scope mode** - All of its components must exist within a single namespace and only one instance of ModelMesh Serving can be installed per namespace. Multiple ModelMesh Serving instances can be installed in separate namespaces within the cluster.

The default configuration is for the cluster scope mode. Change RBAC permissions, cluster role to role, and cluster role binding to role binding, to deploy ModelMesh Serving in the namespace scope mode.
The default configuration is for the cluster scope mode. Use the `--namespace-scope-mode` option of the install script for namespace scope.

## Deployed Components

Expand Down
42 changes: 20 additions & 22 deletions main.go
Original file line number Diff line number Diff line change
Expand Up @@ -171,7 +171,6 @@ func main() {
var leaseDuration time.Duration
var leaseRenewDeadline time.Duration
var leaseRetryPeriod time.Duration
var clusterScopeMode bool
flag.StringVar(&metricsAddr, "metrics-addr", ":8080", "The address the metric endpoint binds to.")
flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.")
flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,
Expand All @@ -187,43 +186,42 @@ func main() {
"Duration the Leader elector clients should wait between tries of actions.")
flag.Parse()

mgrNamespace := ""
trueString := "true"

// Controller can be in namespace or cluster scope mode depending on an env variable
clusterScopeMode = os.Getenv(NamespaceScopeEnvVar) != trueString
clusterScopeMode := os.Getenv(NamespaceScopeEnvVar) != "true"

// Here we check whether RBAC is set for cluster scope
err = cl.Get(context.Background(), client.ObjectKey{Name: "foo"}, &corev1.Namespace{})
hasClusterPermissions := err == nil || errors.IsNotFound(err)

if err == nil || errors.IsNotFound(err) {
// Controller has cluster permissions
if clusterScopeMode {
setupLog.Info("Controller operating in cluster scope mode, will attempt to watch all namespaces")
} else {
// Config mismatch, namespace mode with cluster permissions, will continue with a log
setupLog.Info("In namespace scope mode but controller has cluster scope permissions, continue")
}
} else {
// Controller has namespace permissions
if clusterScopeMode {
if clusterScopeMode {
if !hasClusterPermissions {
// Config mismatch, cluster mode without cluster permissions, exit
setupLog.Error(fmt.Errorf("Insufficient permission for controller"), "In cluster scope mode but controller has namespace scope permissions, exit")
setupLog.Error(nil, "In cluster scope mode but controller does not have cluster scope permissions, exiting")
os.Exit(1)
} else {
mgrNamespace = ControllerNamespace
setupLog.Info("Controller operating in own-namespace only mode")
}
setupLog.Info("Controller operating in cluster scope mode, will attempt to watch/manage all namespaces")
} else {
// Namespace-scope mode configured
setupLog.Info("Controller operating in namespace scope (own-namespace only) mode",
"namespace", ControllerNamespace)

if hasClusterPermissions {
setupLog.Error(nil, "Warning: In namespace scope mode but controller has permission to access cluster namespace resources")
}
}

mgrOpts := ctrl.Options{
Scheme: scheme,
MetricsBindAddress: metricsAddr,
Namespace: mgrNamespace,
Port: 9443,
HealthProbeBindAddress: probeAddr,
}

if !clusterScopeMode {
// Set manager to operate scoped to our namespace
mgrOpts.Namespace = ControllerNamespace
}

if enableLeaderElection {
if leaderElectionType == "lease" {
setupLog.Info("using leader-with-lease election method")
Expand Down Expand Up @@ -317,7 +315,7 @@ func main() {
registryMap[registryKey] = registryValue
setupLog.Info(fmt.Sprintf("Reconciliation of %s is enabled", resourceName))
return true
} else if envVarVal == trueString {
} else if envVarVal == "true" {
// If env var is explicitly true, require that specified CRD is present
setupLog.Error(err, fmt.Sprintf("Unable to access %s Custom Resource", resourceName))
os.Exit(1)
Expand Down
17 changes: 15 additions & 2 deletions scripts/delete.sh
Original file line number Diff line number Diff line change
Expand Up @@ -82,9 +82,12 @@ if [[ -n $namespace ]]; then
fi

# Ensure the namespace is overridden for all the resources
cd default
pushd default
kustomize edit set namespace "$namespace"
cd ..
popd
pushd ../rbac/namespace-scope
kustomize edit set namespace "$namespace"
popd

# Older versions of kustomize have different load restrictor flag formats.
# Can be removed once Kubeflow installation stops requiring v3.2.
Expand Down Expand Up @@ -115,7 +118,17 @@ if [[ ! -z $user_ns_array ]]; then
rm runtimes.yaml
fi

# Determine whether a modelmesh-controller-rolebinding clusterrolebinding exists and is
# associated with the service account in this namespace. If not, don't delete the cluster level RBAC.
set +e
crb_ns=$(oc get clusterrolebinding modelmesh-controller-rolebinding -o json | jq -r .subjects[0].namespace)
set -e
if [[ "$crb_ns" == "$namespace" ]]; then
echo "deleting cluster scope RBAC"
kustomize build rbac/cluster-scope | kubectl delete -f - --ignore-not-found=true
fi
kustomize build default | kubectl delete -f - --ignore-not-found=true
kustomize build rbac/namespace-scope | kubectl delete -f - --ignore-not-found=true
kustomize build runtimes ${kustomize_load_restrictor_arg} | kubectl delete -f - --ignore-not-found=true
kubectl delete -f dependencies/quickstart.yaml --ignore-not-found=true
kubectl delete -f dependencies/fvt.yaml --ignore-not-found=true
Expand Down
Loading

0 comments on commit c670458

Please sign in to comment.