Reconciler stops when Permissions object can't find user or vhost. Error vhost_or_user_not_found #372

Closed
carlosjgp opened this issue May 20, 2022 · 3 comments
Labels: bug, closed-stale, stale

Comments

@carlosjgp

Describe the bug

While trying to work around this issue by deleting the User and its associated Secret, so that the Operator deletes the user and creates it again with a new password, I found this bug.

The reconciler loop fails to delete: at least one User is not deleted, apparently because the Permission object can't figure out the username (?).

To Reproduce

Steps to reproduce the behavior:

  1. Install the RabbitMQ operator: https://artifacthub.io/packages/helm/bitnami/rabbitmq
  2. Apply YAML:initial (see below).
  3. Run kubectl delete -f YAML:new
  4. Observe the errors (sorry for the format; the logs have been parsed):
    Error 400 (bad_request): vhost_or_user_not_found + stack trace
github.com/rabbitmq/messaging-topology-operator/controllers.(*PermissionReconciler).Reconcile
/bitnami/blacksmith-sandox/rmq-messaging-topology-operator-1.5.0/src/github.com/rabbitmq/rmq-messaging-topology-operator/controllers/permission_controller.go:130
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile	
/bitnami/blacksmith-sandox/rmq-messaging-topology-operator-1.5.0/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler	
/bitnami/blacksmith-sandox/rmq-messaging-topology-operator-1.5.0/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem	
/bitnami/blacksmith-sandox/rmq-messaging-topology-operator-1.5.0/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2	
/bitnami/blacksmith-sandox/rmq-messaging-topology-operator-1.5.0/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227

Reconciler error + stack trace

sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/bitnami/blacksmith-sandox/rmq-messaging-topology-operator-1.5.0/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/bitnami/blacksmith-sandox/rmq-messaging-topology-operator-1.5.0/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
  5. Apply the new User CRs and Secrets with new passwords, expecting them to be updated: kubectl apply -f YAML:new

Include any YAML or manifest necessary to reproduce the problem.

YAML:initial

apiVersion: v1
kind: Secret
metadata:
  name: my-user
  annotations: {}
  labels: {}
type: Opaque
stringData:
  username: my-user
  password:  super-secret
---
apiVersion: rabbitmq.com/v1beta1
kind: User
metadata:
  name: my-user
  annotations: {}
  labels: {}
spec:
  tags:
  - management # available tags are 'management', 'policymaker', 'monitoring' and 'administrator'
  rabbitmqClusterReference:
    name: rabbitmq
    namespace: rabbitmq
  importCredentialsSecret:
    name: my-user
---
apiVersion: rabbitmq.com/v1beta1
kind: Permission
metadata:
  name: my-user
  annotations: {}
  labels: {}
spec:
  vhost: /
  user: my-user # Same as the username in the secret
  permissions:
    write: ".*"
    configure: ".*"
    read: ".*"
  rabbitmqClusterReference:
    name: rabbitmq
    namespace: rabbitmq
---
apiVersion: v1
kind: Secret
metadata:
  name: my-other-user
  annotations: {}
  labels: {}
type: Opaque
stringData:
  username: my-other-user
  password:  other-super-secret
---
apiVersion: rabbitmq.com/v1beta1
kind: User
metadata:
  name: my-other-user
  annotations: {}
  labels: {}
spec:
  tags:
  - management # available tags are 'management', 'policymaker', 'monitoring' and 'administrator'
  rabbitmqClusterReference:
    name: rabbitmq
    namespace: rabbitmq
  importCredentialsSecret:
    name: my-other-user
---
apiVersion: rabbitmq.com/v1beta1
kind: Permission
metadata:
  name: my-other-user
  annotations: {}
  labels: {}
spec:
  vhost: /
  user: my-other-user # Same as the username in the secret
  permissions:
    write: ".*"
    configure: ".*"
    read: ".*"
  rabbitmqClusterReference:
    name: rabbitmq
    namespace: rabbitmq

YAML:new

apiVersion: v1
kind: Secret
metadata:
  name: my-user
  annotations: {}
  labels: {}
type: Opaque
stringData:
  username: my-user
  password:  new-more-awesom-secret
---
apiVersion: rabbitmq.com/v1beta1
kind: User
metadata:
  name: my-user
  annotations: {}
  labels: {}
spec:
  tags:
  - management # available tags are 'management', 'policymaker', 'monitoring' and 'administrator'
  rabbitmqClusterReference:
    name: rabbitmq
    namespace: rabbitmq
  importCredentialsSecret:
    name: my-user
---
apiVersion: v1
kind: Secret
metadata:
  name: my-other-user
  annotations: {}
  labels: {}
type: Opaque
stringData:
  username: my-other-user
  password:  other-new-more-awesom-secret
---
apiVersion: rabbitmq.com/v1beta1
kind: User
metadata:
  name: my-other-user
  annotations: {}
  labels: {}
spec:
  tags:
  - management # available tags are 'management', 'policymaker', 'monitoring' and 'administrator'
  rabbitmqClusterReference:
    name: rabbitmq
    namespace: rabbitmq
  importCredentialsSecret:
    name: my-other-user

Expected behavior

  • Reconciler ignores Permission objects that can't be reconciled.
  • Users are deleted
  • Users are created again with new passwords

Version and environment information

  • Helm chart: bitnami/rabbitmq-cluster-operator:2.6.2
  • Messaging Topology Operator: 1.5.0
  • RabbitMQ: 3.9.13
  • RabbitMQ Cluster Operator: 1.13.0
  • Kubernetes: 1.21
  • Cloud provider or hardware configuration: AWS EKS

@carlosjgp carlosjgp added the bug Something isn't working label May 20, 2022
@ChunyiLyu
Contributor

ChunyiLyu commented May 24, 2022

Hi @carlosjgp, I cannot reproduce the error you've shown with the first YAML. Everything applied successfully, without any errors, on my first try:

❯ k apply -f first.yaml
secret/my-user configured
user.rabbitmq.com/my-user configured
permission.rabbitmq.com/my-user created
secret/my-other-user created
user.rabbitmq.com/my-other-user created
permission.rabbitmq.com/my-other-user created

To reproduce your situation, I created an additional permission.rabbitmq.com object that refers to a non-existent user, in order to get the vhost_or_user_not_found error:

---
apiVersion: rabbitmq.com/v1beta1
kind: Permission
metadata:
  name: test-perm
spec:
  vhost: /
  user: invalid
  permissions:
    write: ".*"
    configure: ".*"
    read: ".*"
  rabbitmqClusterReference:
    name: sample

and I was able to see the same error, as there is no user named invalid in my RabbitMQ:

{"level":"error","ts":1653397662.9820182,"logger":"controller.permission","msg":"failed to set permission","reconciler group":"rabbitmq.com","reconciler kind":"Permission","name":"test-perm","namespace":"rabbitmq-system","user":"invalid","vhost":"/","error":"Error 400 (bad_request): vhost_or_user_not_found","stacktrace":"github.com/rabbitmq/messaging-topology-operator/controllers.(*PermissionReconciler).Reconcile\n\t/workspace/controllers/permission_controller.go:130\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.2/pkg/internal/controller/controller.go:114
...

Then I tried to delete all the previously created objects and had no issue:

❯ k delete -f first.yaml
secret "my-user" deleted
user.rabbitmq.com "my-user" deleted
permission.rabbitmq.com "test-perm" deleted
secret "my-other-user" deleted
user.rabbitmq.com "my-other-user" deleted
permission.rabbitmq.com "my-other-user" deleted

Operator logs:

{"level":"info","ts":1653397706.5631347,"logger":"controller.user","msg":"Deleting","reconciler group":"rabbitmq.com","reconciler kind":"User","name":"my-user","namespace":"rabbitmq-system"}
{"level":"info","ts":1653397706.6012213,"logger":"controller.permission","msg":"Deleting","reconciler group":"rabbitmq.com","reconciler kind":"Permission","name":"test-perm","namespace":"rabbitmq-system"}
{"level":"info","ts":1653397706.6054158,"logger":"controller.permission","msg":"cannot find user or vhost in rabbitmq server; no need to delete permission","reconciler group":"rabbitmq.com","reconciler kind":"Permission","name":"test-perm","namespace":"rabbitmq-system","user":"invalid","vhost":"/"}
{"level":"info","ts":1653397706.6716747,"logger":"controller.user","msg":"Deleting","reconciler group":"rabbitmq.com","reconciler kind":"User","name":"my-other-user","namespace":"rabbitmq-system"}
{"level":"info","ts":1653397706.7114959,"logger":"controller.permission","msg":"Deleting","reconciler group":"rabbitmq.com","reconciler kind":"Permission","name":"my-other-user","namespace":"rabbitmq-system"}
{"level":"info","ts":1653397706.7158935,"logger":"controller.permission","msg":"cannot find user or vhost in rabbitmq server; no need to delete permission","reconciler group":"rabbitmq.com","reconciler kind":"Permission","name":"my-other-user","namespace":"rabbitmq-system","user":"my-other-user","vhost":"/"}

Was your Topology Operator stuck at the error you've shown here, or did it reconcile in the end? Do you still have access to the same environment? I think the full operator logs from when it happened would be very useful here, as I cannot reproduce it.
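If the full logs are large, one way to pull out the relevant entries is to filter the JSON lines by level and reconciler kind. The sketch below assumes the log format quoted in this thread (one JSON object per line with "level", "msg", "reconciler kind", "name" and "error" fields); `logEntry` and `filterErrors` are hypothetical names, and the sample lines in `main` are abbreviated versions of the ones above.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry mirrors the fields visible in the operator log lines quoted in
// this thread. Fields not listed here are ignored when decoding.
type logEntry struct {
	Level string `json:"level"`
	Msg   string `json:"msg"`
	Kind  string `json:"reconciler kind"`
	Name  string `json:"name"`
	Error string `json:"error"`
}

// filterErrors returns the error-level entries for the given reconciler
// kind. Lines that are not valid JSON are skipped. Note that bufio.Scanner
// rejects lines longer than 64 KiB unless its buffer is enlarged.
func filterErrors(logs string, kind string) []logEntry {
	var out []logEntry
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		var e logEntry
		if err := json.Unmarshal(sc.Bytes(), &e); err != nil {
			continue
		}
		if e.Level == "error" && e.Kind == kind {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// Abbreviated versions of two log lines quoted in this thread.
	logs := `{"level":"error","msg":"failed to set permission","reconciler kind":"Permission","name":"test-perm","error":"Error 400 (bad_request): vhost_or_user_not_found"}
{"level":"info","msg":"Deleting","reconciler kind":"User","name":"my-user"}`
	for _, e := range filterErrors(logs, "Permission") {
		fmt.Printf("%s: %s (%s)\n", e.Name, e.Msg, e.Error)
	}
}
```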

@ChunyiLyu ChunyiLyu self-assigned this May 24, 2022
@github-actions

This issue has been marked as stale due to 60 days of inactivity. Stale issues will be closed after a further 30 days of inactivity; please remove the stale label in order to prevent this occurring.

@github-actions github-actions bot added the stale label Jul 24, 2022
@github-actions

Closing stale issue due to further inactivity.
