Namespace stuck in Terminating mode due to Finalizer in spec and ansible-operator not triggering. #2362

Closed
RushinthJohn opened this issue Jan 4, 2020 · 9 comments
Labels
language/ansible Issue is related to an Ansible operator project triage/support Indicates an issue that is a support question.

Comments

@RushinthJohn

Bug Report

What did you do?
I created an ansible-operator that watches Namespaces.
The Ansible role is triggered successfully whenever a namespace is created.
But when I delete a namespace, it gets stuck in Terminating mode.

What did you expect to see?
When the namespace is deleted, the Ansible role mentioned in the finalizer should be triggered.

What did you see instead? Under which circumstances?
When the namespace was deleted, the Ansible role was not triggered and the namespace got stuck in Terminating mode.

Environment

  • operator-sdk version: v0.12.0

  • go version: go1.13.3 linux/amd64

  • Kubernetes version information:
    Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.11-gke.14", GitCommit:"56d89863d1033f9668ddd6e1c1aea81cd846ef88", GitTreeState:"clean", BuildDate:"2019-11-07T19:12:22Z", GoVersion:"go1.12.11b4", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes cluster kind: GKE

  • Are you writing your operator in ansible, helm, or go?
    ansible

  • ansible-operator image: quay.io/operator-framework/ansible-operator:v0.13.0

Additional context
The watches.yaml looks like:

- version: v1
  kind: Namespace
  role: /opt/ansible/roles/acme
  finalizer:
    name: kubernetes
    role: /opt/ansible/roles/acme

The roles/acme/tasks/main.yml looks like:

- debug: msg='Role triggered'
@RushinthJohn
Author

RushinthJohn commented Jan 4, 2020

This is similar to #1493, #1503, #1513

@camilamacedo86
Contributor

PS: #1493 was closed in favour of #1513, which has the steps to reproduce the scenario and a POC.

@hasbro17 hasbro17 added the language/ansible Issue is related to an Ansible operator project label Jan 6, 2020
@RushinthJohn
Author

Hi @camilamacedo86, #1513 has the steps to reproduce the scenario where the ansible-operator watches a CR. In this case, however, the ansible-operator I created watches not a CR but a Namespace, which is a built-in type. As you can see from my watches.yaml, the playbook/role is triggered when a new Namespace is created.

However, the playbook/role was not triggered when I deleted a Namespace, so I added the 'kubernetes' finalizer to watches.yaml. Now when I delete a Namespace, the playbook/role is still not triggered, and the Namespace is stuck in Terminating mode because of the 'kubernetes' finalizer present on it.
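
For readers unfamiliar with why a finalizer can leave an object stuck, here is a minimal Python sketch of the general Kubernetes finalizer behaviour. It is a toy model, not operator-sdk or API server code: `FakeObject`, `request_delete`, `remove_finalizer`, and the finalizer name `example.com/acme-finalizer` are all hypothetical names chosen for illustration, and it does not attempt to diagnose the cause of this particular issue.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeObject:
    """Toy stand-in for a Kubernetes object (hypothetical, not a client type)."""
    name: str
    finalizers: List[str] = field(default_factory=list)
    deletion_requested: bool = False

    @property
    def phase(self) -> str:
        # An object marked for deletion that still exists shows as Terminating.
        return "Terminating" if self.deletion_requested else "Active"


def _collect(store: Dict[str, FakeObject], name: str) -> None:
    """Remove the object only if deletion was requested and no finalizers remain."""
    obj = store[name]
    if obj.deletion_requested and not obj.finalizers:
        del store[name]


def request_delete(store: Dict[str, FakeObject], name: str) -> None:
    """Mark the object for deletion; it stays in the store while finalizers exist."""
    store[name].deletion_requested = True
    _collect(store, name)


def remove_finalizer(store: Dict[str, FakeObject], name: str, finalizer: str) -> None:
    """What a controller is expected to do once its cleanup has succeeded."""
    obj = store[name]
    obj.finalizers = [f for f in obj.finalizers if f != finalizer]
    _collect(store, name)


if __name__ == "__main__":
    store = {"demo": FakeObject("demo", finalizers=["example.com/acme-finalizer"])}
    request_delete(store, "demo")
    print(store["demo"].phase)   # still present, because the finalizer remains
    remove_finalizer(store, "demo", "example.com/acme-finalizer")
    print("demo" in store)       # the object is gone once the list is empty
```

In this model, an object whose finalizer is never removed (for example because the handler that owns it never runs) stays in the Terminating phase indefinitely, which matches the symptom described above.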

@RushinthJohn
Author

The steps to reproduce this issue are:

  1. Build an ansible-operator with the mentioned watches.yaml and an Ansible role that just prints out a message
  2. Deploy the operator and create a Namespace
  3. By viewing the logs you can see that the Ansible role was triggered
  4. Now delete the created Namespace
  5. You will be able to see from the logs that the Ansible role was not triggered and the deleted Namespace is stuck in terminating mode.
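
Steps 2 to 5 above can be sketched as a small Python helper that builds the corresponding `kubectl` invocations. This is only a sketch under assumptions: the namespace name `demo-ns` and the operator Deployment name `acme-operator` are hypothetical, the operator from step 1 is assumed to be already built and deployed, and the commands are printed rather than executed.

```python
import shlex
from typing import List


def repro_commands(namespace: str, operator_deployment: str) -> List[List[str]]:
    """Build kubectl argument lists for steps 2-5 of the reproduction."""
    return [
        # Step 2: create a Namespace for the operator to react to.
        ["kubectl", "create", "namespace", namespace],
        # Step 3: check the operator logs for the role being triggered.
        ["kubectl", "logs", f"deployment/{operator_deployment}", "--all-containers=true"],
        # Step 4: delete the Namespace without blocking the shell.
        ["kubectl", "delete", "namespace", namespace, "--wait=false"],
        # Step 5: inspect the phase and any finalizers still on the Namespace.
        ["kubectl", "get", "namespace", namespace, "-o", "jsonpath={.status.phase}"],
        ["kubectl", "get", "namespace", namespace, "-o",
         "jsonpath={.metadata.finalizers} {.spec.finalizers}"],
    ]


def format_command(argv: List[str]) -> str:
    """Render an argument list as a shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Hypothetical names; substitute your own namespace and Deployment.
    for argv in repro_commands("demo-ns", "acme-operator"):
        print(format_command(argv))
```

To run the commands instead of printing them, each argument list could be passed to `subprocess.run(argv, check=False)` against a cluster where the operator is deployed.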

@RushinthJohn RushinthJohn changed the title from "Namespace stuck in Terminating mode due to Finalizer in spec and ansible-operator not trgerring." to "Namespace stuck in Terminating mode due to Finalizer in spec and ansible-operator not triggering." Jan 7, 2020
@camilamacedo86 camilamacedo86 added the triage/support Indicates an issue that is a support question. label Jan 7, 2020
@camilamacedo86
Contributor

Hi @fabianvf, could you give a hand here? WDYT?

@fabianvf
Member

fabianvf commented Jan 7, 2020

@RushinthJohn is the operator getting triggered on update? I'm curious whether this is an issue specifically with deletion, or whether we're not getting any events that come after creation, period.

@RushinthJohn
Author

@fabianvf I performed an update on the Namespace by adding labels to its metadata, and the Ansible role was not triggered. I'm not sure whether adding labels constitutes an update event.

@RushinthJohn
Author

I didn't face this issue with the latest version of the ansible-operator base image. Closing this issue.

@camilamacedo86
Contributor

camilamacedo86 commented Feb 27, 2020

Hi @RushinthJohn,

When the role: /opt/ansible/roles/acme executes successfully, the CR will be deleted. It will get stuck only if the playbook/role called in the finalizer: has issues.

However, the 0.14 release included a bug fix that may also be related: the Ansible-based image was fixed so that reconciliation is re-triggered when playbooks finish with an error. See the CHANGELOG.

So this may also be the reason you are no longer facing this issue. Thank you for your reply.
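
The behaviour described in this comment (the finalizer is released only after the role succeeds, and a failed run is retried) can be illustrated with a conceptual Python sketch. It is not the ansible-operator's actual implementation: `finalize`, `run_role`, `max_attempts`, and the finalizer name are hypothetical, and a real controller requeues with backoff rather than looping in place.

```python
from typing import Callable, List


def finalize(
    run_role: Callable[[], bool],
    finalizers: List[str],
    finalizer_name: str,
    max_attempts: int = 5,
) -> int:
    """Run the finalizer role until it succeeds, then drop the finalizer.

    Returns the number of attempts made. If every attempt fails, the
    finalizer is left in place, so the object would remain Terminating.
    """
    for attempt in range(1, max_attempts + 1):
        if run_role():
            # Cleanup succeeded: release the object for deletion.
            if finalizer_name in finalizers:
                finalizers.remove(finalizer_name)
            return attempt
        # Cleanup failed: keep the finalizer and try again (re-trigger).
    return max_attempts


if __name__ == "__main__":
    # Hypothetical role that fails twice before succeeding.
    outcomes = iter([False, False, True])
    finalizers = ["example.com/acme-finalizer"]
    attempts = finalize(lambda: next(outcomes), finalizers, "example.com/acme-finalizer")
    print(attempts, finalizers)
```

Without the retry, a single failed run would leave the finalizer in place with nothing scheduled to remove it, which is one way an object can stay stuck.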
