Job remains pending then waiting few minutes #8080

Closed
zdriic opened this issue Sep 4, 2020 · 6 comments

Comments

@zdriic

zdriic commented Sep 4, 2020

ISSUE TYPE
  • Bug Report
COMPONENT NAME
  • UI
  • task runner
SUMMARY

I installed AWX 14.1.0 successfully, but when I run the default project the job sits in "pending", then in "waiting", for several minutes before it actually runs. Why?
There are no scheduled tasks and AWX is empty (no projects, templates, etc.). I haven't added anything.

ENVIRONMENT
  • AWX version: 14.1.0
  • AWX install method: docker on linux
  • Ansible version: 2.9.10
  • Operating System: Ubuntu 18.04
  • Web Browser: Firefox
STEPS TO REPRODUCE

Database As A Service : PostgreSQL 10

When I run the default project, the job sits in "pending", then in "waiting", for several minutes before it actually runs.
There are no scheduled tasks and AWX is empty (no projects, templates, etc.), just the default project.
I haven't added anything.

In awx_task logs I have this :

2020-09-03 06:43:18,179 WARNING awx.main.dispatch scaling down worker pid:5778
2020-09-03 06:43:18,179 DEBUG awx.main.dispatch task acfa974a-1e43-4d3e-bfc5-3fdda446414c starting awx.main.scheduler.tasks.run_task_manager([])
2020-09-03 06:43:18,189 DEBUG awx.main.scheduler Running Tower task manager.
2020-09-03 06:43:18,197 DEBUG awx.main.tasks Starting periodic scheduler
2020-09-03 06:43:18,195 WARNING awx.main.dispatch worker exiting gracefully pid:5778
2020-09-03 06:43:18,202 DEBUG awx.main.tasks Last scheduler run was: 2020-09-03 06:43:18.124123+00:00
2020-09-03 06:43:18,203 DEBUG awx.main.tasks Not running periodic scheduler, another task holds lock
2020-09-03 06:43:18,215 DEBUG awx.main.scheduler Not running scheduler, another task holds lock
2020-09-03 06:43:18,219 DEBUG awx.main.scheduler Not running scheduler, another task holds lock
2020-09-03 06:43:18,235 DEBUG awx.main.scheduler Not running scheduler, another task holds lock
2020-09-03 06:43:18,248 DEBUG awx.main.dispatch task e53ebc4c-5766-4be0-8adb-7cdefbacd558 starting awx.main.tasks.awx_k8s_reaper([])
2020-09-03 06:43:18,248 DEBUG awx.main.dispatch task b6915098-ee1d-4614-9b76-4c27d4a16c2b starting awx.main.tasks.cluster_node_heartbeat([])
2020-09-03 06:43:18,249 DEBUG awx.main.dispatch task d8f5b7ae-0e2b-40b9-861a-03360dcb262d starting awx.main.scheduler.tasks.run_task_manager([])
2020-09-03 06:43:18,251 DEBUG awx.main.scheduler Running Tower task manager.
2020-09-03 06:43:18,251 DEBUG awx.main.dispatch task a032b520-5009-4763-956a-b5e8739ab4a2 starting awx.main.tasks.awx_periodic_scheduler(*[])
2020-09-03 06:43:18,255 DEBUG awx.main.tasks Cluster node heartbeat task.
2020-09-03 06:43:18,273 DEBUG awx.main.tasks Starting periodic scheduler
2020-09-03 06:43:18,282 DEBUG awx.main.scheduler Finishing Scheduler
2020-09-03 06:43:18,282 DEBUG awx.main.scheduler Not running scheduler, another task holds lock
2020-09-03 06:43:18,285 DEBUG awx.main.tasks Last scheduler run was: 2020-09-03 06:43:18.198730+00:00
RESULT 2
OKREADY
RESULT 2
OKREADY
RESULT 2
OKREADY
RESULT 2
OKREADY
RESULT 2
OKREADY
RESULT 2
OKREADY
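
To get a rough sense of how often the scheduler is skipped because of the lock, the log lines above can be tallied with a small script. This is only a minimal sketch: it assumes the `timestamp LEVEL logger message` layout seen in this excerpt, and the function names (`count_messages`, `lock_skips`) are made up for illustration, not part of AWX.

```python
import re
from collections import Counter

# Matches lines such as:
# 2020-09-03 06:43:18,215 DEBUG awx.main.scheduler Not running scheduler, another task holds lock
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) (?P<message>.*)$"
)


def count_messages(lines):
    """Count occurrences of each (logger, message) pair in awx_task log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            # Skip lines without a timestamp, e.g. "RESULT 2" / "OKREADY".
            continue
        counts[(match.group("logger"), match.group("message"))] += 1
    return counts


def lock_skips(lines):
    """Number of log lines reporting that another task holds the lock."""
    counts = count_messages(lines)
    return sum(n for (_, msg), n in counts.items() if "another task holds lock" in msg)


sample = [
    "2020-09-03 06:43:18,203 DEBUG awx.main.tasks Not running periodic scheduler, another task holds lock",
    "2020-09-03 06:43:18,215 DEBUG awx.main.scheduler Not running scheduler, another task holds lock",
    "2020-09-03 06:43:18,219 DEBUG awx.main.scheduler Not running scheduler, another task holds lock",
    "RESULT 2",
    "OKREADY",
]
print(lock_skips(sample))  # 3
```

A high count on its own does not prove a problem, since some skipped runs are expected when several scheduler tasks start close together. It is more useful when compared between a slow setup and a working one.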

EXPECTED RESULTS

The job must be executed immediately.

ACTUAL RESULTS

The job remains "pending" for a few minutes before switching to "waiting"; a few minutes later it moves to "running" and then runs fine.

ADDITIONAL INFORMATION

It seems to be the same as #3189 or #5617.

Thanks in advance

@ryanpetrello
Contributor

ryanpetrello commented Sep 4, 2020

> Database As A Service : PostgreSQL 10

Does this mean you're using a cloud-based external database, or postgres in a local container?

@zdriic
Author

zdriic commented Sep 7, 2020

> > Database As A Service : PostgreSQL 10
>
> Does this mean you're using a cloud-based external database, or postgres in a local container?

Yes, I'm using a cloud-based external database.
But it worked well with AWX 9.3.0.

Could it be related to the default setting "Isolated Launch Timeout: 600"?

@ryanpetrello
Contributor

ryanpetrello commented Sep 8, 2020

@zdriic I don't think so.

> Not running scheduler, another task holds lock

This suggests to me that AWX's periodic task manager is busy scheduling other work. Do you have any details on the specs of your postgres server? You might want to investigate slow queries on your postgres server - generally speaking, with cloud-based databases you're sort of at the mercy of their postgres tuning, and it's pretty easy to see issues like the one you're describing with an underprovisioned postgres instance (or a slow underlying disk).
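
One way to look for slow queries is to ask PostgreSQL which statements have been running for a while, using the standard `pg_stat_activity` view. The following is a minimal sketch, not something AWX ships: the helper names (`fetch_long_running`, `summarize`) and the one-second threshold are assumptions for illustration. It expects a DB-API connection that uses the `%s` parameter style, for example one created with psycopg2.

```python
from datetime import timedelta

# pg_stat_activity and the columns used here (pid, state, query_start, query)
# are part of PostgreSQL itself; the threshold is passed as an interval.
LONG_RUNNING_SQL = """
SELECT pid, state, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND query_start IS NOT NULL
  AND now() - query_start > %s
ORDER BY duration DESC
"""


def fetch_long_running(conn, threshold=timedelta(seconds=1)):
    """Return (pid, state, duration, query) rows for statements older than threshold.

    `conn` is any DB-API 2.0 connection with the %s paramstyle,
    e.g. psycopg2.connect(...) pointed at the AWX database (an assumption here).
    """
    cur = conn.cursor()
    try:
        cur.execute(LONG_RUNNING_SQL, (threshold,))
        return cur.fetchall()
    finally:
        cur.close()


def summarize(rows):
    """Format rows from fetch_long_running as one short line per statement."""
    return [
        f"pid={pid} state={state} duration={duration} query={(query or '')[:60]!r}"
        for pid, state, duration, query in rows
    ]
```

Running this a few times while a job is stuck in "pending" should show whether queries from AWX linger on the database side. On a managed database the account may only see its own sessions, so an empty result is not conclusive.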

@zdriic
Author

zdriic commented Sep 15, 2020

> @zdriic I don't think so.
>
> > Not running scheduler, another task holds lock
>
> This suggests to me that AWX's periodic task manager is busy scheduling other work. Do you have any details on the specs of your postgres server? You might want to investigate slow queries on your postgres server - generally speaking, with cloud-based databases you're sort of at the mercy of their postgres tuning, and it's pretty easy to see issues like the one you're describing with an underprovisioned postgres instance (or a slow underlying disk).

Hi @ryanpetrello

It seems that the problem comes from the connection between AWX and an external database, and has been present since AWX 11.0.0 (or, I would guess, since the first version that used Redis instead of RabbitMQ, so AWX 10.0.0).
We tried AWX 14.1.0 with the database in a local container, and the jobs ran immediately.

It seems to be the same problem as issue #7273.

Thanks for your help

@ryanpetrello
Contributor

ryanpetrello commented Sep 15, 2020

Duplicate of #7253

@ryanpetrello ryanpetrello marked this as a duplicate of #7273 Sep 15, 2020
@ryanpetrello ryanpetrello marked this as a duplicate of #7253 Sep 15, 2020
@bdoublet91

Hi,
I would like to reopen this issue.
I read #7253, but I don't think it's the same problem.
The first time I installed AWX, I used docker-compose and an external PostgreSQL 11 with AWX 16.0.
awx and awx_task launched without errors, but jobs stayed in pending and AWX didn't schedule any tasks to awx_task.
I tried versions 17.0 and 15.0.1 with an external PostgreSQL database, but the only way to make it work was to run postgres as a container, as you do with the Ansible deployment.
I don't know if my problem exactly matches this issue, but I have seen a lot of issues involving AWX and an external database.
