This repository has been archived by the owner on Mar 20, 2023. It is now read-only.

Strange error when adding pool failed #307

Closed
borisklug opened this issue Aug 30, 2019 · 1 comment

Comments


borisklug commented Aug 30, 2019

Problem Description

We add a pool with about 150 low-priority machines. Sometimes not all nodes reach the idle state; they stay stuck in the starting state. After a while (timeout?) Shipyard seems to detect this and exits with signal 255. The log (see the full log below) contains this error message:

AttributeError: 'NoneType' object has no attribute 'version'

Batch Shipyard Version

3.8.1

Steps to Reproduce

Try to add a pool with a lot of low-priority machines. The situation described above does not happen every time, maybe 1 in 5 attempts or so.

Expected Results

No error at all. The outcome should simply be visible in the Node states table of the statistics output.

Actual Results

Error message, see log below

Additional Logs

Starting
- adding pool 'poolname'
  - [1465] Failed to execute script shipyard
  - Traceback (most recent call last):
  -   File "shipyard.py", line 3135, in <module>
  -   File "site-packages/click/core.py", line 764, in __call__
  -   File "site-packages/click/core.py", line 717, in main
  -   File "site-packages/click/core.py", line 1137, in invoke
  -   File "site-packages/click/core.py", line 1137, in invoke
  -   File "site-packages/click/core.py", line 956, in invoke
  -   File "site-packages/click/core.py", line 555, in invoke
  -   File "site-packages/click/decorators.py", line 64, in new_func
  -   File "site-packages/click/core.py", line 555, in invoke
  -   File "shipyard.py", line 1545, in pool_add
  -   File "convoy/fleet.py", line 3356, in action_pool_add
  -   File "convoy/fleet.py", line 1812, in _add_pool
  -   File "convoy/batch.py", line 956, in create_pool
  -   File "convoy/batch.py", line 894, in wait_for_pool_ready
  -   File "convoy/batch.py", line 757, in _block_for_nodes_ready
  -   File "convoy/batch.py", line 3041, in list_nodes
  - AttributeError: 'NoneType' object has no attribute 'version'
ERROR: Script 'myscript' canceled with signal 255 at line 376: function 'runShipyard'!

alfpark (Collaborator) commented Aug 30, 2019

It's possible that the compute node was preempted while it was initializing. This could have resulted in some fields not being populated.
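
To illustrate the kind of guard that would avoid this `AttributeError`, here is a minimal sketch. It does not reproduce the actual code in `convoy/batch.py`; the `Node` and `AgentInfo` classes and the `agent_version` and `summarize` helpers are hypothetical stand-ins, and the assumption is that the failing line reads `.version` from a nested object that is `None` on a node preempted during startup.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentInfo:
    """Hypothetical stand-in for the nested object that carries 'version'."""
    version: str


@dataclass
class Node:
    """Hypothetical stand-in for a compute node record."""
    id: str
    state: str
    # May be None if the node was preempted before it finished initializing.
    agent_info: Optional[AgentInfo] = None


def agent_version(node: Node, default: str = "unknown") -> str:
    """Return the agent version, or a placeholder when the field is unset."""
    info = node.agent_info
    return info.version if info is not None else default


def summarize(nodes: List[Node]) -> List[str]:
    """Build one status line per node without assuming optional fields exist."""
    return [
        f"{n.id}: state={n.state} agent_version={agent_version(n)}"
        for n in nodes
    ]


if __name__ == "__main__":
    nodes = [
        Node("n0", "idle", AgentInfo("1.0.0")),
        Node("n1", "preempted"),  # optional field never populated
    ]
    for line in summarize(nodes):
        print(line)
```

With a guard like this, a node with unpopulated fields would show up in the node listing with a placeholder value instead of aborting the whole `pool add` run.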
