Job execution time limit enhancement #1929
c7a3e38 to 6c19328
@@ -11,8 +11,8 @@ title = "User Guide [runtime] example."
OBS:succeed-all => bar"""
[runtime]
[[root]] # base namespace for all tasks (defines suite-wide defaults)
[[[job submission]]]
method = at_now
Was at_now not different from at?
This was a very old example from before at_now was renamed to at (so in theory we can run at with different arguments, not just now).
Preliminary review and testing look OK to me. @cylc/core - please be aware of this change (though the old syntax is still supported).
9dc5516 to 855daa4
'execution retry delays': vdr(
vtype='interval_minutes_list', default=[]),
'execution time limit': vdr(
vtype='interval_seconds', default=[]),
Default shouldn't be a list?
I'll change this to have no default, i.e. None.
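To illustrate the point of this exchange: for a scalar interval setting, None can plainly mean "not set", whereas an empty list is a value the setting could never otherwise hold. The sketch below is a hypothetical stand-in and not cylc's actual vdr validator; the names SPEC, get_setting and describe_time_limit are assumptions made for illustration only.

```python
# Hypothetical stand-in for a config spec; NOT cylc's real validator API.
SPEC = {
    # List-valued setting: an empty list is a natural "nothing set" default.
    "execution retry delays": {"vtype": "interval_minutes_list", "default": []},
    # Scalar setting: None means "no limit configured".
    "execution time limit": {"vtype": "interval_seconds", "default": None},
}


def get_setting(config, key):
    """Return the configured value for key, or the spec default."""
    return config.get(key, SPEC[key]["default"])


def describe_time_limit(config):
    """Describe the time limit, treating None as 'no limit'."""
    limit = get_setting(config, "execution time limit")
    if limit is None:
        return "no time limit"
    return "limit of %d seconds" % limit
```

With a None default, consumers only need a single `is None` check instead of special-casing an empty list for a value that is otherwise a number of seconds.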
Fixed typos.
Two tests failing on my laptop VM (no proper batch systems):
with
It appears to be because
The test failure is good. It means your environment is killing the job with the desired signal (XCPU). For some reason, I am unable to get XCPU in my environment. I'll modify the tests to expect either XCPU or ERR.
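For readers following along, here is a rough sketch of the two ideas in this comment: running a job under the timeout command so that it is killed with XCPU at the time limit, and a test check that accepts either XCPU or ERR. It assumes GNU coreutils timeout and its --signal option; the exact command line that cylc generates is not shown in this thread, and the helper names wrap_with_timeout and is_expected_failure are hypothetical.

```python
# Assumption: GNU coreutils `timeout` is available. `--signal` is a real
# option of that command, but the exact invocation cylc generates is not
# shown in this thread. Helper names here are hypothetical.
ACCEPTED_FAILURE_SIGNALS = ("XCPU", "ERR")


def wrap_with_timeout(command, time_limit_seconds, signal_name="XCPU"):
    """Return an argv list that runs command under `timeout`."""
    return [
        "timeout",
        "--signal=%s" % signal_name,
        str(int(time_limit_seconds)),
    ] + list(command)


def is_expected_failure(reported_signal):
    """True if the reported signal is one the modified tests would accept."""
    return reported_signal in ACCEPTED_FAILURE_SIGNALS
```

For example, `wrap_with_timeout(["sleep", "10"], 5)` builds an argv list that could be passed to `subprocess.call`; whether the job then reports XCPU or ERR depends on the environment, which is why the check accepts both.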
Test(s) modified.
Please give me a chance to comment before merging - thanks
81bcb4a to bc610c8
Branch squashed and re-based.
Review 1: this is good as far as I'm concerned, a nice improvement.
3ef01a2 to a1c5be6
Branch re-based. More docs added.
@dpmatthews docs updated.
I haven't reviewed all the documentation changes but the bits I wanted adding are now there - thanks
Modify suite.rc spec

* New [job] section for task run time
  * execution time limit
  * execution|submission retry delays
  * execution|submission polling intervals
  * batch system name and submit command template
  * shell
* [event hooks] => [events]

From the new execution time limit setting,

* Generate time limit directives where the actual directive is not specified.
* Generate logic to run background or at jobs with the timeout command.
* On reaching the time limit, poll the job with configurable delays. The default delays are intervals of 1, 2 and 7 minutes, i.e. polls at roughly 1, 3 and 10 minutes after reaching the time limit.
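The relationship between the default delays (1, 2 and 7 minutes) and the poll times (roughly 1, 3 and 10 minutes after the time limit) in the commit message above is a running sum. A minimal sketch, where the names DEFAULT_POLL_DELAYS_MINUTES and poll_offsets are made up for illustration and are not cylc identifiers:

```python
from itertools import accumulate

# Default delays between successive polls once the time limit is reached,
# in minutes (values taken from the commit message; names are hypothetical).
DEFAULT_POLL_DELAYS_MINUTES = [1, 2, 7]


def poll_offsets(delays):
    """Return minutes past the time limit at which each poll happens."""
    return list(accumulate(delays))


print(poll_offsets(DEFAULT_POLL_DELAYS_MINUTES))  # [1, 3, 10]
```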
a1c5be6 to 99d05ef
Branch re-based. Conflicts resolved. Tests OK (with and without global configuration) in my environment.
\lstinline=wall_clock_limit= directive. The setting is assumed to be the soft
limit. The hard limit will be set by adding an extra minute to the soft limit.
Do not specify the \lstinline=wall_clock_limit= directive explicitly if
\lstinline=execution time limit= is specified, or it will cause confusion.
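The arithmetic described in this doc hunk can be sketched as follows. This only illustrates "hard limit = soft limit + one minute"; the function name wall_clock_limits is hypothetical, and how the two values are formatted into the actual directive is not shown here.

```python
def wall_clock_limits(execution_time_limit_seconds):
    """Return (soft, hard) limits in seconds.

    The execution time limit is taken as the soft limit, and the hard
    limit is the soft limit plus an extra minute, as the doc text states.
    """
    soft = int(execution_time_limit_seconds)
    hard = soft + 60
    return soft, hard
```

For instance, an execution time limit of 300 seconds would give a soft limit of 300 and a hard limit of 360.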
Functionally, what happens here? (parallel suite hat on here)
Generally looks good to me, couple of typos and a question but otherwise all good. Test battery passing in my environment.
All doc comments fixed.
Doc changes look good to me.
Old configuration sections should just be deprecated and removed. Old items were not upgraded due to new settings already having default values.
Old configuration sections need to be deprecated with no new settings then removed.
Fix #1929 configuration upgrade logic
Modernise docs and tests for cylc/cylc-flow#1929 etc
Modify suite.rc spec
From the new execution time limit setting,
The default delays are intervals of 1, 2 and 7 minutes,
i.e. roughly 1, 3 and 10 minutes after reaching the time limit.
Close #1718.