Abort job script with custom message. #2298
Some quick answers:
Some quick questions:
a7b2747 to f6c3405
60bb2bb to 9146b42
[critical handler functionality moved to #2304]
Done.
334878e to 09a09e7
lib/cylc/task_message.py
Outdated
    self.CYLC_JOB_EXIT, "EXIT"))
job_status_file.write("%s=%s\n" % (
    self.CYLC_JOB_EXIT_MESSAGE,
    message.replace(self.ABORT_MESSAGE_PREFIX, "")))
Not sure if this is the right thing to do here? (new CYLC_JOB_EXIT_MESSAGE in status files, just for aborted jobs)
I could just set CYLC_JOB_EXIT to the abort message?
I think you can just set CYLC_JOB_EXIT to the abort message. If I remember correctly, it can be anything other than SUCCEEDED, ERR and EXIT. cylc jobs-poll uses it as the value of the signal, which can be any single-line string.
Otherwise, if we want to introduce CYLC_JOB_EXIT_MESSAGE, I suppose CYLC_JOB_EXIT should be set to ABORT or ABRT (or something like that), and you may want to ensure that cylc jobs-poll can return it.
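To make the two options concrete, here is a minimal sketch of how a KEY=VALUE job status file could be read and its CYLC_JOB_EXIT value interpreted. It assumes the one-pair-per-line format implied by the `"%s=%s\n"` write in the diff above, and it takes the three reserved values from the comment above (which was itself stated from memory). The helper names `parse_status` and `describe_exit` are hypothetical and are not part of Cylc.

```python
# Values the comment above names as reserved for CYLC_JOB_EXIT.
RESERVED_EXIT_VALUES = {"SUCCEEDED", "ERR", "EXIT"}


def parse_status(text):
    """Parse KEY=VALUE lines into a dict (hypothetical helper).

    Only the first "=" splits the line, so a message that itself
    contains "=" is kept intact.
    """
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            status[key] = value
    return status


def describe_exit(status):
    """Classify the CYLC_JOB_EXIT value (hypothetical helper)."""
    value = status.get("CYLC_JOB_EXIT")
    if value is None:
        return "no exit recorded"
    if value in RESERVED_EXIT_VALUES:
        return value
    # Anything else is a signal name or, under this proposal,
    # a single-line custom abort message.
    return "signal or custom message: " + value


if __name__ == "__main__":
    sample = "CYLC_JOB_EXIT=ERROR: rust never sleeps\n"
    print(describe_exit(parse_status(sample)))
```

Under this reading, storing the abort message directly in CYLC_JOB_EXIT needs no new key, provided the message stays on a single line.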
I actually used CYLC_JOB_EXIT originally, then changed it. I'll revert to that.
Good point - so I left the abort call as a shell function.
092888b to e4da521
(rebased to remove the CYLC_JOB_EXIT_MESSAGE change).
@matthewrmshin - please assign a second reviewer.
Some minor comments for the tests.
tests/events/34-task-abort.t
Outdated
suite_run_ok $TEST_NAME cylc run --no-detach --debug $SUITE_NAME
#-------------------------------------------------------------------------------
LOG="${SUITE_RUN_DIR}/log/job/1/foo/NN/job-activity.log"
cat "${LOG}" | grep "event-handler" > 'edited-job-activity.log'
Can just do:
grep 'event-handler' "${LOG}" > 'edited-job-activity.log'
of course (I cut-n-pasted then cut down another test 😬)...
script = """
echo ONE
cylc__job_abort "ERROR: rust never sleeps"
echo TWO"""
Do we need to add a test to ensure job.out only has ONE and not TWO?
Good spotting - that was my original intention (ran out of time earlier)
done
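For illustration, the check discussed in this thread (output contains ONE but never TWO) can be sketched as below. The shell function `demo_abort` is a hypothetical stand-in written for this example, not the real `cylc__job_abort`. It only assumes that an abort reports its message and then exits the job script with a non-zero status. The sketch also assumes a POSIX `sh` is available on the machine running it.

```python
import subprocess

# A toy job script. demo_abort is a hypothetical stand-in for the
# abort function: print the message to stderr, then exit non-zero.
SCRIPT = '''
demo_abort() {
    echo "$1" >&2
    exit 1
}
echo ONE
demo_abort "ERROR: rust never sleeps"
echo TWO
'''


def run_script(script):
    """Run a script with sh; return (exit code, stdout lines, stderr text)."""
    proc = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True
    )
    return proc.returncode, proc.stdout.splitlines(), proc.stderr.strip()


if __name__ == "__main__":
    code, out_lines, err = run_script(SCRIPT)
    # The script must stop at the abort call: ONE is printed, TWO is not.
    print(code, out_lines, err)
```

Because `exit` inside a shell function terminates the whole script, nothing after the abort call runs, which is the behaviour the extra job.out test is meant to pin down.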
Feedback addressed. Over 'n out.
Close #2274
general CRITICAL event handler (c.f. our current WARNING handler) [now in Add CRITICAL event handler. #2304]
Test case to show the abort function works properly with execution retries (abort message is logged each try, but only passed to the failed handler at last try) and compare with trapped exit:
@matthewrmshin - see if you agree with the approach before I document and add tests. It's more or less in line with our discussion on #2274.
Note slightly unorthodox implementation by means of a shell function that the user has to call in task scripting. Probably should hook this into cylc message or a new command cylc abort? (It would be nice to avoid yet another command, but cylc message doesn't seem very appropriate in this case... cylc message --abort maybe ...)
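The retry behaviour named in the description (abort message logged on each try, passed to the failed handler only on the last try) can be modelled with a small toy simulation. This is not Cylc's scheduler code: the function `run_with_retries` and its data shapes are hypothetical, chosen only to make the expected bookkeeping explicit.

```python
def run_with_retries(attempts, max_tries):
    """Toy model of execution retries (hypothetical, not Cylc code).

    attempts: one entry per try, either an abort message (str)
              or None if that try succeeds.
    Returns (log, handler_calls):
      log           - (try number, message) for every aborted try
      handler_calls - ("failed", message), only if the final allowed
                      try aborts
    """
    log = []
    handler_calls = []
    for try_num, message in enumerate(attempts[:max_tries], start=1):
        if message is None:
            # The job succeeded on this try; no failed handler is called.
            return log, handler_calls
        # Every abort is logged, whichever try it happens on.
        log.append((try_num, message))
        if try_num == max_tries:
            # Only the last allowed try reaches the failed handler.
            handler_calls.append(("failed", message))
    return log, handler_calls


if __name__ == "__main__":
    log, calls = run_with_retries(["boom", "boom", "boom"], max_tries=3)
    print(log)
    print(calls)
```

A test along these lines would assert three log entries and a single failed-handler call for a job that aborts on all three tries.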