Kill job on task state reset from submitted or running #2621
Comments
Note that #2600 disallows manual reset to "ready", although I suppose simply retriggering a running task, or resetting it to "waiting", will have exactly the same effect!
(It's arguable that this is a bug IMO, although I agree that attempting to kill the original job is preferable anyway.)
To further improve the issue reported in #2528, the logic for job 2 submission should check that job 1 is no longer running. It can then decide to either:
To be safe, I think that if you try to trigger or reset the state of a submitted or running task then, by default, this should fail. We would then need a force mode to override this.
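To make the "fail by default, force to override" idea concrete, here is a minimal sketch. It is not Cylc's code: `TaskProxy`, `ActiveTaskError`, `reset_state` and the `force` argument are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass

# States in which a task is assumed to have a live job (assumption for this sketch).
ACTIVE_STATES = {"submitted", "running"}


class ActiveTaskError(Exception):
    """Raised when a reset targets a task that may still have a live job."""


@dataclass
class TaskProxy:
    # Hypothetical stand-in for a task proxy; not the real Cylc class.
    name: str
    state: str


def reset_state(task: TaskProxy, new_state: str, force: bool = False) -> None:
    """Reset a task's state, refusing by default if the task is active."""
    if task.state in ACTIVE_STATES and not force:
        raise ActiveTaskError(
            f"{task.name} is {task.state}; pass force=True to reset anyway"
        )
    task.state = new_state
```

With this shape, `reset_state(task, "waiting")` on a running task raises, and only `reset_state(task, "waiting", force=True)` goes through, so an accidental reset cannot silently orphan a job.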
Perhaps a warning prompt or message could be issued or logged (just in the GUI? the interactive CLI?) on task reset/trigger, before killing any running or submitted jobs that are found. This could be achieved via an optional request argument, defaulting to 'cancel_job=True' (kill the existing running/submitted job); when set to False, the response would include a warning message and the job would not be killed... What should we do on failure to kill job 1?
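A rough sketch of how such a request argument might behave, purely as an illustration. Only the `cancel_job` name and its default come from the comment above; `handle_reset`, the `kill_job` callback and the response dictionary are hypothetical. Refusing the reset when the kill fails is just one possible answer to the open question, not a settled decision.

```python
from typing import Callable, Dict, List

# States in which a task is assumed to have a live job (assumption for this sketch).
ACTIVE_STATES = {"submitted", "running"}


def handle_reset(
    task: Dict[str, str],
    new_state: str,
    kill_job: Callable[[str], bool],
    cancel_job: bool = True,
) -> Dict[str, object]:
    """Reset a task, optionally killing its existing job first.

    kill_job is a hypothetical callback that returns True if the kill succeeded.
    """
    warnings: List[str] = []
    if task["state"] in ACTIVE_STATES:
        if not cancel_job:
            # Caller opted out of the kill: warn instead of killing.
            warnings.append(
                f"{task['name']}: existing {task['state']} job was not killed"
            )
        elif not kill_job(task["name"]):
            # One possible policy on kill failure: leave the state untouched.
            return {
                "ok": False,
                "warnings": [f"{task['name']}: could not kill existing job"],
            }
    task["state"] = new_state
    return {"ok": True, "warnings": warnings}
```

Returning the warning in the response, rather than only logging it, would let both the GUI and the CLI decide how to present it.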
In 7.7+, messages from old jobs are ignored.
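For readers unfamiliar with the idea, one way to ignore stale messages is to compare a submit number carried by each message against the task's current one. The sketch below assumes that mechanism; it is not taken from the Cylc source, and `Task`, `process_message` and the message strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical task record; submit_num counts job submissions for this task.
    name: str
    submit_num: int
    state: str = "submitted"


def process_message(task: Task, msg_submit_num: int, message: str) -> bool:
    """Apply a job message to the task; return False if it was ignored."""
    if msg_submit_num != task.submit_num:
        # Message comes from an earlier job of the same task: ignore it.
        return False
    if message == "started":
        task.state = "running"
    elif message in ("succeeded", "failed"):
        task.state = message
    return True
```

Under this scheme, a late "succeeded" from job 1 cannot mark the task as done while job 2 is still the current submission.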
The only problem left is that job 1 may continue to occupy the same computing resource that job 2 will require, eventually causing job 2 to fail.
True, but not really a Cylc issue. And given the reset/re-trigger is done manually, the user will have to be confident in their
Also closed by #3515
From #2528.
Resetting the state of a submitted or running task to ready without first killing the original job can lead to multiple jobs existing for the same task. This should also be handled correctly: the suite should make an automatic attempt to kill the original job when a user resets the state of a submitted or running task.
Somewhat related (although not a submit number issue): careless (but common) use of suicide triggers can result in removing an active task proxy. Currently we just log a warning about this; we should probably kill the active job as well.
See:
#2528 (comment)
#2528 (comment)
#2528 (comment)
See also: #2199 #2394 #2506 #2618