WIP: execute user job scripts in subshell #2885
The trouble with this is, we occasionally want to deliberately modify the cylc environment. The test battery is failing, for instance (well, maybe this is the only use case) because job scripts in some tests unset the EXIT trap in order to exit early without triggering a task failed message. |
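To illustrate the point above, here is a minimal sketch of why clearing the EXIT trap stops working once user code runs in a subshell. It is not the actual Cylc job script: the outer command substitution merely stands in for the job script, and the trap message is a hypothetical placeholder for the real failure handler.

```bash
#!/usr/bin/env bash
# Sketch only: the outer $( ... ) plays the role of the job script,
# the inner ( ... ) plays the role of the user script in a subshell.
outcome=$(
    # Hypothetical stand-in for the handler installed by the job script.
    trap 'echo "job EXIT trap fired"' EXIT

    (
        # User code tries to disable the handler and exit early.
        # This only changes the trap table of the inner subshell.
        trap - EXIT
        exit 0
    )
    echo "user script status: $?"
)

echo "$outcome"
# Prints:
# user script status: 0
# job EXIT trap fired
```

The outer trap still fires, so a test that relies on `trap - EXIT` to suppress the handler would need another way to signal an intentional early exit, for example through the exit status of the subshell.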
Doing this correctly should help address #2043. But we need to consider a few things:
|
(
This (2nd point) had me scratching my head for a bit while testing the new |
@hjoliver, @matthewrmshin, feel free to close this PR if you think it's not appropriate. I just thought the current setup, where user code can trample over the Cylc environment with no restrictions, is not very robust. |
@TomekTrzeciak - the subshell is better in that respect, we just need to make sure we can handle all use cases... |
I agree in principle, but we do deliberately prefix all Cylc environment variables with |
An example from the past was using As you mentioned, you may want to set your own traps for signals. There are also system environment variables that can muck around with your execution environment in unpredictable ways (like LD_PRELOAD). Allowing user code to run without separation from Cylc code just makes things fragile and limits what you can do in your job script. |
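As a rough illustration of the separation being argued for here, the sketch below shows that environment changes made inside a subshell do not leak back to the calling shell. `WRAPPER_STATE` and `USER_ONLY` are hypothetical names chosen for the example, not real Cylc variables.

```bash
#!/usr/bin/env bash
# Hypothetical variable standing in for state the wrapper relies on.
WRAPPER_STATE="set by wrapper"
export WRAPPER_STATE
path_before=$PATH

(
    # Stand-in for user code that alters its own environment.
    unset WRAPPER_STATE
    PATH=/nonexistent
    export USER_ONLY="visible only inside the subshell"
)

# Back in the wrapper: none of the changes above are visible here.
echo "WRAPPER_STATE=${WRAPPER_STATE}"
echo "USER_ONLY=${USER_ONLY:-<unset>}"
if [ "$PATH" = "$path_before" ]; then
    echo "PATH unchanged"
fi
```

The same holds for variables such as LD_PRELOAD: whatever the user code exports affects only the commands it launches itself.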
@TomekTrzeciak - fair enough (I've seen the |
@matthewrmshin - you probably have the best understanding of the complications you refer to above #2885 (comment) do you think they can be addressed easily enough now, or should we close this PR for now and reference it in a new Issue for later development? |
Ref #2885 (comment) Bullet points 1 and 3 should be easy, but require some careful testing. Bullet point 2 is tricky to do correctly (we need to trap a user abort only once). @TomekTrzeciak Do you want to continue to pursue this PR? (I can talk you through the issues.) Or do you want to bury this for now? |
@matthewrmshin I'm still interested in this, but depends a bit on how much work it would require. As for bullet point 2 from your list (user aborts), how about combining it with point 3 (batch system signals) in the following way:
|
@TomekTrzeciak I'll close this PR down for now. Please raise a new PR if you want to continue with this work. |
@matthewrmshin @TomekTrzeciak - I think we generally agree that the intent of this PR is desirable, so probably should open a new issue as a placeholder for post-cylc-8 work (unless @TomekTrzeciak can do it now, considering the complications we raised above)? |
Can probably add to #2043. |
Done. (Added to #2043) |
@hjoliver, @matthewrmshin, I'm unlikely to find the time for it this month. I could implement something that I think is the right behaviour quite quickly, but getting all the testing and documentation sorted out at the same time to get this PR over the line would take me too long. |
@TomekTrzeciak - no worries, just flagging that we're up for it if you do happen to have the time! |
Add execution of user job scripts in a subshell to isolate the Cylc job script from environment changes made in user scripts.
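The general idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from this PR: `run_isolated`, `user_script_ok` and `user_script_fail` are hypothetical names, and signal handling is left out entirely.

```bash
#!/usr/bin/env bash
# Hypothetical user script bodies.
user_script_ok() { cd /; return 0; }
user_script_fail() { echo "something went wrong" >&2; exit 3; }

# Hypothetical helper: run a command in a subshell so that cd, exit,
# traps and variable changes stay local to that subshell.
run_isolated() {
    ( "$@" )
}

start_dir=$PWD
ok_status=0
fail_status=0
run_isolated user_script_ok || ok_status=$?
run_isolated user_script_fail 2>/dev/null || fail_status=$?

echo "ok_status=${ok_status} fail_status=${fail_status}"
# The wrapper is still running and still in its original directory.
[ "$PWD" = "$start_dir" ] && echo "working directory unchanged"
```

With this arrangement the wrapper learns about failure from the exit status of the subshell, so an `exit` in user code ends only the user script rather than the whole job script.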