contact: replace ps system call with psutil #4421
0861e15 to 2fad30e
cmd = ['cylc', 'psutil']
metric = f'[["Process", {pid}]]'
if is_remote_host(host):
    cmd = _construct_ssh_cmd(cmd, host)
Note: `construct_ssh_cmd` takes `platform` as an argument and respects the platform config (incl. SSH settings), whereas `_construct_ssh_cmd` takes `host` as an argument and uses the config for `localhost`.
We were doing `construct_ssh_cmd(get_platform())` before, which is effectively the same.
@@ -98,7 +97,6 @@ def daemonize(schd):
"host": schd.host,
"url": workflow_url,
"pub_url": pub_url,
- "ps_opts": PS_OPTS,
Removed one field from the daemonization output.
cylc/flow/scheduler.py
Outdated
fields.PID:
    str(proc.pid),
fields.COMMAND:
    ' '.join(proc.cmdline()),
Split the `PROCESS` contact file field into `PID` and `COMMAND`.
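As a rough illustration of the split, the sketch below builds the two fields for the current process. The `CYLC_WORKFLOW_COMMAND` name comes from the diff in this PR; the `CYLC_WORKFLOW_PID` value and the helper name `process_contact_fields` are assumptions made for the example, and the standard-library fallback exists only so the sketch runs where `psutil` is not installed.

```python
import os
import sys

try:
    import psutil  # third-party library this PR switches to
except ImportError:  # fallback keeps the sketch runnable without psutil
    psutil = None

# COMMAND's value appears in the diff; PID's value is an assumption.
PID = 'CYLC_WORKFLOW_PID'
COMMAND = 'CYLC_WORKFLOW_COMMAND'


def process_contact_fields():
    """Return the PID and COMMAND fields for the current process.

    Hypothetical helper, not the project's actual function.
    """
    if psutil is not None:
        proc = psutil.Process()  # no argument: the current process
        pid, cmdline = proc.pid, proc.cmdline()
    else:
        pid, cmdline = os.getpid(), [sys.executable, *sys.argv]
    return {PID: str(pid), COMMAND: ' '.join(cmdline)}
```

Keeping the two values in separate fields means a reader of the contact file can use the PID directly without having to parse it out of a combined string.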
From a quick look through, looks good, and much clearer than the old code 👍
2fad30e to 56242d2
702f976 to 5a05884
Code looks good. I'm on Windows at the moment, so will test tomorrow with that alpine container from my Linux box 👍 thanks!
"""The process ID of the running workflow on ``CYLC_WORKFLOW_HOST``."""

COMMAND = 'CYLC_WORKFLOW_COMMAND'
"""The command that was used to run the workflow on ``CYLC_WORKFLOW_HOST```.
`CYLC_WORKFLOW_HOST` has one more quote on its right hand side?
The integration test is failing on macOS (GitHub); putting this on hold until resolved.
5a05884 to 3044ba1
Co-authored-by: Bruno P. Kinoshita <kinow@users.noreply.github.com>
3044ba1 to 7377b3c
The macOS failure was related to the parallelism of tests on the GH runner. The PID test I had written was flagging false positives which couldn't be safely fixed, so I have removed it.
A GH macOS job is still failing 😰
Different test thankfully, should pass on rerun.
Tested on Alpine container with this diff:
diff --git a/cylc-flow/8.0b2/Dockerfile b/cylc-flow/8.0b2/Dockerfile
index 34ce558..804c351 100644
--- a/cylc-flow/8.0b2/Dockerfile
+++ b/cylc-flow/8.0b2/Dockerfile
@@ -43,7 +43,7 @@ RUN apk update && \
g++ \
linux-headers \
python3-dev && \
- pip install --no-cache-dir "cylc-flow==${cylc_docker_version}" && \
+ pip install --no-cache-dir git+https://github.com/oliver-sanders/cylc-flow.git@7377b3cffe57ac01102cbc77e0ae669266bd95cf#egg=cylc-flow && \
apk del build-dependencies
# Add a non-root user
@@ -69,7 +69,7 @@ RUN chown -R ${USER_NAME}:${USER_GROUP} ${USER_ROOT} && \
# FIXME: https://github.com/cylc/cylc-flow/issues/4416
# hacking due to incompatible ps args
-RUN sed -i 's/wopid/opid/' /usr/local/lib/python3.7/site-packages/cylc/flow/workflow_files.py
+# RUN sed -i 's/wopid/opid/' /usr/local/lib/python3.7/site-packages/cylc/flow/workflow_files.py
Workflow ran with no issues this time 🎉 Thanks!
raise CylcError(
    f'The workflow is no longer running at {host}:{port}\n'
    f'It has moved to {contact_host}:{contact_port}'
)
This causes `cylc stop --poll` to fail if the target workflow got restarted quickly, hence failure of `tests/f/restart/00-pre-initial.t`. I'll push a quick fix and merge the PR if it works.
(The fix is to intercept CylcError in the poller, not to avoid raising it here).
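A minimal sketch of what intercepting the error in a poller could look like. Everything here is hypothetical: the `CylcError` class is a stand-in for the project's exception, `check_contact` and `poll_until_stopped` are invented names, and treating the error as "the workflow has stopped here" is an assumption about the intended behaviour rather than the actual fix.

```python
import time


class CylcError(Exception):
    """Stand-in for the project's exception class (assumption)."""


def check_contact(host, port, contact_host, contact_port):
    """Return True if the contact details match, else raise CylcError."""
    if (host, port) != (contact_host, contact_port):
        raise CylcError(
            f'The workflow is no longer running at {host}:{port}\n'
            f'It has moved to {contact_host}:{contact_port}'
        )
    return True


def poll_until_stopped(check, interval=0.0, max_polls=10):
    """Poll ``check`` until the workflow is no longer running here.

    ``check`` returns True while the workflow is still running. A
    CylcError is caught and treated as "stopped at this location",
    so the poller finishes cleanly instead of propagating the error.
    Returns False if the workflow is still running after max_polls.
    """
    for _ in range(max_polls):
        try:
            if not check():
                return True
        except CylcError:
            return True
        time.sleep(interval)
    return False
```

With this shape the raising code stays as it is, and only the caller that expects the workflow to disappear decides that the error is not a failure.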
Closes #4416
Closes #3965
pid & contact fields.
`ps` call with Python `psutil`.
Requirements check-list
`CONTRIBUTING.md` and added my name as a Code Contributor.
`setup.py` and `conda-environment.yml`.