CLI fails if cylc executables are not in $PATH #3915

Closed
jhaiduce opened this issue Nov 4, 2020 · 7 comments

@jhaiduce
Contributor

jhaiduce commented Nov 4, 2020

Describe the bug
If the cylc commands aren't in $PATH, most CLI commands fail with "No such file or directory" errors.

Release version(s) and/or repository branch(es) affected?
Current master

Steps to reproduce the bug
Install cylc in an environment where its executables are not in $PATH (for example, in a virtualenv or with pip install --user). Then run a cylc command using the full path to the executable (such as /full/path/to/cylc/binaries/cylc --help).

Expected behavior
Should print a help message. Instead the command fails with a "No such file or directory" error because "cylc-help" is not in $PATH.

@jhaiduce jhaiduce added the bug Something is wrong :( label Nov 4, 2020
@kinow
Member

kinow commented Nov 4, 2020

I tried the --user flag (which I use sometimes with pip, but never tried on Cylc):

$ cd cylc-flow
$ pip install --user .
$ cylc run --no-detach five

That works with no issues. Looking where's Cylc installed:

$ which cylc
/home/kinow/.local/bin/cylc

If I modify my $PATH, then I get the errors that I suspect you have, @jhaiduce, where cylc is not found and the command just fails.

I think that means that the default location used by pip when you specify the --user flag is missing from your $PATH, @jhaiduce. When you installed with --user, did you end up with a cylc executable at $HOME/.local/bin/cylc? If so, I guess a simpler fix would be to modify the PATH env var.

@jhaiduce
Contributor Author

jhaiduce commented Nov 4, 2020

Sorry, I should have been more explicit about what I did. I invoked cylc using an explicit path, knowing that cylc wasn't in $PATH. For instance:

$ python -m venv venv
$ venv/bin/pip install .
$ venv/bin/cylc --help

I expected cylc to work (as many commands do) when invoked in this way. Instead, I got

Error: [Errno 2] No such file or directory: 'cylc-help'

Most of the cylc sub-commands fail in a similar way. For instance:

$ venv/bin/cylc validate
Error: [Errno 2] No such file or directory: 'cylc-validate'

It seems that something in the cylc CLI assumes that its entry points are all installed to a directory that is included in $PATH. Obviously this is fairly easy to work around by modifying $PATH. However, in my opinion it would be better to avoid the assumption that the entry points are in $PATH in the first place. That would simplify things for users and would potentially make cylc more resilient to changes in environment.
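To illustrate the idea, here is a minimal sketch of looking up a sub-command next to the invoked executable instead of searching $PATH. It is not cylc's actual implementation: the function name `resolve_subcommand` and the assumption that sub-commands are installed as `cylc-<name>` scripts in the same directory as `cylc` are mine, based only on the error messages above.

```python
import os
import sys
from pathlib import Path


def resolve_subcommand(name, argv0=None):
    """Return the path of `cylc-<name>` beside the invoked script, or None.

    Hypothetical helper: assumes sub-commands are installed in the same
    directory as the main `cylc` script (e.g. venv/bin/).
    """
    argv0 = sys.argv[0] if argv0 is None else argv0
    # Directory that holds the script the user actually ran.
    bin_dir = Path(argv0).resolve().parent
    candidate = bin_dir / f"cylc-{name}"
    # Only accept a regular file that is executable by the current user.
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return str(candidate)
    return None
```

With something like this, `venv/bin/cylc validate` would look for `venv/bin/cylc-validate` directly, whatever $PATH contains. Note that `Path.resolve()` follows symlinks, so a symlinked `cylc` would be resolved to the directory of its target.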

I'm working on a PR to address this. It prevents the problem described above, but the implementation still needs some work before it's ready to merge (my changes break some of the integration and functional tests).

@kinow
Member

kinow commented Nov 5, 2020

@jhaiduce is there a reason why you couldn't source/activate the venv you created?

From your explanation, I am assuming you created a Python virtual environment venv, where Cylc 8 was installed, but you executed the script directly venv/bin/cylc, without activating the environment first?

With the virtual environment activated/source'd, the PATH variable should be automagically updated by the activation script, and the cylc workflows/tasks/jobs should all work fine I think?

@hjoliver
Member

hjoliver commented Nov 5, 2020

@jhaiduce - thanks for getting involved with Cylc 👍 Hopefully we'll see more of you.

On this issue, like @kinow I'm a bit confused about why you tried to run cylc in the way described above. Virtual environments are normally supposed to be "activated" (. venv/bin/activate) which sets your environment (including $PATH) appropriately for the venv, then pip install installs into the venv and everything just works. But is there some reason why you need to do it via direct invocation by file path?

And, to my understanding at least, pip install --user is for installing packages against the system Python (to which you don't have write access) into your home directory, and for that you need to put the install path under $HOME/.local/ into your $PATH yourself.
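For anyone who wants to check this on their own machine, here is a small sketch that reports whether the per-user scripts directory is on $PATH. The helper names are made up for illustration, and it assumes a POSIX layout where scripts live in `bin` under the user base (typically `$HOME/.local` on Linux).

```python
import os
import site


def user_scripts_dir():
    """Directory where `pip install --user` puts scripts (POSIX layout assumed)."""
    return os.path.join(site.getuserbase(), "bin")


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries in the PATH string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.realpath(directory)
    # Compare resolved paths so symlinks and trailing slashes do not matter.
    return any(
        os.path.realpath(entry) == target
        for entry in path_value.split(os.pathsep)
        if entry
    )
```

Calling `is_on_path(user_scripts_dir())` returns False when the --user install location is missing from $PATH, which is the situation described above.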

@jhaiduce
Contributor Author

jhaiduce commented Nov 5, 2020

@kinow, @hjoliver - I could certainly update $PATH, either directly or using venv/bin/activate. It is certainly not difficult to do. I just tried it to make sure, and it indeed allows cylc to work correctly.

Thanks for hearing me out on this one. Here's a (somewhat long-winded) answer to why I think it's better that a package such as cylc not rely on having its entry points in $PATH. It basically boils down to three factors: 1. Not all users 'activate' their venvs as a matter of course, and those who don't will be surprised that cylc expects it. 2. The venv documentation says that things should still work if you don't activate a venv. 3. There are some (admittedly uncommon) situations where reliance on $PATH can cause bugs even with the cylc executables accessible through $PATH.

To elaborate on the first point, my own habit is to skip the activate step and either use relative paths into the virtualenv directory or use full paths with the help of an environment variable pointing to the venv directory. I've seen similar usage patterns in tutorials such as this one, so I'm reasonably confident that a substantial subset of Python users out there is using venvs without activating them. Cylc is the first package I've encountered so far that did not work without activating the venv first, so my initial reaction was to perceive the behavior as a bug. If cylc doesn't rely on $PATH to find its components, that will shorten the learning curve for new users (such as myself) who don't usually activate their venvs.

As for the second factor, the documentation of venv includes the following:

You don’t specifically need to activate an environment; activation just prepends the virtual environment’s binary directory to your path, so that “python” invokes the virtual environment’s Python interpreter and you can run installed scripts without having to use their full path. However, all scripts installed in a virtual environment should be runnable without activating it, and run with the virtual environment’s Python automatically.
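As a rough sketch of what "activation just prepends the binary directory to your path" amounts to, the following builds an environment mapping for a child process with a venv's script directory placed first. The function name is hypothetical, and this only mimics the $PATH and VIRTUAL_ENV part of activation, not everything the activate script does.

```python
import os


def env_with_venv_on_path(venv_dir, base_env=None):
    """Return a copy of the environment with the venv's bin dir first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    # venv uses "Scripts" on Windows and "bin" elsewhere.
    bin_dir = os.path.join(venv_dir, "Scripts" if os.name == "nt" else "bin")
    old_path = env.get("PATH", "")
    # Avoid a trailing separator, which POSIX shells treat as the current directory.
    env["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    env["VIRTUAL_ENV"] = venv_dir
    return env
```

A caller that launches tools from a venv without activating it could pass the result as the `env` argument of `subprocess.run`, ideally together with the full path to the executable, for example one found with `shutil.which("cylc", path=env["PATH"])`.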

The third point is that relying on $PATH to find the internal components of a package can result in other bugs in certain circumstances, even if the initial entry point is in $PATH. For instance, if a user created a script or program that was coincidentally named the same as one of cylc's entry points, and that script or program was located earlier in the user's $PATH, then cylc would run that script or program rather than its own entry point. This is admittedly not very probable, but not entirely out of the realm of possibility. For instance, a user might create a custom script to run checks on their suite RC file and coincidentally call it cylc-validate, not realizing that cylc would then execute it by accident. If the user also happened to have '.' in their $PATH (which is widely discouraged, but nonetheless practiced by some), then the chances of a clash of this sort increase somewhat.
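The clash can be reproduced with a short sketch like the one below. It uses `shutil.which` as a stand-in for a $PATH search (I am not claiming this is the call cylc makes), and the helper names and the assumption of a POSIX system are mine.

```python
import os
import shutil
import stat
from pathlib import Path


def make_script(directory, name, body="exit 0"):
    """Create a small executable shell script and return its path."""
    path = Path(directory) / name
    path.write_text("#!/bin/sh\n" + body + "\n")
    # Add execute permission for the owner so the PATH search accepts it.
    path.chmod(path.stat().st_mode | stat.S_IXUSR)
    return path


def first_match(name, directories):
    """Return the executable a PATH-style search finds first, or None."""
    search_path = os.pathsep.join(str(d) for d in directories)
    return shutil.which(name, path=search_path)
```

If a user's own `cylc-validate` sits in a directory listed before the cylc install directory, `first_match("cylc-validate", [user_dir, cylc_dir])` returns the user's script, which is exactly the accidental override described above.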

All of that said, I acknowledge that avoiding reliance on $PATH may be more difficult with cylc than it is with other packages, given the need to run tasks on remote hosts that also have cylc installed. If you decide to close this with a WONTFIX or similar I won't have hard feelings about it.

@kinow
Member

kinow commented Nov 5, 2020

All of that said, I acknowledge that avoiding reliance on $PATH may be more difficult with cylc than it is with other packages, given the need to run tasks on remote hosts that also have cylc installed. If you decide to close this with a WONTFIX or similar I won't have hard feelings about it.

Hi @jhaiduce thanks a lot for the detailed response (again)!

Now I'm not sure whether this should be closed or not. I am used to always activating environments, so I was probably biased in suggesting that you did that to fix the issue.

But from your explanation, it sounds like this could happen in other cases. There could be some Python third party application that invokes workflow managers, for instance, and that doesn't activate environments. A contrived case, but if that happened they could be using the same argument, and I think that's a valid point.

If your solution fixes this issue, without causing any regression in Cylc, I'd be +1 for that, though @oliver-sanders & @hjoliver know a lot more about Cylc internals, and their use cases, deployment, past issues, etc. So I'll leave it up to them to decide.

Thanks a lot for the PR's/issues, and for your patience too.

Cheers, Bruno

@jhaiduce
Contributor Author

jhaiduce commented Nov 5, 2020

If your solution fixes this issue, without causing any regression in Cylc, I'd be +1 for that, though @oliver-sanders & @hjoliver know a lot more about Cylc internals, and their use cases, deployment, past issues, etc. So I'll leave it up to them to decide.

Thanks @kinow. My draft PR #3920 obviously caused some regressions, but in the meantime PR #3899 has been merged and as far as I can tell it addresses my concerns. (Also thanks to @wxtim for working on that one)

@jhaiduce jhaiduce closed this as completed Nov 6, 2020