Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for imports relative to current location #220

Merged
merged 5 commits into from
Jul 27, 2018

Conversation

geoffjentry
Copy link
Member

Feedback I've received from out in the wild was that this was a massive, massive pain point.

Declare as part of the spec that imports which specify a relative path are relative to the location of the WDL document issuing the import statement.

@yfarjoun
Copy link

yfarjoun commented Jun 5, 2018

If I understand correctly this could solve a huge problem that we currently have with imports!

@@ -760,7 +760,9 @@ Engines should at the very least support the following protocols for import URIs

* `http://` and `https://`
* `file://`
* no protocol (which should be interpreted as `file://`
* No protocol (see below)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably bump the minor version

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of the WDL spec? this is currently against versions/development

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same question @yfarjoun - regardless, yes this would be for WDL 1.1.

* no protocol (which should be interpreted as `file://`
* No protocol (see below)

In the event that there is no protocol, if this is an absolute path (i.e. starts with `/`) it is interpreted as a `file://`. Otherwise it is **relative** to the location of the current document. For instance, if the current doc is at `file://foo/bar/baz/qux.wdl` and specificies `import some/task.wdl` this will be resolved to `file://foo/bar/baz/some/task.wdl`. Likewise if the current doc is at `http://www.github.com/openwdl/coolwdls/myWorkflow.wdl` and imports `subworkflow.wdl`, it will be resolved to `http://www.github.com/openwdl/coolwdls/subworkflow.wdl`. It is up to the implementation to provide a mechanism which will allow these imports to be resolved correctly.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about import ../../quux/quuz.wdl? allowed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO as long as that makes sense relative to the "current" URL, I don't see why not (although I would personally still like some scheme like #214 to indicate how to get back to the "import root" without resorting to long trails of ../)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yfarjoun So @cjllanwarne and I have talked about this a bit, and a similar discussion was had in Toronto more generally among Cloud Workstream members (so applying to things beyond WDL as well). The sentiment from the latter (shared by myself) is that people shouldn't use ../ but they're going to so don't explicitly disallow it. IOW make it one of those best practices/linter type of things.

Why shouldn't they? It assumes POSIX-like file structures, and that might fall apart at some point in some arena. However, for practical purposes it will work, so ....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geoffjentry I agree that we cannot explicitly restrict the use of ../ paths. as annoying as that is. People will undoubetly use it, but I like the approach you are taking. we will support but only grudgingly. We can clearly describe in best practices what way users SHOULD do this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However I do think that that should belong not in the spec but the proverbial best practices doc(s) we keep handwaving at

* no protocol (which should be interpreted as `file://`
* No protocol (see below)

In the event that there is no protocol, if this is an absolute path (i.e. starts with `/`) it is interpreted as a `file://`. Otherwise it is **relative** to the location of the current document. For instance, if the current doc is at `file://foo/bar/baz/qux.wdl` and specificies `import some/task.wdl` this will be resolved to `file://foo/bar/baz/some/task.wdl`. Likewise if the current doc is at `http://www.github.com/openwdl/coolwdls/myWorkflow.wdl` and imports `subworkflow.wdl`, it will be resolved to `http://www.github.com/openwdl/coolwdls/subworkflow.wdl`. It is up to the implementation to provide a mechanism which will allow these imports to be resolved correctly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got to believe there's a more sensible / for HTTP?

eg if I were to say import /chrisl/coolwdl/task.wdl from http://www.github.com/openwdl/coolwdls/myWorkflow.wdl I might expect that to look for http://www.github.com/chrisl/coolwdl/task.wdl rather than on the local FS

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that a bit. My objection is that in file system semantics /xyz implies an absolute path. Yes, I did just complain about people assuming filesystem semantics above, but this feels like a case where more people will be bit by it than helped.

OTOH I do like the simplicity of what you describe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjllanwarne although what you describe is simple it seems like you could quickly run into a frustrating situation where people have a hodge podge of imports and nested calls that cannot be reasily imported. For example if a restapi exposes a wdl at:
http://www.wdl-host.com/public/my-workflow.wdl
and that wdl imports something from /...., it becomes a bit ambigious as to what the / relative path means. when the workflow was written and then exposed by a restapi it may not have known anything about where it was going to be hosted. Or even if it was going to be hosted at all.

Personally I do not really like any paths that start with / but really everything should be relative to the current WDL (other then http/ftp`) calls. The moment we start allowing absolute paths everything becomes alot harder and is more subject to breaking because of permissions / moved or deleted file

@patmagee
Copy link
Member

patmagee commented Jun 6, 2018

Continuing my comment from above, are absolute paths things that people will really use in the wild? using absolute paths in the context of an execution engine running on a local backend makes sense, to some degree, especially if the user running things controls everything.

The moment the workflow moves away from being in a local setup, however, absolute paths bring a large number of problems, the first of course being absolute to what? absolute to the file system, absolute to the HTTP host, absolute within a zip file. This is all very contextual and seems to introduce a lot of places for failure, whereas always assuming relative paths to the current context would fix this.

For example, allowing absolute paths would allow users to define imports according to their absolute file structure which is not very resuable at all. Relative paths can still lead to this but its more flexible

@LeeTL1220
Copy link

Hello @geoffjentry, @yfarjoun and @cjllanwarne If I may chime in...

One quick point:

  • In the GATK4 repo WDL, we have a real use for ../ for our common tasks. Only by convention do we not use it now.

@patmagee
Copy link
Member

patmagee commented Jun 6, 2018

@LeeTL1220 is there a situation where you would use / as opposed to ../?

@LeeTL1220
Copy link

@patmagee Currently, no.

@geoffjentry
Copy link
Member Author

@patmagee My main concern with explicitly disallowing absolute paths is that it'd be a breaking change as they're explicitly allowed currently. Specifying an import w/o a schema has been specified on becoming file:// and when the host is not included on a file URL the extra slash (ie absolute path) is not optional.

Much like the use of .. I'd prefer to put this in the "please don't do this, but ...." bundle and then look to remove it when we're talking about 2.0 instead of 1.1 (yes, I'm invoking semver based laziness). I suspect in practice it's just not that big of a deal.

@cjllanwarne
Copy link
Contributor

I think it's fairly standard in HTTP to allow / to mean the root of the host site (presumably that's easy enough to test...)

I don't see that as any better or worse than allowing it to mean "relative to my personal FS root" tbh (and maybe a little better if you're doing relative imports within the same site and things are relatively stable - importing workflows between projects on github, for example)

@cjllanwarne
Copy link
Contributor

cjllanwarne commented Jun 12, 2018

For example:

@geoffjentry
Copy link
Member Author

@cjllanwarne "The previous page is sending you to an invalid url (http:////pull/220)."

@cjllanwarne
Copy link
Contributor

@geoffjentry are you opening that from the github page or from an embedded link in an email (which has no URL context?)

@geoffjentry
Copy link
Member Author

@cjllanwarne Good point. And I think you've brought me around on this.

@geoffjentry
Copy link
Member Author

@cjllanwarne @patmagee I made some modifications based on the point Chris was making (which he swung me over to). I still don't think it's the best thing to encourage, but that's another matter altogether IMO.

As Chris said in person, getting the wording correct here will be the tricky part, but at least this is a first stab at it.

@geoffjentry
Copy link
Member Author

Made a few edits based on a conversation with @cjllanwarne - if I don't get any pushback on what's here now I'll put it up for a vote tomorrow

@geoffjentry geoffjentry added the voting active Changes for which voting is active. label Jun 18, 2018
@orodeh
Copy link
Contributor

orodeh commented Jun 18, 2018

👍

7 similar comments
@geetduggal
Copy link

👍

@vdauwera
Copy link

👍

@rhpvorderman
Copy link
Contributor

👍

@DavyCats
Copy link
Contributor

👍

@ffinfo
Copy link
Contributor

ffinfo commented Jun 19, 2018

👍

@patmagee
Copy link
Member

👍

@cjllanwarne
Copy link
Contributor

👍

@aprabhak2
Copy link

👍

@geoffjentry
Copy link
Member Author

By unanimous vote, this passes and is waiting for an implementation

@cjllanwarne
Copy link
Contributor

Now available in Cromwell's develop branch (thanks to broadinstitute/cromwell#3916).

Since the WDL 1.0 spec was not opinionated on this matter I've also allowed this import style in WDL 1.0 workflows.

@geoffjentry geoffjentry removed the awaiting implementation Changes that are awaiting implementation. label Jul 27, 2018
@geoffjentry geoffjentry merged commit f3985f9 into master Jul 27, 2018
@geoffjentry geoffjentry deleted the jg_relative_imports branch July 27, 2018 15:05
@patmagee patmagee added this to the v2.0 milestone Nov 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.