Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Override runtime parameters at runtime #298

Closed
illusional opened this issue Mar 27, 2019 · 10 comments
Closed

Override runtime parameters at runtime #298

illusional opened this issue Mar 27, 2019 · 10 comments

Comments

@illusional
Copy link
Contributor

Is there a way to override runtime parameters at runtime, except for explicitly porting them from the inputs?

eg:

{ "myWorkflow.myTask.runtime.disks": "100 GB" }

Or if I did explicitly port them, does (/ should) a value of None in the runtime be override by a backend's default.

@geoffjentry
Copy link
Member

@illusional One of the many, many, many issues I have with runtime is that there's no way to officially do this. It's possible to wire in an input there as any valid WDL expression is viable there, but that requires the WDL to be written that way. You can't do it the way you're suggesting.

One of the other many, many, many issues I have with this is that not everyone sees the current situation as bad :)

I'd personally wholeheartedly support suggested changes to the spec which start to overhaul runtime

@illusional
Copy link
Contributor Author

illusional commented Mar 29, 2019

Can I suggest Specifying / overriding runtime attributes from JSON as a feature to be added to the openwdl spec. Something similar to Specifying Workflow Inputs in JSON, and would follow the same resolution order as #141 proposes (backed up by broadinstitute/cromwell#2446).

Maybe something like:


Specifying / overriding runtime attributes from JSON

Workflow runtime attributes may additionally be specified as key/value pairs within the JSON input. The mapping from JSON or YAML values to WDL values is codified codified similar to the serialization of task inputs section.

In JSON, the user may be able to specify task-specific runtime attributes similar to the following:

{
  "wf.t1.runtime.memory": "16 GB",
  "wf.t2.runtime.cpu": 2,
  "wf.t2.runtime.disks": "100 GB",
  "wf.t3.runtime.arbitrary_key": ["arbitrary", "value"]
}

As the runtime section consists of key/value pairs, it's up to user to ensure they provide the correct coercible type for the backend they are targeting. See the section on Type Coercion for more details.

Resolution Order

Similar to how inputs are resolved, these runtime parameters are resolved in the following order:

  1. Inputs provided by the inputs JSON
  2. The value provided inline on the task, whether that's an expression or literal value.

Edit: cpus to cpu, because the latter is a recognised keyword.

@rhpvorderman
Copy link
Contributor

One of the other many, many, many issues I have with this is that not everyone sees the current situation as bad :)

@geoffjentry I think the current situation with runtime makes it a liability in a cluster environment.
@illusional great addition I would support this if it were a pull request to the spec.

@DavyCats
Copy link
Contributor

@rhpvorderman Could you elaborate on how the current runtime is a liability?

Just throwing this out there, but if runtime is going to be reworked it might also be good to consider if there are any more attributes (besides memory and docker) that should be standardized (eg, disk space, duration, etc.). Probably something for a separate PR though.

@geoffjentry
Copy link
Member

@illusional I like this idea. As with @rhpvorderman if there were a PR against the spec to make this modification I'd support it.

@rhpvorderman I agree but echo @DavyCats in that I'd be curious to hear your specific thoughts (this might not be the right place for that, but 🤷‍♂️ )

@DavyCats I agree, but this is definitely not the right place for that :) Also I've learned a few times over that trying to boil the ocean on fixing runtime isn't going to work, so doing it piecemeal will get more traction.

@rhpvorderman
Copy link
Contributor

rhpvorderman commented Apr 1, 2019

@DavyCats @geoffjentry
Okay going a bit offtopic here, but I do not like the current runtime because:

  • Runtime values cannot be overridden but have to be fed via inputs. I think this is bad design of the current implementation. The runtime values can be specific for a certain environment and should be overridden. This especially important in cluster environments (hence the liability).
  • Runtime values cannot scale with the size of inputs. (We might need to implement something like filesize(File) or dirsize(path) for that to work). BWA and Centrifuge come to mind, where the amount of memory needed is the space taken up by the index. The rest of the stuff does not take much memory at all. This already exists. Whoops!

But once these issues addressed I think the runtime section will be one of the knights in shining armor defending computational biologists from mismanagement of recourses.

I am very glad @illusional has addressed the first issue.

@geoffjentry
Copy link
Member

@rhpvorderman doesn't this do what your'e describing i the second point?

@patmagee
Copy link
Member

patmagee commented Apr 1, 2019

@rhpvorderman it would certainly be helpful for sizing purposes to have some sort of size_all() function, that would compute the total size of all the inputs to the task. I thought we had implemented that at some point, but maybe I am wrong. Other then that though, scaling of runtime values really seems to be a non-trivial task that can only be solved with expressions and custom code. Since there is no universal translation of X file size means scale task by a factor of Y

@rhpvorderman
Copy link
Contributor

@rhpvorderman doesn't this do what your'e describing i the second point?

Yes. Sorry. Did not know this existed. Thanks a lot @geoffjentry !

@patmagee
Copy link
Member

Implemented by #315

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants