-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow for more than one container image #58
Conversation
Question for when I'm back, is this backwards compatible? i.e. are the swagger code clients or the Python (bravado?) client that @jaeddy uses resilient to this change? If so, I'm inclined to approve this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One second thought, both values are optional anyway.
I think this is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@denis-yuen in our specification we are using three metrics:
full_tag: quay.io/seqware/seqware_full/1.1
size: Size of the container in bytes
updated: Last time the container was updated.
type: 'Docker' or 'Singularity' or 'Conda'
An example here: http://api.biocontainers.pro/api/v2/tools/sc3-scripts/versions
@ypriverol @susheel |
bumping @ypriverol @susheel |
NB getting the size of a docker container isn't particularly straightforward |
Thanks for the use case @geoffjentry (via GA4Gh chat). My opinion is that such workflow specific data is best left out of the specification. I am biased towards lean specifications to encourage adoption and reduce maintenance. To your point re: multiple API calls for the docker images, once the workflow has been sucked in, the recipient is free to parse the workflow and - if that's how it works - pull all the docker images in at once. I did not understand the part about extra API calls because in my mental model the client is going to pull the workflow document anyway, so it will get this information one way or the other. Thanks! |
For example in our case, we don't actually have this info, hence why I'd prefer it to be optional if anything. |
Driver Project representatives were emailed on 2019-03-08 for NCI GDC, ENA/EVA/EGA, Genomics England, CanDIG, TopMed with request for comments and votes by registered 2019-03-18. HCA input provided via Brian, All Of Us from David Glazer, ELIXIR Cloud from Susheel. |
Allowing for more than one container image will be useful for CanDIG as we plan on leveraging both docker and singularity containers. |
@rdeborj OOI and largely off topic - are you talking about using native singularity containers or singularity to run docker containers? It sounds like the former. Not asking as pushback in any way, just that I haven't come across too many people in the bioinfx world looking to do the former and so when I do I want to make sure I understand their use case more fully. I'm also happy to take this offline (jgentry at broadinstitute dot org) |
Interesting, I had maybe assumed this as something like "my workflow uses n containers rather than 1 container, you can pre-load/cache them all if you want" as opposed to "my workflow can use container X but it may also use container Y as an alternative" @ypriverol could you confirm which use-case you had in mind? |
@rdeborj how will the docker images be mapped for equivalency? A workflow will tend to have multiple images. I still think this is an app property rather than anything else. So, perhaps your runner is able to use multiple images because of the root perms issue (singularity) or because you've found it better for resiliency to rely on multiple docker repositories but then this is more cleanly described - if we consider CWL - in an extension tag that says, for each tool,
|
@geoffjentry I was thinking native singularity usage. Our group out in Vancouver uses singularity on their HPC and the request would satisfy their use. Docker has been a hard no in their environment. @denis-yuen good point, I just assumed most workflows would use more than 1 container and that the questions at hand was based on different supported container types. @kaushik-work can you clarify what you mean by "mapped for equivalency"? |
@rdeborj That's why I was asking. A lot of folks use singularity to run docker containers in those situations, or they would if they realized that that works. OTOH if they're already using native Singularity containers, c'est la vie |
@rdeborj sorry for not being clear: A workflow will contain multiple tools. Each tool will have a set of equivalent docker images. Presumably the tool description will itself have to carry this information. If we duplicate this information in the metadata returned by the API then we should not have just a flat list of images, but possibly mirror the entire data in the workflow/tools, since the images have particular meanings. So If as @geoffjentry suggested this is just to prefetch all images as a pile and sort them out later once the workflow document is received, this organization does not matter, but I'm still not clear on why we need this, since we get this from the workflow document anyway :(. |
I'm happy with the current proposal @denis-yuen We can make optional the tag, size and lat_update |
@kaushik-work
I would agree that it is essentially redundant if the user (code) understands the workflow language which is why I don't want to make it required. However, as with #59 I think part of the desire here is that the user might not understand the language. For example, in that case, the workflow testbed needed different behaviour to send CWL draft-2 workflows to some WES endpoints and CWL 1.0 workflows to others (or it might be WDL, I may be misremembering). @geoffjentry @rdeborj @ypriverol |
@denis-yuen I've said a few times that the suggestions @cjllanwarne made were in the vein of trying to align with a similar-but-not-the same sort of project and thus if they didn't seem to make sense for TRS to feel free to 86 them. So no worries on this side. |
@denis-yuen thanks for the clarification! I think it will work in some cases but I've seen several WDL workflows where the docker image is a variable - i.e. injected at run time, so in the end the client can not rely on this list at least for WDL (for CWL a simple parse of the yaml will yield this information easily). |
Changes from 2.0.0 + CI fix: * Explicitly define type in definitions * Change properties to snakes case and endpoints to camelcase * Move enum to its own definition * Incrementing version due to breaking changes * New endpoint and object definition to describe available files * For issues #21 and #22 * Renamed File definition to ToolFile due to swagger editor * Increment version in comment * Add more file types, better descriptions * Tool checker proposal (#26) From https://docs.google.com/document/d/1PTge27WBOKCiR2MkVSOvWKUwnFk0kx-bjq39aKRQxcY/edit?usp=sharing * Generalize the TRS to account for Singularity and Nextflow * Change toolversion's boolean indicator * Add terms of service placeholder * Install swagger2openapi * Adding operationIds (#33) See dockstore/dockstore#1210 * Moved checker to a property in Tool, also added a boolean * Specify unencoded paths are also allowed * Feature/search (#48) * Add option for search for checker workflows * For #37 , allow meta_version to be optional ga4gh/workflow-execution-service-schemas#6 ga4gh/workflow-execution-service-schemas@337cd42 * Add alias suggestion from #38 * Standardize file fields * Simplify standard (#49) * Feature/link check (#55) * Create CONTRIBUTING.md (#65) * Automatically generate docs for all branches and tags * TRS documentation update (#64) * Update openAPI3.sh * Update BasePath (#72) Updated base path to `basePath: `/ga4gh/trs/v2` * Update ga4gh-tool-discovery.yaml (#74) * Updated LICENCE Copyright (#76) * Include mention to Nextflow (#77) Mentioning nextflow as long the API is supporting workflows based on Nextflow language (NFL) https://github.com/pditommaso/tool-registry-service-schemas/blob/develop/src/main/resources/swagger/openapi.yaml#L741-L744 * Allow for more than one container image (#58) * fix type, update OpenAPI version * fix array examples * Update openapi.yaml * Rename metadata to match GA4GH PAP (#78) * Adding zenodo reference as recommended * Adding contact URL inspired by WES for PAP From security questionaire * More convention sync with cloud workstream APIs (#84) * Working on #82 also see ga4gh/workflow-execution-service-schemas#134 * Working on harmonizing with GA4GH Discovery https://github.com/ga4gh-discovery/service-info * Fix description for #85 * WS-2019-0032 WS-2019-0063 * Attempt to fix tooling (#89) * Update README.md for gitflow (#91) * Simple PRC feedback PR (#92) * Feedback from BOSC * Typo in example (#99) * Remove service-info to avoid breaking change (#101) * Move around a few fields (#97) * test gradle fix for site deployment (#105) * Optional auth (#102) * PRC feedback - strict tool versioning for reproducibility (#93) * Update to match discussion in DRS checksum object added and tweaked to match ga4gh/data-repository-service-schemas#282 * Corrects link to GA4GH . checksum hash algorithm registry (#106) * Correcting hash algorithm registry link * Correcting hash alg registry link * Fixing unrelated link * Glitchy badge (#112) * Update registry.json (#115) * Fix CI (#123) Co-authored-by: Denis Yuen <denis.yuen@oicr.on.ca> Co-authored-by: Yasset Perez-Riverol <ypriverol@gmail.com> Co-authored-by: Chris Llanwarne <cjllanwarne@users.noreply.github.com> Co-authored-by: Susheel Varma <susheel@users.noreply.github.com> Co-authored-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com> Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
No description provided.