Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output schema documentation as Markdown #1381

Merged
merged 12 commits into from
Mar 11, 2022
Merged

Output schema documentation as Markdown #1381

merged 12 commits into from
Mar 11, 2022

Conversation

krokicki
Copy link
Contributor

@krokicki krokicki commented Jan 11, 2022

Per discussion in Slack and partially addressing #741, I added a nf-core schema docs command which takes any nextflow_schema.json and generates Markdown documentation for it.

PR checklist

  • This comment contains a description of changes (with reason)
  • CHANGELOG.md is updated
  • If you've fixed a bug or added code that should be tested, add tests!
  • Documentation in docs is updated

Example Output (eager pipeline)

nf-core/eager pipeline parameters

A fully reproducible and state-of-the-art ancient DNA analysis pipeline

Input/output options

Define where the pipeline should find input data, and additional metadata.

Parameter Description Type Default Required Hidden
input Either paths or URLs to FASTQ/BAM data (must be surrounded with quotes). For paired end data, the path must use '{1,2}' notation to specify read pairs. Alternatively, a path to a TSV file (ending .tsv) containing file paths and sequencing/sample metadata. Allows for merging of multiple lanes/libraries/samples. Please see documentation for template. string
udg_type Specifies whether you have UDG treated libraries. Set to 'half' for partial treatment, or 'full' for UDG. If not set, libraries are assumed to have no UDG treatment ('none'). Not required for TSV input. string none
single_stranded Specifies that libraries are single stranded. Always affects MALTExtract but will be ignored by pileupCaller with TSV input. Not required for TSV input. boolean
single_end Specifies that the input is single end reads. Not required for TSV input. boolean
colour_chemistry Specifies which Illumina sequencing chemistry was used. Used to inform whether to poly-G trim if turned on (see below). Not required for TSV input. Options: 2, 4. integer 4
bam Specifies that the input is in BAM format. Not required for TSV input. boolean

@codecov
Copy link

codecov bot commented Jan 11, 2022

Codecov Report

Merging #1381 (d313939) into dev (389729c) will increase coverage by 0.30%.
The diff coverage is 64.28%.

❗ Current head d313939 differs from pull request most recent head 051d134. Consider uploading reports for the commit 051d134 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##              dev    #1381      +/-   ##
==========================================
+ Coverage   66.72%   67.02%   +0.30%     
==========================================
  Files          50       50              
  Lines        5692     5735      +43     
==========================================
+ Hits         3798     3844      +46     
+ Misses       1894     1891       -3     
Impacted Files Coverage Δ
nf_core/__main__.py 54.82% <33.33%> (+2.09%) ⬆️
nf_core/schema.py 81.36% <81.48%> (+<0.01%) ⬆️
nf_core/modules/lint/module_version.py 80.76% <0.00%> (-3.85%) ⬇️
nf_core/modules/lint/__init__.py 80.82% <0.00%> (-1.37%) ⬇️
nf_core/modules/create.py 80.10% <0.00%> (-0.43%) ⬇️
nf_core/modules/bump_versions.py 67.04% <0.00%> (+0.37%) ⬆️
nf_core/modules/module_utils.py 39.44% <0.00%> (+0.92%) ⬆️
nf_core/download.py 55.44% <0.00%> (+0.96%) ⬆️
nf_core/utils.py 84.50% <0.00%> (+1.00%) ⬆️
nf_core/modules/lint/main_nf.py 81.57% <0.00%> (+1.46%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5b32ed1...051d134. Read the comment docs.

Copy link
Member

@ewels ewels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! Just tested locally and worked a treat.

Couple of minor requests:

  • Can we get the resulting markdown to pass markdownlint if possible? Or at least these two rules:
    • Headings should be surrounded by blank lines
    • There are some trailing spaces after the final column title: Hidden
  • Might be nice to use inline code backticks a bit more liberally around parameter names and types?

@krokicki
Copy link
Contributor Author

Great suggestions, @ewels! The latest commit should fix those.

@mashehu
Copy link
Contributor

mashehu commented Jan 11, 2022

Nice! 😍
Haven't tested it yet, but to make it closer to the the other command names, could we maybe use nf-core schema export?

@ewels
Copy link
Member

ewels commented Jan 11, 2022

Haven't tested it yet, but to make it closer to the the other command names, could we maybe use nf-core schema export?

But it's not really exporting the schema? It's generating prose documentation from it. So I quite like the current command personally..

@drpatelh
Copy link
Member

drpatelh commented Jan 11, 2022

I guess nf-core schema docs could also be read as generating docs for the schema and not from it. I prefer nf-core schema export because it's quite a common operation performed on other file types/formats when converting from one type to another.

Copy link
Member

@ewels ewels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Tried again and working really nicely now.

Two remaining things I've noticed on this round:

  • The current code will ignore any parameters not within a definition.
    • For nf-core pipelines, everything should be here as we lint for it (I think?). But ungrouped parameters from nf-core schema build will just be in the top level object and need to be handled separately.
    • You can see this logic in various bits of code that work with the schema, for example the website docs code.
  • We're missing some schema fields
    • Most notable is help_text. I appreciate that this is difficult to render in the markdown. Maybe there could be a section at the bottom of the document with all of the help text blocks and the table parameter name could link to the correct heading?
    • Also missing some of the more specialised fields, such as pattern, enum, min, max. Again might make the table too verbose, so not sure what the best thing is to do about these.

Minor things I wondered:

  • Would be nice if we could specify the name / path to a pipeline as well as / instead of a path to a schema file. eg. nf-core schema docs rnaseq, nf-core schema docs nf-core/rnaseq and so on. Some other commands work in this way I think so can hopefully reuse that code.
  • Might be nice to have a built-in functionality to output HTML snippets as well. It's pretty simple, see for example the pipeline markdown_to_html.py scripts. Would then need an additional flag for --format (which can default to markdown).

The direction I was thinking of going in was to have the option to build HTML outputs using a Jinja2 template (with a default supplied or a user-supplied one optionally). Then we could get quite fancy with styling the output and essentially replicate a white-label version of the nf-core website if we want. But this starts to get substantially more complicated and may not be worth the effort. So I'm certainly happy with merge without any of that, can always add it at a later date.

@mashehu
Copy link
Contributor

mashehu commented Jan 12, 2022

To make it easier to work out of the box and make it again a bit more similar to the other nf-core commands, could we make nextflow_schema.json as the default value <pipeline schema>? If somebody would like a different schema they can still specify it, but this is the main schema file name for all nf-core pipelines.

For the help_text field: Maybe adding it inside in a details tag, so it is collapsed on default would be an option? Eager is my go-to pipeline with a lot of markdown formatting inside the help text, to test the markdown inside markdown rendering.

Coming back to nf-core schema export: I like that this would make it quite verbal if we have more output formats, e.g. nf-core schema export --markdown nf-core schema export --html (plus I still find it too inconsistent to have a noun for a sub-command when all the other sub-commands for schema and modules are verbs (or verb_noun)).

@krokicki
Copy link
Contributor Author

@ewels @mashehu Those suggestions sound good but I'm probably not the right person to attempt them right now, given that I haven't actually worked with nf-core pipelines before and I'm not familiar with any of those special constructs.

I'll circle back to this after I have more experience, if no one else has picked it up -- I'm going to attempt to convert some of our Nextflow image analysis pipelines to nf-core at some point soon.

@ewels
Copy link
Member

ewels commented Jan 21, 2022

Just made a start on some of the simpler bits here:

  • Minor code refactoring
  • Added check for existing file, plus --force flag
  • Renamed --markdown to more generic --output
  • Added option to output HTML as well as markdown, new --format flag

I haven't attempted any of the more difficult / important bits yet 😅 I will do if I get time.

@ewels
Copy link
Member

ewels commented Mar 11, 2022

Ok, I think that this is good to merge now. Just added a couple more bits:

  • Render ungrouped parameters
  • Use nextflow_schema.json as a default file path (not as good as taking eg. nf-core/rnaseq but it'll do for now).

There's certainly more that we can do with this, but for now I think it's fine to merge it in and let people play with it. Can add new features to it if there's demand..

Many thanks @krokicki!

Phil

@ewels
Copy link
Member

ewels commented Mar 11, 2022

Failing tests are unrelated to this PR, see #1436 👉🏻 merging anyway.

@ewels ewels merged commit ece5cbf into nf-core:dev Mar 11, 2022
@krokicki krokicki deleted the dev branch March 23, 2022 15:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants