Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add quartonotebook module #4876

Merged
merged 29 commits into from
Feb 9, 2024
Merged

Conversation

fasterius
Copy link
Contributor

@fasterius fasterius commented Feb 7, 2024

This PR adds a new QUARTONOTEBOOK module, which renders parametrised Quarto documents to HTML. This modules re-uses some code from the RMARKDOWNNOTEBOOK and JUPYTERNOTEBOOK modules, namely part of the code that parametrizes the reports. It re-uses some of the test datasets from those modules, but also adds its own test data; Quarto can render R Markdown as well as Jupyter notebooks.

A difference between this module and its notebook-relatives is how the parametrisation is done. This module does not output a parametrised report with the relevant parameters changed from the default, but rather outputs a parametrised params.yml file, which is equivalent to the hidden .params.yml in the other modules. The main reason for this is that Quarto is built with multi-language support (Python, R, Julia and Observable, of which the former two are tested for) and has an option (--execute-params) that allows taking parameters from a YAML file in a language-agnostic way using papermill (JUPYTERNOTEBOOK also uses papermill). The way that the other modules do it yields a preferable output from a user-perspective, but is harder to achieve without language-specific code - I did attempt this, but failed, so if anybody has any good ideas on how to achieve this I'm interested!

PR checklist

Closes #4812

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR. (Add test dataset for new QUARTONOTEBOOK module test-datasets#1089)
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Ensure that the test works with either Docker / Singularity.

Do not specify using the AMD64 architecture for the docker test profile,
as this leads to problems with Pandoc (which Quarto uses) when emulating
AMD64 on ARM64 systems. The docker images for this module can be built
on both architectures, so always specifying one or the other is not
necessary.
This reverts commit 839cae0.

Getting PDF output to work turned out to be problematic due to issues
with (1) differences between TinyTeX installations on AMD64/ARM64
architectures; and (2) getting Pandoc to work properly inside the docker
containers. This might be solvable with a lot more work and
troubleshooting, but removing the PDF-functionality for now since HTML
is the output type expected to be the one desired by a vast majority of
the envisioned module audience; the related RMARKDOWNNOTEBOOK and
JUPYTERNOTEBOOK modules currently only support HTML output.

Another possible solution is to use the new `typst` typesetting system
introduced in Quarto 1.4 instead of *TeX, but this would preclude being
able to use Conda (which currently doesn't have Quarto 1.4).
Disallow using the Conda or Mamba profiles for the QUARTONOTEBOOK
module, as the environment created differs from that created with
containers. The Conda version of Quarto does not work on ARM64
architectures due to Pandoc-related issues, but installing outside Conda
works in a container-context. It is thus impossible to get the same
environment in a container image and using Conda, if compatibility with
both AMD64 and ARM64 architectures is desired (which it is). Hopefully
the issues with Conda will be solved in the future.
@fasterius fasterius marked this pull request as ready for review February 8, 2024 07:38
@fasterius fasterius requested a review from a team as a code owner February 8, 2024 07:38
@fasterius fasterius removed the request for review from a team February 8, 2024 07:38
@fasterius fasterius added new module Adding a new module Ready for Review labels Feb 8, 2024
Copy link
Member

@grst grst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this! I wish quarto existed when I was working on the jupyternotebook and rmarkdown modules.

Just two minor comments. I also thought there was something broken with the dumping the parameters to yaml in more recent versions of nextflow, but the tests seem to pass.

modules/nf-core/quartonotebook/main.nf Show resolved Hide resolved
Add the `extension` module input that pipelines can use for Quarto
templates. This can be achieved e.g. by adding the `_extensions/`
directory with whatever extensions are desired into a pipeline's
`assets/` directory and creating a value channel like so:
`extensions = Channel.fromPath("[...]/_extensions").collect()`.
@fasterius
Copy link
Contributor Author

As I was working on your comments I realised that I had forgotten to add the ability to use Quarto extensions for the module! I have now added commits for this, as well as outputting the original report to better reflect how RMARKDOWNNOTEBOOK and JUPYTERNOTEBOOK works - this one already outputs the parameter YAML for parametrisation, so it felt natural to also give the original report for users who want to edit it directly.

I have previously made a Quarto nf-core extension, which is here: https://github.com/fasterius/nf-core-quarto-template. We are already using it in the spatialtranscriptomics pipeline.

@fasterius fasterius added this pull request to the merge queue Feb 9, 2024
Merged via the queue into nf-core:master with commit 07ecae3 Feb 9, 2024
10 checks passed
@fasterius fasterius deleted the quartonotebook branch February 9, 2024 14:15
jch-13 pushed a commit to jch-13/modules that referenced this pull request Mar 19, 2024
* Add `main.nf` for quartonotebook

* Add environment files

* Add `meta.yml`

* Temporarily change `test_data_base` to for testing

* Add bare-bones nf-test test

* Abort when running with Conda profile on ARM64

* Add stub test; snapshot all outputs

* Add python, rmd and ipynb tests

* Add notebook parametrization

* Update Conda environment

* Add parametrization tests

* Add note about the container

* Add missing `papermill` mention in tools section

* Fix function names according to nf-core convention

* Add missing `${args}` to Quarto render command

* Change output to `${prefix}.html`

* Also allow PDF output; add PDF tests

* Fix nf-test configs

* Do not specify AMD64 for docker test profile

Do not specify using the AMD64 architecture for the docker test profile,
as this leads to problems with Pandoc (which Quarto uses) when emulating
AMD64 on ARM64 systems. The docker images for this module can be built
on both architectures, so always specifying one or the other is not
necessary.

* Revert "Also allow PDF output; add PDF tests"

This reverts commit 839cae0.

Getting PDF output to work turned out to be problematic due to issues
with (1) differences between TinyTeX installations on AMD64/ARM64
architectures; and (2) getting Pandoc to work properly inside the docker
containers. This might be solvable with a lot more work and
troubleshooting, but removing the PDF-functionality for now since HTML
is the output type expected to be the one desired by a vast majority of
the envisioned module audience; the related RMARKDOWNNOTEBOOK and
JUPYTERNOTEBOOK modules currently only support HTML output.

Another possible solution is to use the new `typst` typesetting system
introduced in Quarto 1.4 instead of *TeX, but this would preclude being
able to use Conda (which currently doesn't have Quarto 1.4).

* Update snapshot

* Disallow using the Conda/Mamba profile

Disallow using the Conda or Mamba profiles for the QUARTONOTEBOOK
module, as the environment created differs from that created with
containers. The Conda version of Quarto does not work on ARM64
architectures due to Pandoc-related issues, but installing outside Conda
works in a container-context. It is thus impossible to get the same
environment in a container image and using Conda, if compatibility with
both AMD64 and ARM64 architectures is desired (which it is). Hopefully
the issues with Conda will be solved in the future.

* Use `nf-core/test-datasets` for test data

* Move XDG variable definition to `main.nf`

* Add `extensions` input for Quarto templates

Add the `extension` module input that pipelines can use for Quarto
templates. This can be achieved e.g. by adding the `_extensions/`
directory with whatever extensions are desired into a pipeline's
`assets/` directory and creating a value channel like so:
`extensions = Channel.fromPath("[...]/_extensions").collect()`.

* Also output the original report

* Update snapshot

* Add note regarding disallowing the Conda profile
jennylsmith pushed a commit to RSC-RP/modules that referenced this pull request Mar 20, 2024
* Add `main.nf` for quartonotebook

* Add environment files

* Add `meta.yml`

* Temporarily change `test_data_base` to for testing

* Add bare-bones nf-test test

* Abort when running with Conda profile on ARM64

* Add stub test; snapshot all outputs

* Add python, rmd and ipynb tests

* Add notebook parametrization

* Update Conda environment

* Add parametrization tests

* Add note about the container

* Add missing `papermill` mention in tools section

* Fix function names according to nf-core convention

* Add missing `${args}` to Quarto render command

* Change output to `${prefix}.html`

* Also allow PDF output; add PDF tests

* Fix nf-test configs

* Do not specify AMD64 for docker test profile

Do not specify using the AMD64 architecture for the docker test profile,
as this leads to problems with Pandoc (which Quarto uses) when emulating
AMD64 on ARM64 systems. The docker images for this module can be built
on both architectures, so always specifying one or the other is not
necessary.

* Revert "Also allow PDF output; add PDF tests"

This reverts commit 839cae0.

Getting PDF output to work turned out to be problematic due to issues
with (1) differences between TinyTeX installations on AMD64/ARM64
architectures; and (2) getting Pandoc to work properly inside the docker
containers. This might be solvable with a lot more work and
troubleshooting, but removing the PDF-functionality for now since HTML
is the output type expected to be the one desired by a vast majority of
the envisioned module audience; the related RMARKDOWNNOTEBOOK and
JUPYTERNOTEBOOK modules currently only support HTML output.

Another possible solution is to use the new `typst` typesetting system
introduced in Quarto 1.4 instead of *TeX, but this would preclude being
able to use Conda (which currently doesn't have Quarto 1.4).

* Update snapshot

* Disallow using the Conda/Mamba profile

Disallow using the Conda or Mamba profiles for the QUARTONOTEBOOK
module, as the environment created differs from that created with
containers. The Conda version of Quarto does not work on ARM64
architectures due to Pandoc-related issues, but installing outside Conda
works in a container-context. It is thus impossible to get the same
environment in a container image and using Conda, if compatibility with
both AMD64 and ARM64 architectures is desired (which it is). Hopefully
the issues with Conda will be solved in the future.

* Use `nf-core/test-datasets` for test data

* Move XDG variable definition to `main.nf`

* Add `extensions` input for Quarto templates

Add the `extension` module input that pipelines can use for Quarto
templates. This can be achieved e.g. by adding the `_extensions/`
directory with whatever extensions are desired into a pipeline's
`assets/` directory and creating a value channel like so:
`extensions = Channel.fromPath("[...]/_extensions").collect()`.

* Also output the original report

* Update snapshot

* Add note regarding disallowing the Conda profile
@edmundmiller
Copy link
Contributor

@fasterius any links to the issue with pandoc on arm64 with amd64 emulation? Trying to use Seqera containers in #5561

@fasterius
Copy link
Contributor Author

@fasterius any links to the issue with pandoc on arm64 with amd64 emulation? Trying to use Seqera containers in #5561

I haven't check into it for a long time now, so don't know if the issue is solved now or not. Does it work using Seqera containers?

@edmundmiller
Copy link
Contributor

edmundmiller commented Nov 18, 2024

It seems to? The tests pass at least. Let me know if there's something else to look for!

If you don't have any objections I'll merge the other PR and we can report if anyone has any issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

new module: quartonotebook
3 participants