Add quartonotebook module #4876

fasterius · 2024-02-07T12:55:42Z

This PR adds a new QUARTONOTEBOOK module, which renders parametrised Quarto documents to HTML. This module re-uses some code from the RMARKDOWNNOTEBOOK and JUPYTERNOTEBOOK modules, namely part of the code that parametrises the reports. It re-uses some of the test datasets from those modules, but also adds its own test data; Quarto can render R Markdown as well as Jupyter notebooks.

One difference between this module and its notebook relatives is how the parametrisation is done. This module does not output a parametrised report with the relevant parameters changed from their defaults, but rather outputs a parametrised params.yml file, which is equivalent to the hidden .params.yml in the other modules. The main reason for this is that Quarto is built with multi-language support (Python, R, Julia and Observable, of which the first two are tested) and has an option (--execute-params) that allows taking parameters from a YAML file in a language-agnostic way using papermill (JUPYTERNOTEBOOK also uses papermill). The way the other modules do it yields a preferable output from a user perspective, but is harder to achieve without language-specific code. I did attempt this, but failed, so if anybody has any good ideas on how to achieve it, I'm interested!
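As a rough illustration of the params.yml approach described above, the Python sketch below writes a flat set of parameters to a YAML file and builds (without running) a `quarto render` command that reads it via `--execute-params`. The function names, parameter names and file names are hypothetical and are not taken from the module, which is written in Nextflow; the sketch also assumes flat scalar parameters with simple keys.

```python
import json
import shlex
import tempfile
from pathlib import Path


def write_params_yaml(params: dict, path: Path) -> Path:
    """Write flat scalar parameters to a YAML file.

    JSON scalars are valid YAML, so json.dumps is used to quote values.
    Assumes simple keys and no nested values (hypothetical helper).
    """
    lines = [f"{key}: {json.dumps(value)}" for key, value in params.items()]
    path.write_text("\n".join(lines) + "\n")
    return path


def quarto_render_command(notebook: str, params_file: Path, output: str) -> list[str]:
    """Build, but do not run, a `quarto render` call that reads the YAML file."""
    return [
        "quarto", "render", notebook,
        "--execute-params", str(params_file),
        "--to", "html",
        "--output", output,
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical parameter names, for illustration only.
        params_file = write_params_yaml(
            {"input_filename": "hello.txt", "n_iter": 12},
            Path(tmp) / "params.yml",
        )
        print(params_file.read_text(), end="")
        print(shlex.join(quarto_render_command("report.qmd", params_file, "report.html")))
```

Because the parameters live in a separate file, the same command shape applies whether the document runs Python or R code, which is the language-agnostic property mentioned above.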

PR checklist

Closes #4812

Do not specify using the AMD64 architecture for the docker test profile, as this leads to problems with Pandoc (which Quarto uses) when emulating AMD64 on ARM64 systems. The docker images for this module can be built on both architectures, so always specifying one or the other is not necessary.

This reverts commit 839cae0. Getting PDF output to work turned out to be problematic due to (1) differences between TinyTeX installations on AMD64 and ARM64 architectures; and (2) difficulties getting Pandoc to work properly inside the docker containers. This might be solvable with a lot more work and troubleshooting, but the PDF functionality is removed for now, since HTML is expected to be the output type desired by the vast majority of the envisioned module audience; the related RMARKDOWNNOTEBOOK and JUPYTERNOTEBOOK modules currently only support HTML output. Another possible solution is to use the new `typst` typesetting system introduced in Quarto 1.4 instead of *TeX, but this would preclude using Conda (which currently doesn't have Quarto 1.4).

Disallow using the Conda or Mamba profiles for the QUARTONOTEBOOK module, as the environment created differs from that created with containers. The Conda version of Quarto does not work on ARM64 architectures due to Pandoc-related issues, but installing outside Conda works in a container-context. It is thus impossible to get the same environment in a container image and using Conda, if compatibility with both AMD64 and ARM64 architectures is desired (which it is). Hopefully the issues with Conda will be solved in the future.
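The guard described above amounts to checking whether `conda` or `mamba` appears among the comma-separated profiles passed to Nextflow (for example `-profile test,docker`). The module implements this in Nextflow; the Python below is only a sketch of the logic, and the function names and error message are hypothetical.

```python
def uses_conda_profile(profile: str) -> bool:
    """Return True if a comma-separated profile string includes conda or mamba."""
    selected = {name.strip() for name in profile.split(",")}
    return bool(selected & {"conda", "mamba"})


def check_profile(profile: str) -> None:
    """Abort early when an unsupported profile is requested (hypothetical message)."""
    if uses_conda_profile(profile):
        raise RuntimeError(
            "QUARTONOTEBOOK does not support the Conda or Mamba profiles; "
            "please use a container profile instead."
        )


if __name__ == "__main__":
    check_profile("test,docker")  # passes silently
    try:
        check_profile("test,conda")
    except RuntimeError as err:
        print(f"Aborted: {err}")
```

Matching on whole profile names, rather than on substrings, avoids rejecting an unrelated profile whose name merely contains "conda".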

grst

Thanks for adding this! I wish quarto existed when I was working on the jupyternotebook and rmarkdown modules.

Just two minor comments. I also thought there was something broken with dumping the parameters to YAML in more recent versions of Nextflow, but the tests seem to pass.

modules/nf-core/quartonotebook/main.nf

modules/nf-core/quartonotebook/tests/with-parametrization.config

Add the `extensions` module input that pipelines can use for Quarto templates. This can be achieved e.g. by adding the `_extensions/` directory with whatever extensions are desired into a pipeline's `assets/` directory and creating a value channel like so: `extensions = Channel.fromPath("[...]/_extensions").collect()`.

fasterius · 2024-02-09T10:21:29Z

As I was working on your comments I realised that I had forgotten to add the ability to use Quarto extensions with the module! I have now added commits for this, as well as for outputting the original report to better reflect how RMARKDOWNNOTEBOOK and JUPYTERNOTEBOOK work. This module already outputs the parameter YAML for parametrisation, so it felt natural to also provide the original report for users who want to edit it directly.

I have previously made a Quarto nf-core extension, which is here: https://github.com/fasterius/nf-core-quarto-template. We are already using it in the spatialtranscriptomics pipeline.

* Add `main.nf` for quartonotebook
* Add environment files
* Add `meta.yml`
* Temporarily change `test_data_base` to for testing
* Add bare-bones nf-test test
* Abort when running with Conda profile on ARM64
* Add stub test; snapshot all outputs
* Add python, rmd and ipynb tests
* Add notebook parametrization
* Update Conda environment
* Add parametrization tests
* Add note about the container
* Add missing `papermill` mention in tools section
* Fix function names according to nf-core convention
* Add missing `${args}` to Quarto render command
* Change output to `${prefix}.html`
* Also allow PDF output; add PDF tests
* Fix nf-test configs
* Do not specify AMD64 for docker test profile
* Revert "Also allow PDF output; add PDF tests"
* Update snapshot
* Disallow using the Conda/Mamba profile
* Use `nf-core/test-datasets` for test data
* Move XDG variable definition to `main.nf`
* Add `extensions` input for Quarto templates
* Also output the original report
* Update snapshot
* Add note regarding disallowing the Conda profile

edmundmiller · 2024-11-16T19:42:01Z

@fasterius any links to the issue with pandoc on arm64 with amd64 emulation? Trying to use Seqera containers in #5561

fasterius · 2024-11-18T09:40:16Z

> @fasterius any links to the issue with pandoc on arm64 with amd64 emulation? Trying to use Seqera containers in #5561

I haven't looked into it for a long time now, so I don't know whether the issue has been solved or not. Does it work using Seqera containers?

edmundmiller · 2024-11-18T13:53:17Z

It seems to? The tests pass at least. Let me know if there's something else to look for!

If you don't have any objections I'll merge the other PR and we can report if anyone has any issues.

fasterius force-pushed the quartonotebook branch from 5be5f57 to 49187dc Compare February 7, 2024 19:57

fasterius closed this Feb 7, 2024

fasterius reopened this Feb 7, 2024

fasterius force-pushed the quartonotebook branch from 49187dc to 2f87b27 Compare February 7, 2024 20:00

fasterius added 22 commits February 8, 2024 08:27

Add main.nf for quartonotebook

834a58d

Add environment files

8f02a78

Add meta.yml

1d1db48

Temporarily change test_data_base to for testing

de11af3

Add bare-bones nf-test test

6d741e3

Abort when running with Conda profile on ARM64

71f1b49

Add stub test; snapshot all outputs

01f4187

Add python, rmd and ipynb tests

b635be8

Add notebook parametrization

c6f6e67

Update Conda environment

a8e5553

Add parametrization tests

bd93231

Add note about the container

a77365b

Add missing papermill mention in tools section

68265d5

Fix function names according to nf-core convention

7099f9d

Add missing ${args} to Quarto render command

25b2d83

Change output to ${prefix}.html

fccee7b

Also allow PDF output; add PDF tests

77a0fa2

Fix nf-test configs

0664c56

Update snapshot

2c03fae

fasterius force-pushed the quartonotebook branch from 94acf9b to 6c9a653 Compare February 8, 2024 07:27

fasterius marked this pull request as ready for review February 8, 2024 07:38

fasterius requested a review from a team as a code owner February 8, 2024 07:38

fasterius removed the request for review from a team February 8, 2024 07:38

fasterius requested review from kpadm and grst February 8, 2024 07:38

fasterius mentioned this pull request Feb 8, 2024

Add test dataset for new QUARTONOTEBOOK module nf-core/test-datasets#1089

Merged

Use nf-core/test-datasets for test data

cbf534b

fasterius added the "new module" and "Ready for Review" labels Feb 8, 2024

grst reviewed Feb 8, 2024

modules/nf-core/quartonotebook/main.nf (resolved)

modules/nf-core/quartonotebook/tests/with-parametrization.config (outdated, resolved)

fasterius added 4 commits February 9, 2024 09:24

Move XDG variable definition to main.nf

00232dd

Also output the original report

8327d7b

Update snapshot

ee58362

grst approved these changes Feb 9, 2024

fasterius added 2 commits February 9, 2024 14:59

Add note regarding disallowing the Conda profile

5a17b86

Merge branch 'master' into quartonotebook

d4328c4

fasterius added this pull request to the merge queue Feb 9, 2024

Merged via the queue into nf-core:master with commit 07ecae3 Feb 9, 2024
10 checks passed

fasterius deleted the quartonotebook branch February 9, 2024 14:15

maxulysse mentioned this pull request Feb 26, 2024

Update data path for all rseqc modules at once #4994

Merged

17 tasks
