Update: Include `sample_name` IRIDA-Next input column #28

sgsutcliffe · 2024-09-09T17:15:30Z

Modified the template for the input samplesheet.csv file to include a sample_name column in addition to sample, in line with the IRIDA-Next update, as seen with the speciesabundance pipeline. This means that the output files and the sample name will use sample_name if a sample_name is supplied. If staramrnf is being run locally, sample_name can be left blank.

Made a few changes:
- Special characters in sample_name will be replaced with "-"
- If no sample_name is supplied, the value in the sample column will be used
- To avoid repeated sample_name values, all sample_name values will be suffixed with their index in the input samplesheet.csv
- Tests to check that the variety of different sample_names work with the
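
The naming rules listed above can be sketched roughly as follows. This is an illustrative Python sketch, not the pipeline's Nextflow code. The function name `resolve_output_names`, the set of characters treated as special (anything other than letters, digits, `_`, `.`, and `-`), the 1-based index, the `_<index>` suffix format, and leaving the fallback `sample` value unsuffixed are all assumptions made for the example.

```python
import re

# Assumption: anything outside letters, digits, "_", "." and "-" counts as "special".
SPECIAL_CHARS = re.compile(r"[^A-Za-z0-9_.\-]")


def resolve_output_names(rows):
    """Return one output name per samplesheet row.

    rows: list of dicts with a "sample" key and an optional "sample_name" key.
    """
    names = []
    for index, row in enumerate(rows, start=1):  # assumption: 1-based row index
        sample_name = (row.get("sample_name") or "").strip()
        if not sample_name:
            # No sample_name supplied: fall back to the sample column (assumed unsuffixed).
            names.append(row["sample"])
            continue
        cleaned = SPECIAL_CHARS.sub("-", sample_name)
        # Assumption: the suffix is "_<index>"; the PR only says the index is appended.
        names.append(f"{cleaned}_{index}")
    return names


rows = [
    {"sample": "SAMPLE1", "sample_name": "ecoli GCA#1"},
    {"sample": "SAMPLE2", "sample_name": ""},
    {"sample": "SAMPLE3", "sample_name": "ecoli GCA#1"},
]
print(resolve_output_names(rows))
# ['ecoli-GCA-1_1', 'SAMPLE2', 'ecoli-GCA-1_3']
```

In this sketch, the two rows that share a sample_name end up with distinct output names because the row index differs, which is the point of the suffix rule.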

PR checklist

- This comment contains a description of changes (with reason).
- If you've fixed a bug or added code that should be tested, add tests!
- Make sure your code lints (nf-core lint).
- Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
- Usage Documentation in docs/usage.md is updated.
- Output Documentation in docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).

sgsutcliffe · 2024-09-09T17:20:29Z

A big change was updating assets/samplesheet.csv and tests/test_samplesheet.csv, so I have changed the files to point to the local files rather than the URL in the repository. This is a temporary change until the PR is merged; I will then update to the dev branch URL.

apetkau

This looks really great @sgsutcliffe . Thanks so much 😄 .

I have a few small comments and I still have to test this out through IRIDA Next, but it looks really good.

CHANGELOG.md

tests/main.nf.test

CHANGELOG.md

README.md

assets/schema_input.json

docs/usage.md

modules/local/staramr/main.nf

sgsutcliffe · 2024-09-11T13:35:02Z

It has been tested in IRIDA-Next and appears to be working!

kylacochrane

Great job Steven! Thanks for the reminders and suggestions to the sample_name functionality!! 😺 Just a few comments/questions.

CHANGELOG.md

kylacochrane · 2024-09-11T19:56:10Z

workflows/staramr.nf

    ch_input = Channel.fromSamplesheet("input")
+        .map { meta, contigs ->
+            // Remove characters from meta.irida_id that could cause issues in processes
+            meta.irida_id = meta.irida_id.replaceAll(/[;\\#><|]/, '_')


Will changing the irida_id (by replacing characters with '_') affect the ability to map the files and metadata back to the correct IRIDA Next sample on the platform? Specifically, I'm concerned about how this might impact the creation of the iridanext.output.json.gz as iridanext.config relies on irida_id.
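
To make the concern above concrete, here is a small Python sketch (not the pipeline's Groovy code) of what the substitution in the diff does to identifiers. The IDs are hypothetical and were chosen only to trigger the replacement; whether real IRIDA Next IDs can contain these characters is not established in this thread.

```python
import re

# Same character class as in the diff above: ; \ # > < |
UNSAFE = re.compile(r"[;\\#><|]")


def sanitize(irida_id):
    """Replace each unsafe character with an underscore."""
    return UNSAFE.sub("_", irida_id)


# Hypothetical IDs, for illustration only.
original_ids = ["ID_A|1", "ID_A#1", "ID_B2"]
sanitized = [sanitize(i) for i in original_ids]

# IDs whose value no longer matches what was supplied in the samplesheet.
changed = [i for i, s in zip(original_ids, sanitized) if i != s]
# Number of IDs that became indistinguishable from another ID.
collisions = len(sanitized) - len(set(sanitized))

print(sanitized)   # ['ID_A_1', 'ID_A_1', 'ID_B2']
print(changed)     # ['ID_A|1', 'ID_A#1']
print(collisions)  # 1
```

Any ID that appears in `changed` would no longer match the value the platform supplied, and distinct IDs can collapse into the same string after substitution.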

sgsutcliffe · 2024-09-12

It's probably not the best solution; as you rightly suggest, it could in theory cause some headaches. I wanted to avoid any chance of code injection in my bash command that uses meta.irida_id. In the case of staramrnf, because we modify meta.irida_id before any of the files to be included in iridanext.output.json.gz are created, the change doesn't affect iridanext.output.json.gz. As for running on IRIDA-Next, in theory it shouldn't be a problem because it seems these characters are not used in the IRIDA-Next ID (see here), but I will need to confirm this 100% before I am done with this PR.

emarinier · 2024-09-12

Yeah, echoing that we shouldn't be changing IRIDA IDs, as it will cause problems if it (now or later with changes) finds its way back to IRIDA. I'd feel more comfortable putting the restrictions in at the samplesheet level (schema_input.json).

sgsutcliffe · 2024-09-13

Reverted the change here 1d829b3
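
The alternative suggested in this thread, restricting the accepted characters at the samplesheet level instead of rewriting IDs, could look roughly like the Python sketch below. The allow-list pattern and the function name `validate_ids` are assumptions for illustration; the actual pattern added to schema_input.json is not shown in this conversation.

```python
import re

# Assumption: an allow-list of letters, digits, "_", "." and "-".
# The real pattern in schema_input.json may differ.
ALLOWED_ID = re.compile(r"[A-Za-z0-9_.\-]+")


def validate_ids(ids):
    """Return the IDs that would be rejected; valid IDs are never modified."""
    return [i for i in ids if not ALLOWED_ID.fullmatch(i)]


rejected = validate_ids(["SAMPLE_1", "SAMPLE;rm -rf", "SAMPLE-2"])
print(rejected)  # ['SAMPLE;rm -rf']
```

Rejecting a bad ID up front, rather than substituting characters, means every ID that reaches a process is identical to the one supplied in the samplesheet, so nothing needs to be mapped back afterwards.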

assets/schema_input.json

This reverts commit 1d209a8.

apetkau

Thanks so much @sgsutcliffe for all your great work on this. I just have one requested change (and another optional change related to adding a pattern to the JSON Schema for the IRIDA ID).

conf/iridanext.config

apetkau

This looks great. Thanks so much for all your work on this Steven 😄

kylacochrane

Looks great to me Steven!

sgsutcliffe added 2 commits September 9, 2024 13:00

Changed input samplesheet to include sample_name column

5544cf5

Prettier format fix

1e67032

sgsutcliffe requested review from apetkau, kylacochrane and emarinier September 9, 2024 17:20

apetkau requested changes Sep 10, 2024

CHANGELOG.md (outdated)

tests/main.nf.test

emarinier requested changes Sep 10, 2024

CHANGELOG.md (outdated)

README.md (outdated)

assets/schema_input.json (outdated)

docs/usage.md (outdated)

modules/local/staramr/main.nf

sgsutcliffe added 2 commits September 11, 2024 09:59

Fix typos and wording in documentation

f6e19a3

Make staramr process more secure

1d209a8

kylacochrane reviewed Sep 11, 2024

sgsutcliffe added 2 commits September 13, 2024 10:16

Change naming scheme for duplicate sample_names

b2642b3

Revert "Make staramr process more secure"

1d829b3

This reverts commit 1d209a8.

apetkau requested changes Sep 13, 2024

conf/iridanext.config

sgsutcliffe added 2 commits September 13, 2024 15:51

Added keep to only include needed columns of metadata

f2b364e

Modified schema to limit the accepted characters for sample

b38c461

sgsutcliffe requested a review from apetkau September 16, 2024 13:00

sgsutcliffe mentioned this pull request Sep 16, 2024

Update: Include sample_name IRIDA-Next input column phac-nml/snvphylnfc#26

Merged

10 tasks

apetkau approved these changes Sep 18, 2024

emarinier approved these changes Sep 18, 2024

kylacochrane approved these changes Sep 18, 2024

sgsutcliffe merged commit 214d654 into dev Sep 18, 2024
4 checks passed

sgsutcliffe deleted the add-sample-name branch September 18, 2024 20:18

sgsutcliffe mentioned this pull request Sep 20, 2024

Update: Include sample_name IRIDA-Next input column phac-nml/arboratornf#23

Merged

9 tasks

sgsutcliffe mentioned this pull request Oct 2, 2024

Update: Add sample_name for IRIDA-Next integration phac-nml/gasnomenclature#30

Merged

9 tasks

sgsutcliffe mentioned this pull request Oct 25, 2024

add sample_name as possible column in samplesheet phac-nml/gasclustering#31

Merged

10 tasks

sgsutcliffe mentioned this pull request Nov 6, 2024

Add sample_name column in samplesheet compatibility phac-nml/fetchdatairidanext#19

Merged

10 tasks
