[theiaprok] amrfinderplus: add support for Vibrio parahaemolyticus, Vibrio vulnificus, Enterobacter asburiae. Fix C diff bug #542
This PR closes #129
🗑️ This dev branch should be deleted after merging to main.
🧠 Aim, Context and Functionality
This PR adds support for using the `amrfinder --organism <organism_name>` option for 3 new organisms (Vibrio vulnificus, Vibrio parahaemolyticus, and Enterobacter asburiae) and corrects a typo for C. diff. These organisms were recently added to NCBI AMRFinderPlus' list of supported organisms that have taxon-specific point mutations or other organism-specific handling of annotation: https://github.com/ncbi/amr/wiki/Running-AMRFinderPlus#--organism-option
When users analyze these samples via the TheiaProk workflows (or the standalone amrfinderplus workflow), AMRFinderPlus will now use the `--organism` flag and be able to detect organism-specific point mutations.
The usage of the `amrfinder --organism` option depends on GAMBIT accurately predicting the genus and species, but the user can also override GAMBIT's result for `gambit_predicted_taxon` by manually entering the optional String input `expected_taxon` for the various TheiaProk workflows.
`Clostridioides difficile` as it is not included in the database (AFAIK)
🛠️ Impacted Workflows/Tasks & Changes Being Made
This will affect the behavior of the workflow(s) even if users don’t change any workflow inputs relative to the last version : Yes
Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing : No
Workflows impacted:
📋 Workflow/Task Step Changes
🔄 Data Processing
Docker/software or software versions changed: N/A
Databases or database versions changed: N/A
Data processing/commands changed: added use of the `--organism` flag for 3 organisms, fixed it for 1 (C. diff)
File processing changed: N/A
Compute resources changed: N/A
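As a rough illustration of the behavior described in this PR (not the task's actual code, which lives in the WDL task), the taxon-to-`--organism` selection could be sketched in Python as below. The function names, the lookup table, and the prefix-matching rule are hypothetical; the `--organism` values are assumed to match what `amrfinder --list_organisms` reports and should be checked against the installed database version.

```python
from typing import List, Optional

# Hypothetical lookup covering only the four organisms touched by this PR.
# Values are assumed to match `amrfinder --list_organisms`; verify before use.
ORGANISM_BY_TAXON_PREFIX = {
    "Vibrio parahaemolyticus": "Vibrio_parahaemolyticus",
    "Vibrio vulnificus": "Vibrio_vulnificus",
    "Enterobacter asburiae": "Enterobacter_asburiae",
    "Clostridioides difficile": "Clostridioides_difficile",
}


def pick_organism(gambit_predicted_taxon: str,
                  expected_taxon: Optional[str] = None) -> Optional[str]:
    """Return an --organism value, or None if the taxon is not in the table.

    A non-empty expected_taxon overrides the GAMBIT prediction.
    """
    taxon = (expected_taxon or gambit_predicted_taxon or "").strip().lower()
    for prefix, organism in ORGANISM_BY_TAXON_PREFIX.items():
        if taxon.startswith(prefix.lower()):
            return organism
    return None


def build_amrfinder_command(assembly_fasta: str,
                            organism: Optional[str]) -> List[str]:
    """Build the amrfinder argument list; --organism is added only when known."""
    cmd = ["amrfinder", "--nucleotide", assembly_fasta]
    if organism is not None:
        cmd += ["--organism", organism]
    return cmd


if __name__ == "__main__":
    # GAMBIT prediction used directly
    print(build_amrfinder_command("sample.fasta",
                                  pick_organism("Vibrio vulnificus")))
    # User override via expected_taxon
    print(build_amrfinder_command("sample.fasta",
                                  pick_organism("", "Clostridioides difficile")))
```

In this sketch, a taxon that is not in the table yields no `--organism` argument, so AMRFinderPlus would run without organism-specific point mutation screening.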
➡️ Inputs
N/A
⬅️ Outputs
Outputs may now include organism-specific point mutations if detected by amrfinderplus.
🧪 Testing
Test Dataset
I tested with either assemblies or Illumina PE datasets for samples that are known to have organism-specific point mutations as reported by NCBI pathogen detection browser.
I also tested with my full amrfinderplus testing dataset that includes diverse taxa, most of which have organism-specific point mutations/acquired genes, to ensure functionality is not lost for these datasets.
Commandline Testing with MiniWDL or Cromwell (optional)
Tested the amrfinderplus task successfully on the command line, but not including output here.
Terra Testing
Will add tests once they are complete
`expected_taxon` to `"Clostridioides difficile"`: https://app.terra.bio/#workspaces/theiagen-validations/curtis-sandbox-theiagen-validations/job_history/93ab7fc4-a36f-4e45-9116-4ae329c9851b
`gambit_predicted_taxon`: https://app.terra.bio/#workspaces/theiagen-validations/curtis-sandbox-theiagen-validations/job_history/2047b9eb-301e-4ece-b9b3-2395c38305ef
`--organism` flag is used correctly
Suggested Scenarios for Reviewer to Test
Test any of the TheiaProk workflows with data from these organisms, or others, to ensure new functionality is working as intended and other organisms are unimpacted by this change
Theiagen Version Release Testing (optional)
🔬 Final Developer Checklist
🎯 Reviewer Checklist
🗂️ Associated Documentation (to be completed by Theiagen developer)