Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TheiaProk] Update amrfinderplus to v3.12.8; DB: v2024-05-02.2; reduce compute resources #514

Merged
merged 4 commits into from
Jun 24, 2024

Conversation

kapsakcj
Copy link
Contributor

@kapsakcj kapsakcj commented Jun 21, 2024

This PR closes #511

🗑️ This dev branch should be deleted after merging to main.

🧠 Aim, Context and Functionality

  • update default docker image for amrfinderplus
  • reduce default compute resources & enable preemptible VMs to be used (cheaper)

🛠️ Impacted Workflows/Tasks & Changes Being Made

This will affect the behavior of the workflow(s) even if users don’t change any workflow inputs relative to the last version : Yes

Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing : No

Workflows impacted

  • TheiaProk_illumina_PE
  • TheiaProk_illumina_PE
  • TheiaProk_ONT
  • TheiaProk_FASTA
  • NCBI-AMRFinderPlus_PHB

📋 Workflow/Task Step Changes

🔄 Data Processing

Docker/software or software versions changed: v3.11.20 & DB v2023-09-26.1 ➡️ v3.12.8 & DB v2024-05-02.2

Databases or database versions changed: see above

Data processing/commands changed: double quotes added to one option (does not impact how it runs at all)

File processing changed: none

Compute resources changed: reduced cpus to 2, disk_size to 50, and memory to 8. also allow for preemptible VMs to be used

➡️ Inputs

  • updated default runtime params & docker

⬅️ Outputs

N/A

🧪 Testing

Test Dataset

I use a diverse dataset ILMN PE data of bacteria that are known to have point mutations or other organism-specific acquired resistance genes. Almost every species that is part of the amrfinder --organism flag for organism-specific results. See the data table amrfinderplus_testing_sample for full list of species & accessions.

I have also gathered data (reads or genome assemblies only) for 3 organisms we do not yet have support for (Enterobacter asburiae, vibrio vulnificus, vibrio parahaemolyticus) just to see how they behave.

Commandline Testing with MiniWDL or Cromwell (optional)

Tested WDL task successfully with miniwdl.

Terra Testing

Suggested Scenarios for Reviewer to Test

Test on as diverse a set of species as possible. Good to test any of the TheiaProk workflows.

Theiagen Version Release Testing (optional)

  • Will changes require functional or validation testing (checking outputs etc) during the release? functional, when results are compared to last version of PHB wfs, would be good to check if there are differences
  • Do new samples need to be added to validation datasets? If so, upload these to the appropriate validation workspace Google bucket (). Please describe the new samples here and why these have been chosen. No
  • Are there any output files that should be checked after running the version release testing? No

🔬 Final Developer Checklist

  • The workflow/task has been tested locally and results, including file contents, are as anticipated
  • The workflow/task has been tested on Terra and results, including file contents, are as anticipated
  • The CI/CD has been adjusted and tests are passing (to be completed by Theiagen developer)
  • Code changes follow the style guide

🎯 Reviewer Checklist

  • All impacted workflows/tasks have been tested on Terra with a different dataset than used for development
  • All reviewer-suggested scenarios have been tested and any additional
  • All changed results have been confirmed to be accurate
  • All workflows/tasks impacted by change/s have been tested using a standard validation dataset to ensure no unintended change of functionality
  • All code adheres to the style guide
  • MD5 sums have been updated
  • The PR author has addressed all comments

🗂️ Associated Documentation (to be completed by Theiagen developer)

  • Relevant documentation on the Public Health Resources "PHB Main" has been updated
  • Workflow diagrams have been updated to reflect changes. Not necessary to update diagrams

kapsakcj added 4 commits June 21, 2024 12:15
… to 50; also allow for preemptible instances; upgraded default docker image to latest available version; added double quotes around bash variable for safety; tested successfully with miniwdl
@kapsakcj kapsakcj marked this pull request as ready for review June 21, 2024 19:26
@kapsakcj kapsakcj changed the title Update amrfinderplus to v3.12.8; DB: v2024-05-02.2; reduce compute resources [TheiaProk] Update amrfinderplus to v3.12.8; DB: v2024-05-02.2; reduce compute resources Jun 21, 2024
@michellescribner
Copy link
Contributor

Tested dataset of 88 bacterial isolates representing diverse taxa (set 3 from GAMBIT publication): https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history/9838ff2d-d384-412e-bb65-fa9b12af7557

  • All samples completed successfully
  • Confirmed that amrfinderplus version 3.12.8 and 2024-05-02.2 database was used when testing on this dev branch
  • Several AMR core, plus, and virulence genes were different between this branch and PHB v2.0.1
  • Otherwise all other columns examined (assembly_length, gambit_predicted_taxon,number_contigs, etc as shown below) were identical to v2.0.1
    image

Copy link
Contributor

@michellescribner michellescribner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks Curtis!

@sage-wright sage-wright merged commit 6d60ce0 into main Jun 24, 2024
7 checks passed
@kapsakcj kapsakcj deleted the cjk-amrfinderplus-update branch July 12, 2024 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Release] Update amrfinderplus database for June 2024
3 participants