feat/refactor cg workflow balsamic #687

Merged: 235 commits merged on Sep 2, 2020

Conversation

Mropat
Contributor

@Mropat Mropat commented Jul 8, 2020

This PR will refactor code in Balsamic workflow.

Aims

  • Create a meta-API (BalsamicAnalysisAPI) to handle communication between balsamic and other cg applications. The API will handle the following:
    Calling BalsamicAPI to execute balsamic commands
    Querying LIMS and StatusDB to decide what arguments to pass to Balsamic
    Reading the (new version of the) deliverables report generated by Balsamic and storing the bundle in Housekeeper + StatusDB

  • Reduce number of options that can/should be passed to run the workflow. Most of the logic for determining the options will be handled by BalsamicAnalysisAPI.

  • Every command now requires the sample family name as an argument.

  • For the sake of consistency, using sample_id to link files is no longer supported.

  • Write descriptive help annotations for options and commands
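
A minimal sketch of the meta-API idea described in the aims above, not the code in this PR. The collaborators are injected as plain callables so the example runs without LIMS, StatusDB or BALSAMIC. Every method name, field name and argument token below is a hypothetical placeholder, not cg's or BALSAMIC's documented interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class BalsamicAnalysisAPI:
    """Hypothetical meta-API: looks up settings, then delegates to a runner."""

    # Plain callables stand in for the real services (all assumptions).
    run_balsamic: Callable[[List[str]], None]          # stands in for BalsamicAPI
    lookup_priority: Callable[[str], str]              # stands in for a StatusDB query
    lookup_panel_bed: Callable[[str], Optional[str]]   # stands in for a LIMS query

    def config_case_arguments(self, case_id: str) -> Dict[str, Optional[str]]:
        """Collect the settings the caller no longer has to pass by hand."""
        return {
            "case_id": case_id,
            "priority": self.lookup_priority(case_id),
            "panel_bed": self.lookup_panel_bed(case_id),
        }

    def config_case(self, case_id: str) -> None:
        """Build an argument list from the looked-up settings and run it."""
        arguments = self.config_case_arguments(case_id)
        # Illustrative tokens only; the real BALSAMIC flags may differ.
        command = ["config", "case", "--case-id", case_id]
        if arguments["panel_bed"]:
            command += ["--panel-bed", arguments["panel_bed"]]
        self.run_balsamic(command)
```

Keeping the lookups behind injected callables is one way to let tests replace LIMS and StatusDB with fixtures; whether the actual implementation does it this way is not stated in this PR description.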

Changes

  • Calling BALSAMIC to generate deliverables file now handled by command "cg workflow balsamic report-deliver [CASE_ID]". Previously, this was achieved with cg workflow balsamic deliver report [CASE_ID].

  • Calling cg workflow balsamic [CASE_ID] now prints help and exits instead of initializing full analysis workflow
    Initializing full analysis workflow is done with cg workflow balsamic start [CASE_ID] which will run link, config-case and run commands for CASE_ID

  • Command cg workflow balsamic start-available will start full analysis workflow for all available cases (for future cronjobs)

  • Refactored cg workflow balsamic store command

  • Refactored AnalysisAPI. Now called BalsamicAnalysisAPI to distinguish from AnalysisAPI for MIP

  • Updated cg clean balsamic to utilize BalsamicAnalysisAPI

  • New fixtures and tests for config-case to cover known error scenarios as well as all types of cases which should run successfully

  • New fixtures and tests for uploading new format of deliverables to Housekeeper

  • New fixtures and tests for cg clean balsamic
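
The changed command behaviour listed above can be sketched as follows. This is an illustration only: it uses argparse from the Python standard library rather than whatever CLI framework cg uses, and the function names `build_parser` and `plan` are made up for the example. Invoking the group without a subcommand prints help and does nothing else, while `start` expands into link, config-case and run.

```python
import argparse
from typing import List, Optional

# Steps that `start` chains together, in order, according to the PR description.
STEPS_FOR_START = ["link", "config-case", "run"]


def build_parser() -> argparse.ArgumentParser:
    """Create a parser with one subcommand per step, plus `start`."""
    parser = argparse.ArgumentParser(prog="cg workflow balsamic")
    subparsers = parser.add_subparsers(dest="command")
    for name in STEPS_FOR_START + ["start"]:
        sub = subparsers.add_parser(name)
        sub.add_argument("case_id")
    return parser


def plan(argv: List[str]) -> Optional[List[str]]:
    """Return the steps to run, or None when only help should be shown."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        # No subcommand given: show help instead of starting an analysis.
        parser.print_help()
        return None
    if args.command == "start":
        return [f"{step} {args.case_id}" for step in STEPS_FOR_START]
    return [f"{args.command} {args.case_id}"]
```

In this sketch a bare case ID without a subcommand is rejected by argparse with a usage error, which differs slightly from the "prints help and exits" behaviour described above.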

Workflow CLI commands and options

Options

  -a, --analysis-type [qc|paired|single]
                                  Setting this option to qc ensures only QC
                                  analysis is performed

  -p, --priority [low|normal|high]
                                  Job priority in SLURM. Set automatically
                                  according to the priority in ClinicalDB;
                                  this option can be used to override the
                                  server setting

  --panel-bed TEXT                Panel BED is determined based on capture kit
                                  used for library prep. Set this option to
                                  override the default

  -r, --run-analysis              Execute BALSAMIC in non-dry mode

  -d, --dry-run                   Print command to console without executing

  --help                          Show this message and exit.
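
The override semantics of `--priority` and `--panel-bed` can be sketched as below, assuming that an option left unset falls back to the value looked up elsewhere (ClinicalDB for priority, the capture kit for the panel BED). The function names and the example file names are hypothetical; only the three priority values come from the help text above.

```python
from typing import Optional

# Allowed values, taken from the --priority help text.
PRIORITIES = ("low", "normal", "high")


def resolve_priority(cli_priority: Optional[str], stored_priority: str) -> str:
    """Prefer the command-line value; otherwise use the stored priority."""
    chosen = cli_priority if cli_priority is not None else stored_priority
    if chosen not in PRIORITIES:
        raise ValueError(f"Unknown priority: {chosen!r}")
    return chosen


def resolve_panel_bed(cli_panel_bed: Optional[str],
                      default_panel_bed: Optional[str]) -> Optional[str]:
    """Prefer the command-line BED file; otherwise use the capture-kit default."""
    return cli_panel_bed if cli_panel_bed else default_panel_bed


if __name__ == "__main__":
    print(resolve_priority(None, "normal"))
    print(resolve_panel_bed("override.bed", "capture_kit_default.bed"))
```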

Commands

config-case
Query LIMS and StatusDB to get config settings and names of tumor and normal files. Call balsamic using Process while providing the config settings to generate case config file

deliver
Copies files from Housekeeper to customer folder on hasta. This folder is then to be picked up by cronjobs to deliver to actual customer mailbox (UNCHANGED in this PR)

link
Locates FASTQ files for the given CASE_ID, concatenates them, and copies them to the working directory

report-deliver
Finds config files, verifies that analysis is finished, and calls BALSAMIC to create delivery report

run
Calls balsamic run using generated config

start
Calls link, config-case and run for given CASE ID

start-available
Calls start for all cases missing an analysis

store
Calls report-deliver and store-housekeeper for given CASE ID

store-available
Calls store for all cases missing an analysis

store-housekeeper
Parses deliverables report, creates housekeeper bundle and Analysis entry in StatusDB.
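
The batch commands (start-available, store-available) apply a per-case command to a list of cases. A minimal sketch of such a loop follows, under the assumption that a failure for one case is logged and does not stop the remaining cases. The PR description does not say how failures are handled, and the function name and return shape are invented for the example.

```python
from typing import Callable, Dict, Iterable, List


def start_available(case_ids: Iterable[str],
                    start: Callable[[str], None]) -> Dict[str, List[str]]:
    """Call `start` for every case and record which ones failed."""
    outcome: Dict[str, List[str]] = {"started": [], "failed": []}
    for case_id in case_ids:
        try:
            start(case_id)
        except Exception as error:
            # Assumption: report the failure and continue with the next case.
            print(f"Could not start {case_id}: {error}")
            outcome["failed"].append(case_id)
        else:
            outcome["started"].append(case_id)
    return outcome
```

Returning both lists makes it easy for a cronjob wrapper to decide on an exit status; that design choice belongs to this sketch, not to the PR.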

How to prepare for test

  • ssh to hasta
  • install branch on stage
  • make sure Clinical Genomics / servers is installed on branch update-bv-4-4-0
  • make sure BALSAMIC >= 4.5.0 is installed in S_BALSAMIC on stage

How to test

  • run cg workflow balsamic link [CASE_ID]

  • verify files were successfully linked

  • run cg workflow balsamic config-case [CASE_ID]

  • verify config successfully created

  • run cg workflow balsamic run [CASE_ID] --run-analysis

  • verify jobs successfully submitted

  • verify jobs successfully finished

  • run cg workflow balsamic report-deliver [CASE_ID] once jobs successfully finished

  • verify deliverable report file generated successfully

  • run cg workflow balsamic store-housekeeper [CASE_ID]

  • check if bundle successfully added with housekeeper get -V [CASE_ID], and tags are added
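
For repeated test runs, the first four commands in the list above can be generated for a given case with a small helper. This is a convenience sketch with a made-up function name; it only prints the command lines and does not replace the manual verification between steps (report-deliver in particular should only be run once the jobs have finished).

```python
import shlex
from typing import List


def manual_test_commands(case_id: str) -> List[List[str]]:
    """Return the test commands, in order, as argument lists."""
    base = ["cg", "workflow", "balsamic"]
    return [
        base + ["link", case_id],
        base + ["config-case", case_id],
        base + ["run", case_id, "--run-analysis"],
        base + ["report-deliver", case_id],
    ]


if __name__ == "__main__":
    # "examplecase" is a placeholder case ID.
    for command in manual_test_commands("examplecase"):
        print(shlex.join(command))
```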

Expected test outcome

  • verify that commands execute properly
  • verify that jobs were submitted with SLURM
  • submitted jobs complete without error
  • Take a screenshot and attach or copy/paste the output.

Review

This version is a:

Thanks for filling in who performed the code review and the test!

Mropat and others added 30 commits June 3, 2020 10:04
…om:Clinical-Genomics/cg into fix/balsamic-command-exectute-from-process
@Mropat Mropat mentioned this pull request Aug 17, 2020
14 tasks
@Mropat
Contributor Author

Mropat commented Sep 1, 2020

moralgoat whole workflow result
image

@Mropat
Contributor Author

Mropat commented Sep 1, 2020

whole workflow result bosssponge (panel single)
image

@Mropat
Contributor Author

Mropat commented Sep 1, 2020

workflow result fleetjay (panel paired)
image

@Mropat
Contributor Author

Mropat commented Sep 1, 2020

workflow result unitedbeagle (wgs paired)
image

@hassanfa hassanfa changed the title ON HOLD:feat/refactor cg workflow balsamic feat/refactor cg workflow balsamic Sep 2, 2020
@henrikstranneheim
Contributor

Do we not need to wait for housekeeper to become pipeline aware before merging, or is there a hook in this PR to only store Cases unique to Balsamic and not Balsamic+MIP?

@Mropat
Contributor Author

Mropat commented Sep 2, 2020

Do we not need to wait for housekeeper to become pipeline aware before merging, or is there a hook in this PR to only store Cases unique to Balsamic and not Balsamic+MIP?

We only analyze + store BALSAMIC samples. The HK update is not needed to merge this

@hassanfa
Contributor

hassanfa commented Sep 2, 2020

AMDoc 2046 is also updated. I'll send it to review later today.

@sonarqubecloud

sonarqubecloud bot commented Sep 2, 2020

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A), Security Hotspots to review: 0
Code Smells: 3 (rating A)

Coverage: no coverage information
Duplication: 0.0%

@moonso moonso merged commit 2501869 into master Sep 2, 2020
@moonso moonso deleted the feat/refactor-cg-workflow-balsamiic branch September 2, 2020 15:08
@moonso
Contributor

moonso commented Sep 2, 2020

Screenshot 2020-09-02 at 17 20 52

Merged and deployed on hasta.
Great work @Mropat @hassanfa !!

@Mropat
Contributor Author

Mropat commented Sep 2, 2020

Thanks everyone!!!

@patrikgrenfeldt
Contributor

deployed on clinical-db:
image

@vwirta

vwirta commented Sep 2, 2020

Fantastic work!

8 participants