06 Using Sub Workflows
Subworkflows are Snakefiles that can be run on top of the main workflow in BGCFlow. All available workflows can be listed with bgcflow run -h. These subworkflows can be executed by running:
bgcflow run --workflow {workflow name or Snakefile}
As of bgcflow_wrapper v0.3.5, these subworkflows are officially included:
- BGC: Run comparative BGC analytics on selected antiSMASH BGC regions
- Database: Build a duckdb database. Same as running bgcflow build database
- Report: Build Jupyter notebook markdown reports. Same as running bgcflow build report
- Metabase: Serve a Metabase server. Same as running bgcflow serve --metabase
- lsabgc: Run a population genetic analysis using the lsabgc-easy pipeline
- ppanggolin: Build a graph-based pangenome and identify regions of genome plasticity
Additional subworkflows that will be included in bgcflow v0.8.2:
- Alleleome: Run Core-Alleleome to explore and analyze natural sequence variations within the Open Reading Frames (ORFs) of alleles of core genes in a species' pan-genome, both at the amino acid and nucleotide levels (Archana S. Harke et al., 2023). This can be run by providing the path to the Snakefile:
bgcflow run --workflow workflow/Alleleome
The BGC subworkflow is used when you have a selection of antiSMASH BGC regions that you want to compare. You might want to run it after finishing the main workflow:
1. Make a new project folder in config/<project_name> for that particular set of BGCs. You can see the example config format here: https://github.com/NBChub/bgcflow/tree/dev-0.6.1/.examples/lanthipeptide
2. Create the samples csv (https://github.com/NBChub/bgcflow/blob/dev-0.6.1/.examples/lanthipeptide/df_antismash_6.1.1_bgc.csv). This can be edited from the previous results table (tables/df_regions_antismash_6.1.1.csv). You then need to add these two columns:
   - source (right now just write “bgcflow” as the source)
   - gbk_path (preferably an absolute path to the antiSMASH BGC region GenBank file; you can also use your own BGCs)
3. You can then create a project config file (https://github.com/NBChub/bgcflow/blob/dev-0.6.1/.examples/lanthipeptide/project_config.yaml). The latest available rules can be seen here: https://github.com/NBChub/bgcflow/blob/dev-0.6.1/workflow/rules_bgc.yaml. Here are the currently available rules:
   - bigslice
   - query-bigslice
   - bigscape
   - clinker
   - interproscan
   - mmseqs2
4. Add the project to the global config file in config/config.yaml under the bgc_projects variable (see https://github.com/NBChub/bgcflow/blob/dev-0.6.1/.examples/_config_example.yaml#L27-L28):
   bgc_projects:
     - name: config/<project_name>/project_config.yaml
5. You can then run the subworkflow with e.g.:
bgcflow run --snakefile workflow/BGC -c 2 -n
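The samples csv edit described in the steps above (adding the source and gbk_path columns to the regions table) can also be scripted. The following is only a minimal sketch, not part of BGCFlow: the function names add_bgc_columns and convert_table are hypothetical, and it assumes that the regions table has an identifier column called bgc_id and that each region GenBank file is named <bgc_id>.gbk inside a single directory (gbk_dir). Adjust both assumptions to match your own results.

```python
import csv
from pathlib import Path


def add_bgc_columns(rows, gbk_dir, id_column="bgc_id", source="bgcflow"):
    """Return copies of the rows with `source` and `gbk_path` columns added.

    ASSUMPTION (not from the BGCFlow docs): every region file lives in
    `gbk_dir` and is named <value of id_column>.gbk.
    """
    gbk_dir = Path(gbk_dir).resolve()  # the guide prefers absolute paths
    updated = []
    for row in rows:
        new_row = dict(row)  # keep the input rows untouched
        new_row["source"] = source
        new_row["gbk_path"] = str(gbk_dir / f"{row[id_column]}.gbk")
        updated.append(new_row)
    return updated


def convert_table(in_csv, out_csv, gbk_dir, id_column="bgc_id"):
    """Read the regions table, add the two columns, write the samples csv."""
    with open(in_csv, newline="") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError(f"No rows found in {in_csv}")
    rows = add_bgc_columns(rows, gbk_dir, id_column=id_column)
    with open(out_csv, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

For example, convert_table could be called with tables/df_regions_antismash_6.1.1.csv as in_csv and a file inside config/<project_name> as out_csv. Check the resulting gbk_path values against the actual files before running the subworkflow.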