06 Using Sub Workflows

Available sub workflows

Sub workflows are Snakefiles that can be run on top of the main workflow in BGCFlow. All available workflows can be listed with bgcflow run -h. A subworkflow can be executed by running:

bgcflow run --workflow {workflow name or Snakefile}

As of bgcflow_wrapper v0.3.5, these subworkflows are officially included:

BGC: Run comparative analyses of selected antiSMASH BGC regions
Database: Build a DuckDB database. Same as running bgcflow build database
Report: Build Jupyter notebook Markdown reports. Same as running bgcflow build report
Metabase: Serve a Metabase server. Same as running bgcflow serve --metabase
lsabgc: Run a population genetic analysis using the lsabgc-easy pipeline
ppanggolin: Build a graph-based pangenome and identify regions of genome plasticity

Additional subworkflows that will be included in bgcflow v0.8.2:

Alleleome: Run Core-Alleleome to explore and analyze natural sequence variations within the Open Reading Frames (ORFs) of alleles of core genes in a species' pan-genome, both at the amino acid and nucleotide levels (Archana S. Harke et al., 2023). This can be run by providing the path to the Snakefile:

bgcflow run --workflow workflow/Alleleome

Running a comparative BGC workflow

Use this feature when you have a selection of antiSMASH BGC regions that you want to compare. You might want to run it after the main workflow has finished.

1. Make a new project folder in config/<project_name> for that particular set of BGCs. You can see an example config layout here: https://github.com/NBChub/bgcflow/tree/dev-0.6.1/.examples/lanthipeptide
2. Create the samples CSV (example: https://github.com/NBChub/bgcflow/blob/dev-0.6.1/.examples/lanthipeptide/df_antismash_6.1.1_bgc.csv). You can start from the results table of a previous run (tables/df_regions_antismash_6.1.1.csv) and add these two columns:

source (right now just write “bgcflow” as the source)
gbk_path (preferably an absolute path to the antiSMASH BGC region GenBank file; you can also use your own BGCs)

3. Create a project config file (example: https://github.com/NBChub/bgcflow/blob/dev-0.6.1/.examples/lanthipeptide/project_config.yaml). The latest available rules are listed in https://github.com/NBChub/bgcflow/blob/dev-0.6.1/workflow/rules_bgc.yaml. The rules currently available are:

bigslice
query-bigslice
bigscape
clinker
interproscan
mmseqs2

4. Add the project to the global config file config/config.yaml under the bgc_projects variable (see https://github.com/NBChub/bgcflow/blob/dev-0.6.1/.examples/_config_example.yaml#L27-L28):

bgc_projects:
  - name: config/<project_name>/project_config.yaml

5. You can then run the subworkflow. For example, the following performs a dry run (-n) with two cores (-c 2); drop -n to execute it:

bgcflow run --workflow workflow/BGC -c 2 -n
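
The samples CSV described above can also be prepared with a short script instead of editing the table by hand. The sketch below is not part of BGCFlow: the function name add_bgc_columns, the id column name bgc_id, and the assumption that each region file is stored as <bgc_id>.gbk in a single directory are all hypothetical and should be adapted to your own results. It copies the regions table, appends the source and gbk_path columns, and returns the paths that do not exist so they can be checked before running the subworkflow.

```python
import csv
from pathlib import Path


def add_bgc_columns(regions_csv, samples_csv, gbk_dir, id_column="bgc_id"):
    """Copy a regions table and append `source` and `gbk_path` columns.

    Assumptions (not taken from BGCFlow itself): the table has an id
    column named `id_column`, and each region GenBank file lives at
    <gbk_dir>/<id>.gbk. Returns the list of gbk paths that are missing.
    """
    gbk_dir = Path(gbk_dir).resolve()  # absolute paths are preferred

    with open(regions_csv, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    # Append the two required columns unless they are already present.
    for column in ("source", "gbk_path"):
        if column not in fieldnames:
            fieldnames.append(column)

    missing = []
    for row in rows:
        gbk = gbk_dir / f"{row[id_column]}.gbk"
        row["source"] = "bgcflow"  # the only source value described above
        row["gbk_path"] = str(gbk)
        if not gbk.is_file():
            missing.append(str(gbk))

    with open(samples_csv, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    return missing
```

A call such as add_bgc_columns("tables/df_regions_antismash_6.1.1.csv", "config/<project_name>/samples.csv", "path/to/region/gbks") would write the new table; the output file name and the GenBank directory here are placeholders. An empty return value means every gbk_path points to an existing file.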
