A Python library and command-line tool for creating dot-plots for pathways and enrichment analysis.

l0andr/pathways-dotplot

Pathway-Dotplot

Pathway-Dotplot is a Python tool for generating dot plots from pathway analysis results, with hierarchical clustering support and several plot customization options. It is primarily intended for visualizing the enrichment of biological pathways across samples, using data such as adjusted p-values and normalized enrichment scores (NES).

Features

  • Hierarchical clustering: Cluster pathways based on similarity in enrichment scores.
  • Customizable Dotplots: Multiple plot styles including size-based, color-based, gray-scale, and more.
  • Input flexibility: Accepts multiple pathway analysis result files in CSV format.
  • P-value thresholds: Highlight pathways that pass a significance threshold.
  • High-quality output: Generates publication-ready PDF plots.

Requirements

  • Python 3.7+
  • Required packages: matplotlib, numpy, pandas, scipy

You can install the required packages by running:

pip install -r requirements.txt

Usage

The tool processes input CSV files containing gene set enrichment analysis results and generates dot plots based on the selected options.

Input .csv files must have columns named pathway, padj, and NES (the standard fgsea output).

Example command:

python pathway_dotplots.py -indir /path/to/input/dir -outdir /path/to/output/dir --plot_type size --pvalue_threshold 0.05 --show  

Command-line options:

-indir: Path to the directory containing input files (in CSV format).
--input_file_mask: Mask for the input files (default: *.csv).
-outdir: Path to the directory where output plots will be saved.
--plot_type: Type of plot to generate.
Options:

  • size: Dot size will be proportional to -log10(pvalue).
  • invsize: Dot size will be proportional to the enrichment score.
  • gray: Dots with p-value greater than the threshold will be plotted in gray.
  • white: Dots with p-values below the threshold will be plotted in white.

--pathway_sort: Pathway sorting strategy.
Options:

  • cluster: Sort pathways using hierarchical clustering.
  • first_enrich: Sort pathways based on the first sample's enrichment score.
  • enrich_threshold: Sort pathways using enrichment score and p-value threshold.

--pvalue_threshold: Threshold for p-value (default: 0.1).
--nes_threshold: Threshold for abs(NES) (default: 0).
--show: If specified, the plot will be displayed in the GUI.
--imgname: Filename for the output plot (default: dot_plot).
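
The `cluster` sorting option described above can be illustrated with SciPy. This is a minimal sketch, not the tool's actual implementation: the function name `cluster_order`, the choice of average linkage on Euclidean distance, and the rule of filling missing scores with 0 are all assumptions made for illustration.

```python
import pandas as pd
from scipy.cluster.hierarchy import leaves_list, linkage


def cluster_order(nes: pd.DataFrame) -> list:
    """Return pathway names ordered by hierarchical clustering of their NES rows."""
    # Assumption: a pathway missing from a sample counts as "no enrichment".
    values = nes.fillna(0.0).to_numpy()
    if values.shape[0] < 2:
        return list(nes.index)  # nothing to cluster
    # Assumption: average linkage on Euclidean distance; the tool may use another method.
    tree = linkage(values, method="average", metric="euclidean")
    # leaves_list gives the left-to-right leaf order of the dendrogram.
    return list(nes.index[leaves_list(tree)])


# Toy NES matrix (made-up values): rows are pathways, columns are samples.
nes = pd.DataFrame(
    {"sample_A": [2.1, -1.8, 2.0, -1.9], "sample_B": [1.9, -2.2, 2.3, -1.7]},
    index=["P1", "P2", "P3", "P4"],
)
order = cluster_order(nes)
print(order)
```

Pathways with similar enrichment profiles (here P1 with P3, and P2 with P4) end up next to each other, which is what makes clustered dot plots easier to read.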

Input Format

Each input file should be a CSV containing pathway enrichment results for a specific sample. The following columns are required:

pathway: The name of the pathway.
NES: Normalized Enrichment Score (or any other enrichment score).
padj: Adjusted p-value.
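
As an illustration of the input format, the following sketch loads every matching CSV from a directory and checks for the three required columns. The helper name `load_results` is hypothetical, and deriving the sample name from the file name is an assumption; the tool itself may label samples differently.

```python
import glob
import os

import pandas as pd

REQUIRED_COLUMNS = ["pathway", "NES", "padj"]


def load_results(indir: str, mask: str = "*.csv"):
    """Load matching CSV files into pathway-by-sample NES and padj tables."""
    nes, padj = {}, {}
    for path in sorted(glob.glob(os.path.join(indir, mask))):
        table = pd.read_csv(path)
        missing = [c for c in REQUIRED_COLUMNS if c not in table.columns]
        if missing:
            raise ValueError(f"{path} is missing columns: {missing}")
        # Assumption: the sample name is the file name without its extension.
        sample = os.path.splitext(os.path.basename(path))[0]
        table = table.drop_duplicates("pathway").set_index("pathway")
        nes[sample] = table["NES"]
        padj[sample] = table["padj"]
    # Pathways absent from a sample become NaN in that sample's column.
    return pd.DataFrame(nes), pd.DataFrame(padj)
```

Calling `load_results("/path/to/input/dir")` would return two tables indexed by pathway, with one column per input file.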

Output

The tool generates a dot plot in PDF format for each sample, displaying pathways across samples, where the size or color of the dots reflects pathway enrichment or p-value.
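
To make the output description concrete, here is a minimal matplotlib sketch in the spirit of the `size` plot type: dot area follows -log10(padj) and dot color follows NES. The function names, the scaling factor, the p-value floor, and the colormap are assumptions chosen for illustration and are not taken from the tool's source.

```python
import matplotlib

matplotlib.use("Agg")  # render to file without needing a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def dot_sizes(padj: pd.DataFrame, scale: float = 60.0, floor: float = 1e-10) -> np.ndarray:
    """Map adjusted p-values to marker areas proportional to -log10(padj)."""
    clipped = padj.clip(lower=floor)  # avoid log10(0) for p-values reported as 0
    return scale * -np.log10(clipped.to_numpy())


def draw_dotplot(nes: pd.DataFrame, padj: pd.DataFrame, out_pdf: str) -> None:
    """Draw pathways (rows) against samples (columns) and save the figure as a PDF."""
    rows, cols = np.indices(nes.shape)
    sizes = dot_sizes(padj)
    scores = nes.to_numpy()
    keep = ~np.isnan(scores) & ~np.isnan(sizes)  # skip missing pathway/sample pairs
    fig, ax = plt.subplots(figsize=(2 + nes.shape[1], 1 + 0.4 * nes.shape[0]))
    points = ax.scatter(cols[keep], rows[keep], s=sizes[keep], c=scores[keep],
                        cmap="coolwarm", edgecolors="black", linewidths=0.5)
    ax.set_xticks(range(nes.shape[1]))
    ax.set_xticklabels(nes.columns, rotation=90)
    ax.set_yticks(range(nes.shape[0]))
    ax.set_yticklabels(nes.index)
    ax.set_xlim(-0.5, nes.shape[1] - 0.5)
    ax.set_ylim(-0.5, nes.shape[0] - 0.5)
    fig.colorbar(points, ax=ax, label="NES")
    fig.tight_layout()
    fig.savefig(out_pdf)
    plt.close(fig)
```

Given NES and padj tables with the same pathway index and sample columns, `draw_dotplot(nes, padj, "dot_plot.pdf")` writes a single PDF.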

Author

Developed by Andrey Loginov in the Gaykalova Lab.
