Pathway-Dotplots is a Python-based tool for generating dot plots from pathway analysis results, with hierarchical clustering support and several plot customization options. It is primarily intended for visualizing the enrichment of biological pathways across samples, using statistics such as adjusted p-values and normalized enrichment scores (NES).
- Hierarchical clustering: Cluster pathways based on similarity in enrichment scores.
- Customizable dot plots: Multiple plot styles, including size-based, color-based, and gray-scale.
- Input flexibility: Accepts multiple pathway analysis result files in CSV format.
- P-value thresholds: Highlight pathways that pass a significance threshold.
- High-quality output: Generates publication-ready PDF plots.
- Python 3.7+
- Required packages: matplotlib, numpy, pandas, scipy
You can install the required packages by running:
pip install -r requirements.txt
The tool processes input CSV files containing gene set enrichment analysis results and generates dot plots based on the selected options.
Input .csv files must have columns named pathway, padj, and NES (the column names used in standard fgsea output).
python pathway_dotplots.py -indir /path/to/input/dir -outdir /path/to/output/dir --plot_type size --pvalue_threshold 0.05 --show
-indir: Path to the directory containing input files (in CSV format).
--input_file_mask: Mask for the input files (default: *.csv).
-outdir: Path to the directory where output plots will be saved.
--plot_type: Type of plot to generate.
Options:
- size: Dot size will be proportional to -log10(pvalue).
- invsize: Dot size will be proportional to the enrichment score.
- gray: Dots with p-value greater than the threshold will be plotted in gray.
- white: Dots with p-values below the threshold will be plotted in white.
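The size and gray styles above can be sketched as two small mapping functions. This is a minimal illustration and not the tool's actual code: `dot_area`, `dot_color`, the `scale` factor, the `floor` clamp, and the base color are all hypothetical names and values chosen for the example. Only the -log10 relationship and the 0.1 default threshold come from this README.

```python
import math


def dot_area(padj, scale=40.0, floor=1e-300):
    """Marker area proportional to -log10(padj), as in the `size` style.

    `scale` and `floor` are illustrative assumptions, not options of the tool.
    """
    # Clamp the p-value so that a value of exactly 0 does not break log10.
    return scale * -math.log10(max(padj, floor))


def dot_color(padj, threshold=0.1, base="tab:red"):
    """Gray for dots whose p-value exceeds the threshold, as in the `gray` style."""
    return "gray" if padj > threshold else base


# Made-up adjusted p-values, only to show how the mapping behaves.
for padj in (0.5, 0.05, 0.001):
    print(padj, round(dot_area(padj), 1), dot_color(padj))
```

Values like these could be passed to the `s` and `c` arguments of matplotlib's `scatter`, which interprets `s` as a marker area in points squared.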
--pathway_sort: Pathway sorting strategy.
Options:
- cluster: Sort pathways using hierarchical clustering.
- first_enrich: Sort pathways based on the first sample's enrichment score.
- enrich_threshold: Sort pathways using enrichment score and p-value threshold.
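To illustrate the cluster strategy, here is a minimal sketch of ordering pathways by hierarchical clustering with scipy. The linkage method, the distance metric, and the NES values are assumptions made for the example; the README does not state which settings the tool uses.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage

# Rows are pathways, columns are samples. The NES values are made up.
pathways = ["A", "B", "C", "D"]
nes = np.array([
    [2.0, 1.8, 2.1],
    [-1.5, -1.7, -1.4],
    [1.9, 2.0, 1.7],
    [-1.6, -1.2, -1.8],
])

# Average linkage on Euclidean distance is an assumed choice for illustration.
tree = linkage(nes, method="average", metric="euclidean")

# leaves_list returns the row indices in dendrogram (left to right) order,
# so pathways with similar NES profiles end up next to each other.
order = leaves_list(tree)
sorted_pathways = [pathways[i] for i in order]
print(sorted_pathways)
```

In this toy matrix, A and C share a positive profile and B and D share a negative one, so each pair ends up adjacent on the plot axis.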
--pvalue_threshold: Threshold for p-value (default: 0.1).
--nes_threshold: Threshold for abs(NES) (default: 0).
--show: If specified, the plot will be displayed in the GUI.
--imgname: Filename for the output plot (default: dot_plot).
Input Format
Each input file should be a CSV containing pathway enrichment results for a specific sample. The following columns are required:
pathway: The name of the pathway.
NES: Normalized Enrichment Score (or any other enrichment score).
padj: Adjusted p-value.
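A quick way to check that a file has the required columns listed above before running the tool is sketched below, using only the standard library. `read_enrichment_rows` is a hypothetical helper written for this example, not a function of pathway_dotplots.py, and the CSV content is made up.

```python
import csv
import io

REQUIRED_COLUMNS = {"pathway", "NES", "padj"}


def read_enrichment_rows(handle):
    """Parse one results file and return its rows as a list of dicts.

    Hypothetical helper for illustration; raises ValueError when a
    required column is missing.
    """
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    rows = []
    for row in reader:
        # Extra columns (for example pval) are ignored.
        rows.append({
            "pathway": row["pathway"],
            "NES": float(row["NES"]),
            "padj": float(row["padj"]),
        })
    return rows


# Made-up example content; a real file would come from fgsea or a similar tool.
example_csv = io.StringIO(
    "pathway,pval,padj,NES\n"
    "HALLMARK_APOPTOSIS,0.001,0.01,1.8\n"
    "HALLMARK_HYPOXIA,0.2,0.4,-0.9\n"
)
rows = read_enrichment_rows(example_csv)
print(rows[0])
```

For a file on disk, the same helper could be called with `open(path, newline="")` in place of the in-memory example.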
The tool saves the dot plot as a PDF file in the output directory. The plot displays pathways across samples, with the size or color of each dot reflecting pathway enrichment or p-value.
Developed by Andrey Loginov in the Gaykalova Lab.