GNNsuite

We design and create a framework for benchmarking and comparing Graph Neural Network (GNN) architectures, implemented in a robust and reproducible way using Nextflow, a scientific workflow system popular with computational biologists. We include support for nine different GNN architectures on binary node classification tasks. To demonstrate the versatility of our framework, we consider a task of significant biological importance: identifying cancer-driver genes (CDG) in a protein-protein interaction (PPI) network. Data were sourced from the Pan-Cancer Analysis of Whole Genomes (PCAWG), the Pathway Indicated Drivers (PID), the COSMIC Cancer Gene Census (COSMIC-CGC), and the STRING and BioGRID PPI databases. On this task, GNNs were able to make effective use of the network structure of the data. Nevertheless, the different architectures performed remarkably similarly, emphasising the importance of the quality of the training data for such tasks. We make our pipeline publicly available to enable other researchers to perform similar investigations in other areas of computational biology. We believe that this will lead to improved benchmarking standards in the GNN literature.

Models

The following models are included:

Graph Convolutional Networks (GCN)
Graph Attention Networks (GAT)
Hierarchical Graph Convolutional Networks (HGCN)
Parallel Hierarchical Graph Convolutional Networks (PHGCN)
Graph SAmpling and aggreGatE (GraphSAGE)
Graph Transformer Networks (GTN)
Graph Isomorphism Networks (GIN)
Graph Convolutional Networks II (GCNII)

Architecture

Running the workflow

Install or update the workflow

nextflow pull stracquadaniolab/gnn-suite

Run a test

nextflow run stracquadaniolab/gnn-suite -profile docker,test

Run an experiment

nextflow run stracquadaniolab/gnn-suite -profile docker,<experiment_file>

The results of the experiment will be stored in the results/data/<experiment_file>/ and results/figures/<experiment_file>/ directories.
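To check what a run produced, the two output directories can be listed with a few lines of Python. This is a minimal sketch and not part of the pipeline: the helper name `list_outputs` and the example profile name `string` are hypothetical, and only the directory layout is taken from the paragraph above. No assumption is made about which files the pipeline writes.

```python
from pathlib import Path


def list_outputs(experiment, results_dir="results"):
    """Return the file names found under the data and figures folders of one experiment."""
    found = {}
    for kind in ("data", "figures"):
        folder = Path(results_dir) / kind / experiment
        if folder.is_dir():
            found[kind] = sorted(p.name for p in folder.iterdir() if p.is_file())
        else:
            # A missing folder yields an empty list (e.g. the run has not finished yet).
            found[kind] = []
    return found


if __name__ == "__main__":
    # "string" is a placeholder profile name; substitute your own <experiment_file>.
    for kind, names in list_outputs("string").items():
        print(kind, names)
```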

For more information on Nextflow, you can visit the official documentation at nextflow.io/docs.

Docker Image

The gnn-suite Docker image is available on GitHub Container Registry. You can also download it using:

docker pull ghcr.io/stracquadaniolab/gnn-suite:latest

Adding a New Experiment

Create a Config File: Create a new configuration file <experiment_file>.config with the parameters for the experiment:

// profile to test the string workflow
params {
  resultsDir = "${baseDir}/results/"
  networkFile = "${baseDir}/data/<network_file>.tsv"
  geneFile = "${baseDir}/data/<feature_file>.csv"
  epochs = [300]
  models = ["gcn2", "gcn", "gat", "gat3h", "hgcn", "phgcn", "sage", "gin", "gtn"]
  replicates = 10
  verbose_interval = 1
  dropout = 0.2
  alpha = 0.1
  theta = 1
  dataSet = "<experiment_file_tag>"
}

Update base.config: Add a new profile for your experiment in base.config:

profiles {
  // existing profiles...

  // profile for the new experiment; the name must match the one passed to -profile
  <experiment_file> {
    includeConfig '<experiment_file>.config'
  }
}

Run the Experiment: Execute the pipeline with the new profile using:

nextflow run main.nf -profile docker,<experiment_file>

or

nextflow run stracquadaniolab/gnn-suite -profile docker,<experiment_file>

Adding a New Model

Create Model: Implement the new model class in models.py:

import torch


class NewModel(torch.nn.Module):
    def __init__(self, num_features, num_classes, num_hidden=16, num_layers=2, dropout=0.5):
        super().__init__()
        # Define layers here

    def forward(self, data):
        # Define the forward pass here and return the model output
        raise NotImplementedError

Import Model: Add your model to the imports in gnn.py:
```
from models import GCN, GAT, ..., NewModel
```

Update build_model: Add your model to the build_model function in gnn.py:

elif name == "new_model":
    return NewModel(num_features, num_classes, dropout=dropout)
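
For orientation, the branch above could sit in a dispatch function shaped roughly like the following. This is a sketch under assumptions: the real signature and body of `build_model` in gnn.py are not reproduced here, and `GCN` and `NewModel` are replaced by hypothetical stand-in classes so that the snippet runs without PyTorch.

```python
class _Stub:
    """Stand-in for a torch.nn.Module subclass; it only records its arguments."""

    def __init__(self, num_features, num_classes, dropout=0.5):
        self.num_features = num_features
        self.num_classes = num_classes
        self.dropout = dropout


class GCN(_Stub):
    pass


class NewModel(_Stub):
    pass


def build_model(name, num_features, num_classes, dropout=0.5):
    """Map a model name from the experiment config to a model instance."""
    if name == "gcn":
        return GCN(num_features, num_classes, dropout=dropout)
    elif name == "new_model":
        return NewModel(num_features, num_classes, dropout=dropout)
    # Assumption: failing loudly on an unknown name, to catch typos in the config.
    raise ValueError(f"unknown model name: {name!r}")


model = build_model("new_model", num_features=8, num_classes=2, dropout=0.2)
print(type(model).__name__)
```

Raising on an unrecognised name is a design choice of this sketch, not necessarily what gnn.py does.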

Include in Experiment: Add the new model name to the models list in your experiment config (<experiment_file>.config):
```
models = ["gcn", "gat", ..., "new_model"]
```

Hyperparameter Optimization with Optuna

To run the hyperparameter optimization workflow, which uses Optuna and is defined in hyperopt.py, use the hyperopt entry point:

nextflow run main.nf -profile docker,<experiment_file> -entry hyperopt

The results of the search will be stored in the results/hyperparameters/<experiment_file>/ directory. You can find the best trial information in the best_trial_<model>_<experiment>.txt file.
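
Once a search has finished, the best-trial files can be collected with a short script. The sketch below relies only on the directory and file-name pattern stated above; it makes no assumption about what each file contains and simply returns the raw text. The function name `read_best_trials` and the profile name `string` are hypothetical.

```python
from pathlib import Path


def read_best_trials(experiment, results_dir="results"):
    """Return {file name: raw text} for every best_trial_*.txt of one experiment."""
    folder = Path(results_dir) / "hyperparameters" / experiment
    # glob() yields nothing if the folder does not exist, so the result is then empty.
    return {p.name: p.read_text() for p in sorted(folder.glob("best_trial_*.txt"))}


if __name__ == "__main__":
    # "string" is a placeholder profile name; substitute your own <experiment_file>.
    for name, text in read_best_trials("string").items():
        print(f"== {name} ==")
        print(text)
```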

For more information on Optuna, you can visit the official documentation at https://optuna.readthedocs.io.

FAQ

If you encounter the following error message when attempting to execute the script:

Command error:
  .command.sh: line 2: ../gnn-suite/bin/plot.py: Permission denied

You need to grant execute permission to the affected Python scripts. For example, for plot.py:

 chmod +x /home/<path>/code/gnn-suite/bin/plot.py
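
If several scripts are affected, the same fix can be applied to every Python script in bin/ at once. The following is a minimal sketch and not part of the pipeline; the function name `make_executable` is hypothetical, and the path must be adjusted to your checkout. It keeps the existing permission bits and adds execute for user, group and others, comparable to `chmod a+x`.

```python
import stat
from pathlib import Path


def make_executable(bin_dir):
    """Add execute permission to every .py file directly inside bin_dir."""
    changed = []
    for script in sorted(Path(bin_dir).glob("*.py")):
        mode = script.stat().st_mode
        # Keep the existing bits and add execute for user, group and others.
        script.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        changed.append(script.name)
    return changed


if __name__ == "__main__":
    # Path is relative to the repository root; adjust as needed.
    print(make_executable("bin"))
```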

Paper

Forthcoming

Authors

Sebestyén Kamp
Ian Simpson
Giovanni Stracquadanio
