P2Rank deploy with Docker

Lukas Polak edited this page Jul 3, 2024 · 8 revisions

What you will learn here:

  • how to run p2rank locally using a Docker image

Prerequisites

  • Install Docker. You can check that it is installed and running using the command:
    docker info
  • WARNING: This build will most likely not work on Apple M1 processors, because the HMMER software does not support this processor architecture. See this issue.
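
The two prerequisites can be checked together before pulling or building anything. The following is only a sketch: the helper names `arch_status` and `docker_status` are made up for illustration, and treating every `arm64`/`aarch64` host as affected is an assumption derived from the M1 warning above, not something the project states.

```shell
#!/bin/sh
# Hypothetical helper: classify a CPU architecture string as printed by `uname -m`.
# Assumption: arm64/aarch64 hosts (such as Apple M1) are the ones hit by the HMMER issue.
arch_status() {
    case "$1" in
        arm64|aarch64) echo "likely-unsupported" ;;
        *)             echo "ok" ;;
    esac
}

# Hypothetical helper: report whether the docker CLI is on PATH and the daemon answers.
docker_status() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "missing"
    elif ! docker info >/dev/null 2>&1; then
        echo "not-running"
    else
        echo "ok"
    fi
}

echo "architecture: $(uname -m) ($(arch_status "$(uname -m)"))"
echo "docker: $(docker_status)"
```

If the second line reports anything other than `ok`, install or start Docker before continuing.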

Using the pre-built version

You can use the pre-built version of the Docker image. In this version the user is set to 5988:5988. This is usually not a problem when you run the container from Windows. On Linux you need to make sure that the user 5988:5988 can write to the shared directories. To use the pre-built version, replace executor-p2rank in all commands with ghcr.io/cusbg/executor-p2rank:main. If you want to build the image yourself, or you need to change the user, please refer to the Building from source section.
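
On Linux, a quick way to see whether user 5988:5988 could write to a directory you intend to share is to inspect its owner and permission bits. This is a rough sketch, not part of the project: `writable_by` is a hypothetical helper, it assumes GNU `stat`, and it ignores ACLs and the permissions of parent directories.

```shell
# Hypothetical helper: succeed when directory DIR is writable for user UID:GID
# according to the classic owner/group/other permission bits.
# Assumptions: GNU stat (Linux); ACLs and parent directory permissions are ignored.
writable_by() {
    dir=$1
    uid=$2
    gid=$3
    # %u = owner UID, %g = owner GID, %a = permission bits in octal (e.g. 755).
    info=$(stat -c '%u %g %a' "$dir") || return 2
    set -- $info
    owner=$1
    group=$2
    mode=$(( 0$3 ))   # the leading 0 makes the shell read the value as octal

    if [ "$owner" -eq "$uid" ]; then
        bit=128       # owner write bit (octal 0200)
    elif [ "$group" -eq "$gid" ]; then
        bit=16        # group write bit (octal 0020)
    else
        bit=2         # others write bit (octal 0002)
    fi
    [ $(( mode & bit )) -ne 0 ]
}
```

For example, `writable_by "$(pwd)" 5988 5988 || echo "user 5988:5988 cannot write here"` flags a directory whose ownership or mode you may want to adjust before starting the container.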

Note: If you are working on a Windows machine with WSL, we advise using Docker from Windows instead of from WSL.

For example, you can start the Docker image and store all data in the current directory.

# Linux Shell 
docker run --pull=always -v $(pwd):/data/conservation/ -v $(pwd):/data/host -it --rm ghcr.io/cusbg/executor-p2rank:main
# Windows PowerShell
docker run --pull=always -v ${pwd}:/data/conservation/ -v ${pwd}:/data/host -it --rm ghcr.io/cusbg/executor-p2rank:main

After the download, Docker starts the container. You can now prepare conservation; keep in mind that you need enough disk space (see the Prepare conservation section).

/opt/hmm-based-conservation/download_database.py

Now you have the image ready and you can run predictions from a PDB code or a custom file.

Building from source

  • Clone the git repository into an empty directory.
    git clone https://github.com/cusbg/prankweb.git .
  • Build the executor-p2rank Docker image. The image needs a user to run under. You can either use the default user 5988:5988 or specify a custom user. If you plan to run under Linux, you probably want to specify a custom user; under Windows the default user should work fine. Choose one of the following commands accordingly.
    # With default user, recommended for Windows users.
    docker build -t executor-p2rank -f ./executor-p2rank/Dockerfile .
    
    # With custom user, recommended for Linux users.
    docker build --build-arg UID=$(id -u ${USER}) --build-arg GID=$(id -g ${USER}) -t executor-p2rank -f ./executor-p2rank/Dockerfile .

Prepare conservation

This step is optional; it is required only if you plan to use conservation. This tutorial covers the HMMER-based conservation pipeline.

  • First create a local directory to host the conservation database. You need around 30 GB of free space. We denote the absolute path to this directory as {PATH-TO-CONSERVATION-DIRECTORY}.
  • Now start the p2rank Docker image:
    docker run -v {PATH-TO-CONSERVATION-DIRECTORY}:/data/conservation/ -it --rm executor-p2rank
    Now you are inside the container.
  • Run the script that downloads the conservation database. The command downloads a 9.2 GB file, so please make sure you have a stable and fast internet connection.
    /opt/hmm-based-conservation/download_database.py
  • You may now exit the container using the exit command. Alternatively, you can run some predictions.
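
Because the database needs around 30 GB, it can be worth checking the free space on the host before starting the download. A minimal sketch, run on the host rather than inside the container: `has_free_space` is a hypothetical helper, and it counts in units of 1024 × 1024 KiB, which only approximates the 30 GB figure quoted above.

```shell
# Hypothetical helper: succeed when DIR has at least MIN_GB (approximate) gigabytes free.
has_free_space() {
    dir=$1
    min_gb=$2
    # df -P prints POSIX-format output and -k reports 1024-byte blocks;
    # line 2, column 4 holds the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $min_gb * 1024 * 1024 )) ]
}

# Check the current directory; pass your conservation directory instead.
if has_free_space . 30; then
    echo "enough free space for the conservation database"
else
    echo "less than about 30 GB free"
fi
```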

Run the p2rank Docker container

Now you are ready to enter the p2rank Docker container. The easiest way is to mount the current directory into the container, so that local data can be shared between the Docker container and your host computer. Select and execute one of the following commands.

To run without conservation stored in the host file system:

# Linux Shell.
docker run -v $(pwd):/data/host -it --rm executor-p2rank
# Windows PowerShell
docker run -v ${pwd}:/data/host -it --rm executor-p2rank

To run with conservation stored in the host file system, replace {PATH-TO-CONSERVATION-DIRECTORY} with the absolute path to the conservation directory.

# Linux Shell.
docker run -v {PATH-TO-CONSERVATION-DIRECTORY}:/data/conservation/ -v $(pwd):/data/host -it --rm executor-p2rank 
# Windows PowerShell
docker run -v {PATH-TO-CONSERVATION-DIRECTORY}:/data/conservation/ -v ${pwd}:/data/host -it --rm executor-p2rank

You can find your data in the /data/host directory inside the container. If you are running from Linux, keep in mind that file system access rights apply.

You may now proceed with running predictions as described in the following sections. Once you are done, you can exit the container using the exit command.
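
If you switch between the locally built image and the pre-built one, a small helper that assembles the command can save some retyping. This is a sketch for a Linux shell only: the `IMAGE` variable and the `p2rank_command` function are hypothetical names, the function only prints the command instead of running it, and paths containing spaces are not handled.

```shell
# Image to run: the locally built tag by default. Set
# IMAGE=ghcr.io/cusbg/executor-p2rank:main to use the pre-built image.
IMAGE="${IMAGE:-executor-p2rank}"

# Hypothetical helper: print the docker run command for the current directory.
# Pass the absolute path of the conservation directory as the first argument
# to mount it as well.
p2rank_command() {
    cmd="docker run"
    if [ -n "${1:-}" ]; then
        cmd="$cmd -v $1:/data/conservation/"
    fi
    echo "$cmd -v $(pwd):/data/host -it --rm $IMAGE"
}
```

Calling `p2rank_command` with no argument prints the variant without conservation. Review the printed command, then paste it into your shell to start the container.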

Run prediction for PDB ID

  • You should be inside a running Docker container.
  • Navigate to the shared data folder.
    cd /data/host
  • Start the prediction. Remove the --conservation argument if you do not want to use conservation.
    /opt/executor-p2rank/run_p2rank.py --pdb-code 2SRC --output ./2SRC-prediction --conservation
    Once finished, you can find your prediction in the 2SRC-prediction directory.
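
To predict several structures in one go, the same command can be wrapped in a loop inside the container. A minimal sketch under stated assumptions: `predict_all` and the `RUNNER` variable are hypothetical names, conservation is left out, and the loop stops at the first failing prediction.

```shell
# Assumption: RUNNER defaults to the script path used above;
# set RUNNER=echo to print the arguments instead of running predictions.
RUNNER="${RUNNER:-/opt/executor-p2rank/run_p2rank.py}"

# Hypothetical helper: run one prediction per PDB code given as an argument,
# writing each result to ./<CODE>-prediction.
predict_all() {
    for code in "$@"; do
        "$RUNNER" --pdb-code "$code" --output "./${code}-prediction" || return 1
    done
}
```

For example, `predict_all 2SRC 1CRN` would create the 2SRC-prediction and 1CRN-prediction directories; 1CRN is used here only as a second example code.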

Run prediction for a structure file

  • Make sure the structure file is in the shared directory, that is, the one from which you started the Docker container. We denote the relative path to the structure file as {STRUCTURE}.
  • Navigate to the shared data folder.
    cd /data/host
    You should see your file here.
  • Start the prediction. Remove the --conservation argument if you do not want to use conservation.
    /opt/executor-p2rank/run_p2rank.py --file {STRUCTURE} --output ./prediction --conservation