Computing
To get access to the computing cluster, send Nathan Skene an email with your username and he will add you.
The Imperial CX1 cluster uses PBS as a job manager. PBS has many versions and it cannot necessarily be assumed that a function you find in an online manual will work exactly as described there. The functions which work on the cluster are best found by typing man qstat while logged in to an interactive session.
There is a weekly HPC clinic. I strongly recommend making use of this. You can turn up and experts will help you. Even if you are just unsure about something, go and speak to them. They are held in South Kensington but it's well worth going.
Imperial regularly runs a beginner's guide to high performance computing course. If you have not previously used HPC you'll want to register for this as soon as possible. This can be done through their website:
Imperial also runs a course on software carpentry. If you are unfamiliar with usage of Git and Linux then you should take this course.
Combiz wrote useful notes on using the HPC (how to login etc):
A version of RStudio is installed on the computing cluster and can be accessed through your browser. This gives you access to a 24 core machine and is probably better than programming on your laptop.
Rather than entering your password every time you log in with ssh, you can set up key-based authentication once using ssh-keygen and ssh-copy-id:
ssh-copy-id <username>@login.hpc.ic.ac.uk
This will ask you for your HPC password. Afterwards you will no longer have to enter your password from your local computer. For a more detailed explanation, see here.
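If you do not yet have a key pair on your local machine, generate one before running ssh-copy-id. A minimal sketch (the RSA key type is an assumption, chosen to match the IdentityFile used in the ssh config further down):
ssh-keygen -t rsa -b 4096   # accept the default location (~/.ssh/id_rsa); a passphrase is optional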
If you want an R package installed and there's something that needs root permissions to set up, then request that it be installed via ASK.
You might want to try joining RCS Slack and raising the issue there.
Before doing either of these, you might want to check if the software is listed under module avail. If it is, you might be able to access it within RStudio using the RLinuxModules package. Another option is to set the software up within conda, but that won't help with RStudio.
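For example, to check whether a tool is already provided as a module and then load it into your current session (the tool name here is purely illustrative):
module avail 2>&1 | grep -i samtools   # module avail prints to stderr, hence the redirect
module load samtools                   # load the module into the current shell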
To create an interactive session on the cluster (to avoid overloading the login nodes) use the following command
qsub -I -l select=01:ncpus=8:mem=96gb -l walltime=08:00:00
That command requests the most resources that can be obtained for interactive jobs; decrease these if you can. On the main queue it will take a long time to submit. If I've given you access to the med-bio queue (ask) then you will be better off using the following:
qsub -I -l select=01:ncpus=2:mem=8gb -l walltime=01:00:00 -q med-bio
VPN into the network, then connect with
ssh YOURUSERNAME@login.cx1.hpc.imperial.ac.uk
To set up ssh on your computer for accessing the cluster, add the following to ~/.ssh/config:
Host *
    AddKeysToAgent yes
    IdentityFile ~/.ssh/id_rsa

Host imperial
    User nskene
    AddKeysToAgent yes
    HostName login.cx1.hpc.imperial.ac.uk
    ForwardX11Trusted yes
    ForwardX11 yes
    HostKeyAlgorithms=+ssh-dss

Host imperial-7
    User nskene
    AddKeysToAgent yes
    HostName login-7.cx1.hpc.imperial.ac.uk
    ForwardX11Trusted yes
    ForwardX11 yes
    HostKeyAlgorithms=+ssh-dss
If you use imperial-7 to login then you'll always connect to the same login node which makes using screen/tmux easier.
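With that config in place you can connect using just the alias, e.g.:
ssh imperial-7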
We have two shared project spaces on the cluster. If you are involved in the DRI Multiomics Atlas project then use projects/ukdrimultiomicsprojects/. Otherwise, please use projects/neurogenomics-lab.
You will not be able to write into the main directory of either of these. They have two folders: live and ephemeral. Read about the differences between these here:
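For example, to move into the lab's shared live folder (the /rds/general/projects/ prefix here is an assumption based on the user paths used elsewhere on this page; check the exact path once you have access):
cd /rds/general/projects/neurogenomics-lab/live/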
The MedBio cluster has additional computational resources and is accessed via a separate queue. Read about it here: https://www.imperial.ac.uk/bioinformatics-data-science-group/resources/uk-med-bio/
We have access to it but I do not have admin rights to grant access to individuals.
To get access it was previously necessary to email p.blakeley@imperial.ac.uk but he is no longer at Imperial, and the access management doesn't seem to have been cleared up. Another contact email is medbio-help@imperial.ac.uk. Most recently, Brian got access by emailing m.futschik@imperial.ac.uk.
Nathan can access a list of users that currently have admin rights over medbio through the self-service portal. Ask him to check who is on there currently, and then contact one of them. Currently Abbas Dehghan seems like a good candidate.
To run on the med bio cluster, just put this at the end of your submit commands: -q med-bio
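A complete, non-interactive submission to that queue would then look something like the following; the resource requests and script name are placeholders:
qsub -l select=1:ncpus=8:mem=32gb -l walltime=08:00:00 -q med-bio run_analysis.pbs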
It can take a long time to get jobs through the queue on the cluster. We can pay for express access to get jobs run faster; details are here: https://www.imperial.ac.uk/admin-services/ict/self-service/research-support/rcs/computing/express-access/. I would rather pay for faster results if this is slowing you down, so let me know if it would be useful and I'll add you to the list. Express jobs are submitted with qsub -q express -P exp-XXXXX, substituting your express account code.
You'll need to use singularity to run docker containers on the HPC. To run a rocker R container in interactive mode, run the following, substituting your username where appropriate:
mkdir -p /rds/general/user/$USER/ephemeral/tmp/ /rds/general/user/$USER/ephemeral/rtmp/
singularity exec -B /rds/general/user/$USER/ephemeral/tmp/:/tmp,/rds/general/user/$USER/ephemeral/tmp/:/var/tmp,/rds/general/user/$USER/ephemeral/rtmp/:/usr/local/lib/R/site-library/ --writable-tmpfs docker://rocker/tidyverse:latest R
To create a Singularity image, first archive the Docker image into a tar file. Obtain the IMAGE_ID with docker images, then archive it (substituting your IMAGE_ID):
docker save 409ad1cbd54c -o singlecell.tar
On a system running singularity-container (>v3), e.g. the HPC cluster, generate the Singularity Image File (SIF) from the local tar file with:
/usr/bin/singularity build singlecell.sif docker-archive://singlecell.tar
This singlecell.sif Singularity Image File is now ready to use.
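You can then run commands inside the built image in the same way as with the docker:// examples below, e.g.:
singularity exec singlecell.sif R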
On HPC, Rocker containers can be run through Singularity with a single command much like the native Docker commands, e.g.
singularity exec docker://rocker/tidyverse:latest R
By default singularity bind mounts /home/$USER, /tmp, and $PWD into your container at runtime.
More info is available here.
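If your data live outside those default locations you can add extra bind mounts with -B, for example (the /mnt target is arbitrary):
singularity exec -B /rds/general/user/$USER/ephemeral:/mnt docker://rocker/tidyverse:latest R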
If you go to this link, the Compute tab lists your jobs. For some jobs you can extend the walltime.
I recommend using TeamViewer to connect. Set the computer up as a saved workstation and remote use becomes almost as easy as sitting at it.