Installation Instructions

Daniel Portik edited this page Jan 28, 2021 · 22 revisions

Installing Dependencies

SuperCRUNCH requires several dependencies for full functionality, including Python packages and external programs. Installation instructions for all of them are provided below.

Required Packages

SuperCRUNCH uses several Python packages, including BioPython, numpy, and sqlite3 (sqlite3 is part of the Python standard library, so it does not need a separate install).

It also requires several external dependencies, including NCBI-BLAST+, CD-HIT-EST, MAFFT, Muscle, Clustal-O, MACSE, and trimAl.

Installation with conda

Installation of these requirements is easy and fast using conda. The supercrunch-conda-env.yml file can be used to create the correct conda environment:

conda env create -f supercrunch-conda-env.yml

The resulting conda environment can then be activated using:

conda activate supercrunch

You can then run all SuperCRUNCH modules in this environment.

NOTE: this will install all requirements except for MACSE, which is a jar file that must be downloaded separately (get v2.05). The location of this jar file must be specified when running Align.py with the macse option.
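Before pointing Align.py at the MACSE jar, it can help to confirm that the file is where you think it is. The following is a minimal sketch, not part of SuperCRUNCH: the function name check_macse_setup is hypothetical, and it assumes that a java executable in PATH is what will run the jar.

```python
import os
import shutil


def check_macse_setup(jar_path):
    """Return a list of problems found; an empty list means the basics look fine.

    jar_path is wherever you saved the MACSE jar (any path used here is
    your own choice, not something SuperCRUNCH dictates).
    """
    problems = []
    if not os.path.isfile(jar_path):
        problems.append(f"jar file not found: {jar_path}")
    elif not jar_path.lower().endswith(".jar"):
        problems.append(f"not a .jar file: {jar_path}")
    # Assumption: the jar is run with a java executable found in PATH.
    if shutil.which("java") is None:
        problems.append("java executable not found in PATH")
    return problems
```

Calling check_macse_setup with the path to your downloaded jar returns an empty list when the file exists, has a .jar extension, and java can be found.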

Manual Installation

Python packages

You can install BioPython and numpy using tools like pip or brew:

pip install biopython
pip install numpy

You can test that BioPython is installed correctly by first opening python on the command line, and then importing BioPython (the actual Python package is called Bio):

user$ python
Python 3.7.3 (default, Jun 19 2019, 07:38:49) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import Bio
>>>
>>> quit()

If BioPython is installed correctly, nothing obvious will happen on the screen. If it is not installed correctly, you will see an error:

>>> import Bio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'Bio'

The same check can be performed for the Python packages numpy and sqlite3, just to make sure:

user$ python
Python 3.7.3 (default, Jun 19 2019, 07:38:49) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> import sqlite3
>>> 
>>> quit()

If these checks are passed then everything should run properly.
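The interactive checks above can also be run in one step. This is a small sketch rather than anything shipped with SuperCRUNCH; the function name find_missing_modules is hypothetical, and it only tests whether each module can be located, not whether its version is suitable.

```python
import importlib.util

# Module names from this page; note that BioPython is imported as "Bio".
REQUIRED_MODULES = ("Bio", "numpy", "sqlite3")


def find_missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located by this Python."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required Python modules were found.")
```

Run it with the same python you intend to use for SuperCRUNCH, since each Python installation (or conda environment) has its own set of packages.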

External Dependencies

SuperCRUNCH also relies on several external dependencies, including:

  • NCBI-BLAST+

  • CD-HIT-EST

  • MAFFT

  • Muscle

  • Clustal-O

  • MACSE

  • trimAl

Download and installation instructions are available from each program's website. With the exception of MACSE, which is a jar file (Java executable), all of these programs must be in PATH for SuperCRUNCH to work properly. This can be accomplished by putting all the executables in one directory and adding that directory to PATH.

NOTE: I encountered problems when I installed the openmp version of cd-hit-est from source, which is used for automated clustering in Cluster_Blast_Extract.py. The clusters created were inconsistent and I could not replicate results across identical runs. This issue was resolved using the regular version of cd-hit-est, which was compiled using 'make openmp=no'. If you install from source I strongly recommend doing the same.
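If you want to check your own cd-hit-est build for this kind of inconsistency, one option is to run it twice on the same input and compare the resulting .clstr files. The sketch below is illustrative only: the function names are hypothetical, and it assumes the usual CD-HIT .clstr layout, in which each cluster starts with a ">Cluster N" line and each member line contains ">name...". Clusters are compared as sets of names, so cluster numbering and member order are ignored.

```python
import re

# Member lines look like: "0<TAB>1200nt, >seqA... *" (assumed .clstr layout).
MEMBER_NAME = re.compile(r">(.+?)\.\.\.")


def parse_clusters(clstr_text):
    """Return the clusters in a .clstr listing as a set of frozensets of names."""
    clusters = []
    for line in clstr_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            clusters.append(set())  # start a new cluster
            continue
        match = MEMBER_NAME.search(line)
        if match and clusters:
            clusters[-1].add(match.group(1))
    return {frozenset(members) for members in clusters if members}


def same_clustering(clstr_text_a, clstr_text_b):
    """True if both listings group the same sequences into the same clusters."""
    return parse_clusters(clstr_text_a) == parse_clusters(clstr_text_b)
```

To use it, read the two .clstr files from your repeated runs into strings and pass them to same_clustering; a False result means the two runs grouped the sequences differently.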

After installation, check that the relevant executables are in PATH and that they have the exact names listed below, which are case-sensitive (if not, rename them accordingly). That is, typing an executable name on the command line should run the corresponding program. Running each program with the -h option should also reveal its version number.

  • NCBI-BLAST+: blastn, makeblastdb; tested for version 2.10.1

  • CD-HIT-EST: cd-hit-est; tested for version 4.8.1

  • MAFFT: mafft; tested for version 7.475

  • Muscle: muscle; tested for version 3.8.31

  • Clustal-O: clustalo; tested for version 1.2.4

  • trimAl: trimal; tested for version 1.4.1 - 1.4.22

For example, try running:

user$ blastn -h
user$ makeblastdb -h
user$ cd-hit-est -h
user$ mafft -h
user$ muscle -h
user$ clustalo -h
user$ trimal -h

After each of the above commands, the program should run and display the help menu.
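The same PATH check can be done without launching each program. This is a minimal sketch using Python's standard shutil.which; the function name locate_executables is hypothetical. It only confirms that an executable with the expected name is found in PATH, not that it is a tested version.

```python
import shutil

# Executable names exactly as listed on this page (case-sensitive).
REQUIRED_EXECUTABLES = (
    "blastn", "makeblastdb", "cd-hit-est", "mafft",
    "muscle", "clustalo", "trimal",
)


def locate_executables(names=REQUIRED_EXECUTABLES):
    """Map each executable name to its resolved path, or None if not in PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_executables().items():
        print(f"{name:12s} {path if path else 'NOT FOUND in PATH'}")
```

Any program reported as not found either is not installed or lives in a directory that has not been added to PATH.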


Last updated: January, 2021