Hello
JAMg has a large number of dependencies, but almost all are provided under 3rd_party so you don't have to go looking.
Everything there should be built automatically by the make command. First, however, you need to install a few dependencies yourself.
Installation is straightforward on a Linux system such as Ubuntu. If you're using OS X then you're on your own,
as I have never had a Mac. However, if you manage to install it and are willing to help others, let me know.
In practice, you do not need to execute anything as root as long as you are able to install the CPAN modules listed below as a user.
Almost all third-party software is provided in 3rd_party and will be installed as the user within the JAMg directory.
The only one not installed is EMBOSS, which is complicated to install in user space (and is useful to have system-wide
anyway).
= Variables
Run the env.sh script to generate the environment variables for installing/using JAMg.
$ bash env.sh
$ source env.source
You can copy and change env.source if you prefer to annotate another genome, but always run 'source env.source'
before using any JAMg pipeline software.
Also, in these documents, $JAMG_PATH means your full JAMg path (i.e. this directory).
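A minimal sketch of a guard you could run before a pipeline to confirm the environment has been sourced. It assumes env.source exports JAMG_PATH (the spelling used in the commands in this file); check_jamg_env is a hypothetical helper and not part of JAMg.

```shell
# Sketch: confirm the JAMg environment is loaded before running a pipeline.
# check_jamg_env is a hypothetical helper, not part of JAMg.
check_jamg_env() {
    if [ -z "${JAMG_PATH:-}" ]; then
        echo "JAMG_PATH is not set; run 'source env.source' first" >&2
        return 1
    fi
    if [ ! -d "$JAMG_PATH" ]; then
        echo "JAMG_PATH does not point to a directory: $JAMG_PATH" >&2
        return 1
    fi
    echo "JAMG_PATH=$JAMG_PATH"
}

# Usage:
#   source env.source
#   check_jamg_env || exit 1
```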
= Perl Dependencies
The following perl modules are required.
* Pod::Usage
* Data::Dumper
* Getopt::Long
* List::Util
* Digest::MD5
* DBI
* DBD::mysql
* IO::File
* LWP::Simple and BioPerl (only if you want to download NCBI sequences)
* Statistics::Descriptive
* Storable
* threads and threads::shared
* Carp
For GeneMark:
* Logger::Simple
* Parallel::ForkManager
* Hash::Merge
You can install via CPAN:
$ cpan Pod::Usage Data::Dumper Getopt::Long List::Util Digest::MD5 DBI DBD::mysql \
IO::File LWP::Simple Statistics::Descriptive Storable threads Carp threads::shared
For GeneMark:
$ cpan Logger::Simple Parallel::ForkManager Hash::Merge
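To see which of the modules are still missing after running cpan, something like the following may help. It relies only on perl's -M switch, which fails when a module cannot be loaded; missing_perl_modules is a hypothetical helper name and not part of JAMg.

```shell
# Sketch: list required Perl modules that cannot be loaded.
# missing_perl_modules is a hypothetical helper, not part of JAMg.
missing_perl_modules() {
    for mod in "$@"; do
        # perl -M<module> -e 1 exits non-zero when the module cannot be loaded
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "$mod"
        fi
    done
}

# Usage (module names taken from the lists in this section):
#   missing_perl_modules Pod::Usage Data::Dumper Getopt::Long List::Util \
#       Digest::MD5 DBI DBD::mysql IO::File Statistics::Descriptive \
#       Storable threads threads::shared Carp
```

An empty result means every module named on the command line loaded successfully.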
Note that you don't have to do this as the system administrator: you can use cpan as a normal user.
Just make sure it is set up correctly (run cpan by itself and follow the prompts).
That is how we run things on some servers that are managed Australia-wide rather than in-house.
I recommend that you have a CPAN version of at least 2.0. Ask your system administrator to install the bundle as root (if possible),
as this may save you considerable trouble later on. If they refuse, you can still install it as a normal user:
$ cpan Bundle::CPAN
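One way to check the recommended minimum is to read the version from CPAN.pm and compare it numerically. This is only a sketch: cpan_version_ok is a hypothetical helper, and it treats the version as a plain decimal number.

```shell
# Sketch: compare a CPAN.pm version string against the recommended 2.0.
# cpan_version_ok is a hypothetical helper, not part of JAMg.
cpan_version_ok() {
    # awk compares the values numerically; the +0 forces numeric context
    awk -v have="$1" -v need="2.0" 'BEGIN { exit !((have + 0) >= (need + 0)) }'
}

# Usage: read the installed version from Perl, then test it.
#   have=$(perl -MCPAN -e 'print $CPAN::VERSION')
#   cpan_version_ok "$have" || echo "CPAN $have is older than 2.0"
```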
= GeneMark
If you intend to use GeneMark, you have to download and install it manually from:
http://exon.gatech.edu/genemark/license_download.cgi
For convenience, you may place the files in $JAMG_PATH/3rd_party/genemark so that you can execute this file:
$JAMG_PATH/3rd_party/genemark/gmes_petap.pl
You will need to get your key and copy it as per the INSTALL directions.
You will also need to install these Perl libraries:
$ cpan YAML Hash::Merge Logger::Simple Parallel::ForkManager
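After the manual install, a quick check that the driver script is in place and executable could look like the sketch below. check_genemark is a hypothetical helper and not part of JAMg or GeneMark; it does not check the licence key.

```shell
# Sketch: verify that the GeneMark driver script is in the given directory.
# check_genemark is a hypothetical helper; pass the genemark directory.
check_genemark() {
    script="$1/gmes_petap.pl"
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable (try chmod +x): $script" >&2
        return 1
    fi
    echo "ok: $script"
}

# Usage:
#   check_genemark "$JAMG_PATH/3rd_party/genemark"
```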
= RepeatMasker libraries
Download the RepeatMasker libraries from GIRI (free registration required; 2-day waiting period): http://www.girinst.org
NB: "This version of RepeatMasker requires library version 20140131 or higher".
Decompress repeatmaskerlibraries-*.tar.gz in $JAMG_PATH/3rd_party/RepeatMasker:
$ cd $JAMG_PATH/3rd_party/RepeatMasker && tar xzf FULL_PATH_TO_repeatmaskerlibraries-*.tar.gz
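The version requirement can be checked against the archive name before unpacking. The sketch below assumes that the part of the file name matched by '*' is an 8-digit YYYYMMDD stamp; library_version_ok is a hypothetical helper and not part of JAMg or RepeatMasker.

```shell
# Sketch: check the date stamp in a RepeatMasker library archive name.
# Assumes the part matched by '*' is an 8-digit YYYYMMDD stamp.
# library_version_ok is a hypothetical helper, not part of JAMg.
library_version_ok() {
    name=$(basename "$1")
    stamp=$(printf '%s\n' "$name" |
        sed -n 's/^repeatmaskerlibraries-\([0-9]\{8\}\)\.tar\.gz$/\1/p')
    if [ -z "$stamp" ]; then
        echo "cannot read a date stamp from: $name" >&2
        return 2
    fi
    # Succeeds only when the stamp is 20140131 or later
    [ "$stamp" -ge 20140131 ]
}

# Usage:
#   library_version_ok FULL_PATH_TO_repeatmaskerlibraries-*.tar.gz \
#       || echo "library archive is too old or has an unexpected name"
```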
= OPTIONAL: Install HHblits, BLASTDB etc databases
HHblits is used for exon searching only. It is very time-consuming and not recommended for large (e.g. plant) genomes.
You don't need to install these databases if you don't use them, and you don't have to install them within JAMg,
i.e. you can install them here or anywhere else you prefer, e.g. system-wide if you're the administrator.
See
* databases/hhblits/README
* databases/blastdb/README
* databases/ncbi_taxonomy/README
For additional rRNA repeat masking, uncompress databases/blastdb/repeats/*bz2:
$ bunzip2 -k databases/blastdb/repeats/*bz2
= Finally, compile 3rd party software
Run
$ make 2>&1 | tee make.log
This will take some time because a lot of third-party utilities are being built.
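Note that when a command is piped into tee, the shell reports tee's exit status, so a failed build can look successful. The sketch below keeps the log and still returns the command's own status; run_logged is a hypothetical helper and not part of JAMg. In bash, `set -o pipefail` or `${PIPESTATUS[0]}` achieves the same thing.

```shell
# Sketch: run a command, copy its output to a log file, and return the
# command's own exit status rather than tee's.
# run_logged is a hypothetical helper, not part of JAMg.
run_logged() {
    log=$1
    shift
    code_file=$(mktemp) || return 1
    # Record the exit status inside the pipeline, where it is still visible
    { rc=0; "$@" 2>&1 || rc=$?; echo "$rc" > "$code_file"; } | tee "$log"
    exit_code=$(cat "$code_file")
    rm -f "$code_file"
    return "$exit_code"
}

# Usage:
#   run_logged make.log make || echo "make failed; see make.log"
```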
== For people using Intel compilers:
======================================================================================
Exonerate has issues when built with Intel's compiler, ICC.
For that reason, before running make you may need to force the GCC compiler like so:
export CC=`which gcc` # if which returns a path directly e.g. '/usr/bin/gcc'
export CC=`which -p gcc` # if which returns 'gcc is /usr/bin/gcc'
Likewise for ParaFly, but for the C++ compiler:
export CXX=`which g++`
export CXX=`which -p g++`
For ParaFly/GMAP/Samtools, you also need this:
export CPPFLAGS="$CPPFLAGS -fopenmp"
export LIBPATH=$LDFLAGS
export LDFLAGS="$LDFLAGS -fopenmp"
======================================================================================
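Because the output of which differs between systems, one alternative is the POSIX `command -v` builtin, which prints a plain path for external programs. This is a sketch rather than what the JAMg Makefile expects; resolve_tool is a hypothetical helper.

```shell
# Sketch: locate a compiler with the POSIX 'command -v' builtin.
# resolve_tool is a hypothetical helper, not part of JAMg.
resolve_tool() {
    tool_path=$(command -v "$1") || return 1
    # An alias or shell function would not print an absolute path
    case $tool_path in
        /*) echo "$tool_path" ;;
        *)  return 1 ;;
    esac
}

# Usage, before running make:
#   CC=$(resolve_tool gcc) && export CC
#   CXX=$(resolve_tool g++) && export CXX
```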
If there is another error with the external programs, I'm not sure I can help, but I may have come across the error before.
=============================================================================================================
Please report bugs to:
Alexie Papanicolaou
1 Hawkesbury Institute for the Environment, Richmond, NSW, Australia
alexie@butterflybase.org
LICENSE
=======
See LICENSE
Note it is provided "as is" without warranty of any kind.