Skip to content
/ Trek Public

A program to retrieve context-dependent core substitution rate constants (Homo sapiens, germline) for DNA sequences and entire genomes.

Notifications You must be signed in to change notification settings

aleksahak/Trek

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

---
pdf_document: default
output: pdf_document
html_document:
  highlight: tango
word_document: default
---

# Trek: a program to retrieve context-dependent core substitution rate constants (Homo sapiens, germline) for DNA sequences and entire genomes

<img src="TrekGUI/Trek_logo.png" align="middle" height="180" width="150" margin="0 auto" />

The Trek methodology is described in detail at <http://dx.doi.org/10.1101/024257>. The name of the package reflects on the fact that we obtained the **Tr**ansposon **E**xposed **k** rate constants (byr$^{-1}$), using the genomic *treks* of retrotransposon remnants in the human genome. The name additionally salutes the 50$^{th}$ anniversary of the Star Trek franchise.

## Installation

**1.** Install the latest version of R programming language or skip to the next step. Step-wise instructions on R installation and upgrade can be found through the following links: [1](http://alexonscience.blogspot.co.uk/2015/10/installing-r-on-linuxunix-machines-from.html), [2](http://alexonscience.blogspot.co.uk/2015/10/upgrade-r-environmentlibraries-mac-osx.html).

**2.** Launch R from the command line and install the R packages *shiny* (required for the graphical user interface), *doMC*, *foreach* and *itertools* (required for a parallel execution of the program) from within R.

```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ R
```
```{r, tidy=TRUE, echo=TRUE, eval=FALSE}
> install.packages("shiny")
> install.packages("doMC")
> install.packages("foreach")
> install.packages("itertools")
```

**3.** Download the Trek source code from the [GitHub](https://github.com/aleksahak/Trek) repository. You can also do that via a Linux/Unix/OSX command line, given that git is installed, by typing the following:

```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ git clone https://github.com/aleksahak/Trek
```

The downloaded folder has the following content:

- **lib/** - the subfolder containing all the source files,
- **TrekGUI/** - the subfolder containing the graphical user interface, 
- **Trek.R** - the interfacing R script used to execute Trek from command line.

**4.** Finally, you need to bit compile the package by going into the **lib/** subfolder and executing **bitcompile.R** code from within R.

```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ cd lib/
$ R
```
```{r, tidy=TRUE, echo=TRUE, eval=FALSE}
> source("bitcompile.R")
```

This generates a single file, **Trek.lib**, which encapsulates the main Trek code and all its dependencies. At this stage, the subfolder **lib/** can be safely removed. You might, however, want to copy the **test.fasta** file from inside **lib/**, in order to test the Trek installation.

At this point, the Trek installation folder should contain:

- **Trek.lib** - the bit-compiled Trek program,
- **TrekGUI/** - the subfolder containing the graphical user interface, 
- **Trek.R** - the interfacing R script used to execute Trek from command line,

and, if the test sequence file is preserved,

- **test.fasta** - the example DNA sequence fasta file.

## Running Trek from R

Trek can be executed as an R program, either from within R, or from the Linux/Unix/OSX command line through R CMD BATCH or Rscript execution. The latter two options allow the usage of Trek from the scripts written via programming languages other than R.

In R, as exemplified in the **Trek.R** interfacing script, one should load the **Trek.lib** bit-compiled file, then execute Trek via the R function *Trek()*. The latter accepts four arguments:

- *FastaFile* - the relative or absolute path to the fasta file to analyse,
- *OutFile* - the relative or absolute path to the output file to be saved,
- *MutRates* - an argument accepting "sym" and "nosym" options for the strand-symmetrised (recommended) and raw parameter usage for substitution rates, 
- *nCPU* - the number of CPUs to be used for the calculation, where the larger values can markedly speed up the mapping process for entire genomes.

Alternatively, the **Trek.R** file can be edited to set the desired arguments, and executed from the command line via R CMD BATCH or Rscript.

## Running Trek from a GUI

Trek features a browser-based graphical user interface (GUI), written with Shiny that can be executed locally on as many CPUs as desired. To launch the GUI, enter the **TrekGUI/** subfolder and double click on **TrekGUI** file. If the file fails to open a browser, make sure that the permissions are correctly set for the file (as executable):

```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ cd TrekGUI/
$ chmod 700 TrekGUI # May require sudo rights.
```

If double clicking on **TrekGUI** does not launch your browser after the above step, then, most probably, your Rscript utility of R is not installed on the default */usr/bin/Rscript* path. To correct the TrekGUI setup, first find out the installed path for Rscript by typing from the command line:

```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ which Rscript
```

then open the **TrekGUI** executable file via a usual plain text editor and correct the Rscript path stated at the first line.

Now double click on **TrekGUI**. As soon as the browser opens, the rest is self explanatory.

## License

You may redistribute the Trek source code (or its components) and/or modify/translate it under the terms of the GNU General Public License as published by the Free Software Foundation; version 2 of the License (GPL2). You can find the details of GPL2 in the following link: <http://www.gnu.org/licenses/gpl-2.0.html>.

Questions and bug reports to Aleksandr B. Sahakyan via [as952 *at* cam *dot* ac *dot* uk].

About

A program to retrieve context-dependent core substitution rate constants (Homo sapiens, germline) for DNA sequences and entire genomes.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages