PolyOriginR is a wrapper of PolyOrigin.jl (https://github.com/chaozhi/PolyOrigin.jl), for haplotye reconstruction in polyploid multiparental pouplations.
First download and install Julia (>v1.5,64-bit)available at https://julialang.org/. Then install the development version of PolyOriginR from GitHub with:
# install.packages("devtools")
devtools::install_github("chaozhi/PolyOriginR")
library(PolyOriginR)
polyOriginR("geno.csv","ped.csv",juliapath="C:/path/to/bin/julia.exe",workdir="workdir")
Here juliapath is the absolute path to julia.exe, and workdir is the directory for input and output files. See R script and input files in the "inst/example" folder.
polyOriginR(
genofile,
pedfile,
juliapath = "",
epsilon = 0.01,
seqerr = 0.001,
chrpairing_phase = 22,
chrpairing = 44,
chrsubset = NULL,
snpthin = 1,
nworker = 1,
delsiglevel = 0.05,
maxstuck = 5,
maxiter = 30,
minrun = 3,
maxrun = 10,
byparent = TRUE,
refhapfile = "nothing",
correctthreshold = 0.15,
refinemap = FALSE,
refineorder = FALSE,
maxwinsize = 50,
inittemperature = 4,
coolingrate = 0.5,
stripdis = 20,
maxepsilon = 0.5,
skeletonsize = 50,
isphysmap = FALSE,
recomrate = 1,
isplot = FALSE,
workdir = getwd(),
outstem = "outstem",
verbose = TRUE
)
Argument | Description |
---|---|
genofile |
Input genotypic data for parents and offspring |
pedfile |
Input breeding pedigree |
juliapath |
Path to julia.exe, Default: '' |
epsilon |
genotyping error probability, Default: 0.01 |
seqerr |
sequencing read error probability for GBS data, Default: 0.001 |
chrpairing_phase |
chromosome pairing in parental phasing, with 22 being only bivalent formations and 44 being bi- and quadri-valent formations, Default: 22 |
chrpairing |
chromosome pairing in offspring decoding, with 22 being only bivalent formations and 44 being bivalent and quadrivalent formations, Default: 44 |
chrsubset |
subset of chromosomes, c(1,10) denotes first and tenth chromosomes, and NULL for all chromosomes, Default: NULL |
snpthin |
subset of markers, take every snpthin-th markers, Default: 1 |
nworker |
number of parallel workers for computing among chromosomes, nonparallel if nworker=1, Default: 1 |
delsiglevel |
significance level for deleting markers, Default: 0.05 |
maxstuck |
the max number of consecutive iterations that are rejected in a phasing run, Default: 5 |
maxiter |
the max number of iterations in a phasing run, Default: 30 |
minrun |
if the min number of phasing runs that are at the same local maximimum or have the same parental phases reaches minrun, phasing algorithm will stop before reaching the maxrun, Default: 3 |
maxrun |
the max number of phasing runs, Default: 10 |
byparent |
if TRUE, update parental phases parent by parent, Default: TRUE |
refhapfile |
reference haplotype file for setting absolute parental phases, Default: 'nothing' |
correctthreshold |
a candidate marker is selected for parental error correction if the fraction of offspring genotypic error >= correctthreshold, Default: 0.15 |
refinemap |
if TRUE, refine marker map, Default: FALSE |
refineorder |
if TRUE, refine marker mordering, valid only if refinemap=TRUE, Default: FALSE |
maxwinsize |
max size of sliding windown in map refinning, Default: 50 |
inittemperature |
initial temperature of simulated annealing in map refinning, Default: 4 |
coolingrate |
cooling rate of annealing temperature in map refinning, Default: 0.5 |
stripdis |
a chromosome end in map refinement is removed if it has a distance gap > stripdis(centiMorgan) and it contains less than 5% markers, Default: 20 |
maxepsilon |
markers in map refinement are removed it they have error rates > maxepsilon, Default: 0.5 |
skeletonsize |
the number of markers in the skeleton map that is used to reduce map length inflation by subsampling markers, Default: 50 |
isphysmap |
TRUE, if input marker map in genofile is physical map, Default: FALSE |
recomrate |
Recombination rate in cM/Mpb, Default: 1 |
isplot |
TURE, if plot haploprob, Default: FALSE |
workdir |
Work directory, Default: getwd() |
outstem |
Stem of output files, Default: 'outstem' |
verbose |
TURE, if print details, Default: TRUE |
PolyOriginR is implemented via PolyOriginCmd, which uses PolyOrigin.jl by a command line.
Return 0 if success, and export output files.
Argument | Description |
---|---|
outstem.log |
log file |
outstem_maprefined.csv |
same as input genofile except that marker map being refined |
outstem_parentphased.csv |
same as input genofile exceptthat parents being phased |
outstem_parentphased_corrected.csv |
exported if there exist detected parental errors |
outstem_polyancestry.csv |
genoprob and estimation of valent configurations |
outstem_genoprob.csv |
a simplified version of outstem_polyancestry.csv |
outstem_postdoseprob.csv |
posterior dosage probabilities for all offspring |
outstem_plots |
a folder contains plots of condprob for all offspring if isplot = true |
If you use PolyOriginR in your analyses and publish your results, please cite the article:
Zheng C, Amadeu RR, Munoz PR, and Endelman JB. 2020. Haplotype Reconstruction in Connected Tetraploid F1 Populations. https://www.biorxiv.org/content/10.1101/2020.12.18.423519v1.