Hi, I am running GUSHR from inside the BRAKER pipeline. The first time I ran it, on a chromosome-level assembly, it worked nicely. Now, running it on a scaffold-level assembly (25K scaffolds), I keep getting the following error when running GUSHR:
> java -jar /scratch/molevo/jmontenegro/software/GUSHR/GeMoMa-1.6.2.jar CLI AnnotationFinalizer u=YES g=genome.fa a=gushr-TIJPJDZUYCXQ/complete_gemoma_like.gff3 i=gushr-TIJPJDZUYCXQ/introns.gff c=UNSTRANDED coverage_unstranded=gushr-TIJPJDZUYCXQ/coverage.bedgraph rename=NO outdir=gushr-TIJPJDZUYCXQ/
jar time stamp: Sat Aug 20 17:22:40 CEST 2022
Searching for the new GeMoMa updates ...
You are using GeMoMa 1.6.2, but the latest version is 1.9.
You can download the latest version from http://www.jstacs.de/index.php/GeMoMa
Parameters of tool "AnnotationFinalizer" (AnnotationFinalizer, version: 1.6.2):
a - annotation (The predicted genome annotation file (GFF)) = gushr-TIJPJDZUYCXQ/complete_gemoma_like.gff3
t - tag (A user-specified tag for transcript predictions in the third column of the returned gff. It might be beneficial to set this to a specific value for some genome browsers., default = prediction) = prediction
u - UTR (allows to predict UTRs using RNA-seq data, range={NO, YES}, default = NO) = YES
No parameters for selection "NO"
Parameters for selection "YES":
g - genome (The genome file (FASTA), i.e., the target sequences in the blast run. Should be in IUPAC code) = genome.fa
The following parameter(s) can be used multiple times:
i - introns file (Introns (GFF), which might be obtained from RNA-seq) = gushr-TIJPJDZUYCXQ/introns.gff
r - reads (if introns are given by a GFF, only use those which have at least this number of supporting split reads, valid range = [1, 2147483647], default = 1) = 1
The following parameter(s) can be used multiple times:
c - coverage file (experimental coverage (RNA-seq), range={NO, UNSTRANDED, STRANDED}, default = NO) = UNSTRANDED
No parameters for selection "NO"
Parameters for selection "UNSTRANDED":
coverage_unstranded - coverage_unstranded (The coverage file contains the unstranded coverage of the genome per interval. Intervals with coverage 0 (zero) can be left out.) = gushr-TIJPJDZUYCXQ/coverage.bedgraph
Parameters for selection "STRANDED":
coverage_forward - coverage_forward (The coverage file contains the forward coverage of the genome per interval. Intervals with coverage 0 (zero) can be left out.) = null
coverage_reverse - coverage_reverse (The coverage file contains the reverse coverage of the genome per interval. Intervals with coverage 0 (zero) can be left out.) = null
rename - rename (allows to generate generic gene and transcripts names (cf. attribute "Name"), range={COMPOSED, SIMPLE, NO}, default = COMPOSED) = NO
Parameters for selection "COMPOSED":
p - prefix (the prefix of the generic name) = null
infix - infix (the infix of the generic name, default = G) = G
s - suffix (the suffix of the generic name, default = 0) = 0
d - digits (the number of informative digits, valid range = [4, 10], default = 5) = 5
di - delete infix (a comma-separated list of infixes that is deleted from the sequence names before building the gene/transcript name, default = ) =
Parameters for selection "SIMPLE":
p - prefix (the prefix of the generic name) = null
d - digits (the number of informative digits, valid range = [4, 10], default = 5) = 5
No parameters for selection "NO"
outdir - The output directory, defaults to the current working directory (.) = gushr-TIJPJDZUYCXQ/
genome parts: 25454 [Seg10865, Seg10864, Seg10863, Seg10862, Seg10869, Seg10868, Seg10867, Seg10866, Seg9583, Seg9584, Seg9585, Seg9586, Seg22850, Seg9580, Seg22851, Seg9581, Seg9582, Seg19202, Seg22843, Seg19201, Seg228...
possible introns from RNA-seq (split reads>=1): 864409
+: 163226
-: 170825
.: 265179
Check RNA-seq data (introns): 48% of the sequences in the reference genome are covered.
#genes: 52801
#warnings: [0, 0]
#predictions: 52801
#warnings: [0, 0]
#CDSs: 237069
#warnings: [0, 0]
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 47118
at projects.gemoma.AnnotationFinalizer.extendUTR(AnnotationFinalizer.java:673)
at projects.gemoma.AnnotationFinalizer.run(AnnotationFinalizer.java:564)
at projects.gemoma.AnnotationFinalizer.run(AnnotationFinalizer.java:444)
at de.jstacs.tools.ui.cli.CLI.run(CLI.java:427)
at projects.gemoma.GeMoMa.main(GeMoMa.java:368)
I am trying to understand what else could be going on here and how to fix it or work around it.
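One check that might help narrow this down is sketched below. It rests on an assumption that the stack trace does not confirm: that the out-of-bounds index in `extendUTR` comes from a coordinate in the annotation or coverage file lying beyond the end of its scaffold, or from a scaffold name that does not match the FASTA headers. The helper names `fasta_lengths` and `out_of_range` are hypothetical and not part of GeMoMa, GUSHR, or BRAKER.

```python
def fasta_lengths(path):
    """Return {sequence name: length} for a FASTA file."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the name.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def out_of_range(path, lengths, end_col):
    """List (seqid, end, scaffold_length) for records that end past their scaffold.

    end_col is the 0-based column holding the end coordinate:
    4 for GFF (1-based, inclusive end), 2 for bedGraph (0-based, exclusive end).
    In both conventions a valid end is <= the scaffold length.
    scaffold_length is None when the sequence name is absent from the FASTA.
    """
    problems = []
    with open(path) as handle:
        for line in handle:
            # Skip blank lines, GFF comments, and bedGraph header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) <= end_col:
                continue
            seqid, end = fields[0], int(fields[end_col])
            length = lengths.get(seqid)
            if length is None or end > length:
                problems.append((seqid, end, length))
    return problems
```

For the run above, this could be called as `out_of_range("gushr-TIJPJDZUYCXQ/complete_gemoma_like.gff3", fasta_lengths("genome.fa"), end_col=4)` for the annotation and with `end_col=2` for `gushr-TIJPJDZUYCXQ/coverage.bedgraph`. An empty list from both calls would suggest the assumption is wrong and the cause lies elsewhere.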
The original braker command was as follows:
Any help would be much appreciated.
Regards,
Juan D.