Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference isn't being parsed/detected correctly when using CalibrateDragstrModel in parallel mode #8139

Open
1 of 2 tasks
nvnieuwk opened this issue Jan 3, 2023 · 8 comments
Assignees

Comments

@nvnieuwk
Copy link

nvnieuwk commented Jan 3, 2023

Instructions

The github issue tracker is for bug reports, feature requests, and API documentation requests. General questions about how to use the GATK, how to interpret the output, etc. should be asked on the official support forum.

  • Search the existing github issues to see if your issue (or something similar) has already been reported. If the issue already exists, you may comment there to inquire about the progress.
  • Determine whether your issue is a bug report, a feature request, or a documentation request (for tool/class javadoc only -- for forum docs please post there)
  • Consider if your "issue" is better addressed on the GATK forum: http://gatkforums.broadinstitute.org/gatk Post there if you have questions about expected tool behavior, output format, unexpected results, or generally any question that does not fit into the categories above
  • Use a concise yet descriptive title
  • Choose the corresponding template block below and fill it in, replacing or deleting text in italics (surrounded by _) as appropriate
  • Delete the other template blocks and this header.

Bug Report

Affected tool(s) or class(es)

GATK CalibrateDragstrModel

Affected version(s)

  • Latest public release version [4.3.0.0]
  • Latest master branch as of [date of test?]

Description

When running CalibrateDragstrModel in parallel mode, the supplied reference isn't detected correctly causing the following error stack trace:

Using GATK jar /usr/local/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx72g -jar /usr/local/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar CalibrateDragstrModel --input input.cram --output input.txt --reference hg38.fa --str-table-path hg38.zip --threads 12 --intervals fasta_bed.bed --tmp-dir .
10:24:21.117 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
10:24:21.289 INFO  CalibrateDragstrModel - ------------------------------------------------------------
10:24:21.289 INFO  CalibrateDragstrModel - The Genome Analysis Toolkit (GATK) v4.3.0.0
10:24:21.289 INFO  CalibrateDragstrModel - For support and documentation go to https://software.broadinstitute.org/gatk/
10:24:21.289 INFO  CalibrateDragstrModel - Executing as nvnieuwk on Linux v4.18.0-372.36.1.el8_6.x86_64 amd64
10:24:21.289 INFO  CalibrateDragstrModel - Java runtime: OpenJDK 64-Bit Server VM v11.0.15-internal+0-adhoc..src
10:24:21.289 INFO  CalibrateDragstrModel - Start Date/Time: January 2, 2023 at 10:24:21 AM GMT
10:24:21.289 INFO  CalibrateDragstrModel - ------------------------------------------------------------
10:24:21.289 INFO  CalibrateDragstrModel - ------------------------------------------------------------
10:24:21.290 INFO  CalibrateDragstrModel - HTSJDK Version: 3.0.1
10:24:21.290 INFO  CalibrateDragstrModel - Picard Version: 2.27.5
10:24:21.290 INFO  CalibrateDragstrModel - Built for Spark Version: 2.4.5
10:24:21.290 INFO  CalibrateDragstrModel - HTSJDK Defaults.COMPRESSION_LEVEL : 2
10:24:21.290 INFO  CalibrateDragstrModel - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
10:24:21.290 INFO  CalibrateDragstrModel - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
10:24:21.290 INFO  CalibrateDragstrModel - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
10:24:21.290 INFO  CalibrateDragstrModel - Deflater: IntelDeflater
10:24:21.290 INFO  CalibrateDragstrModel - Inflater: IntelInflater
10:24:21.291 INFO  CalibrateDragstrModel - GCS max retries/reopens: 20
10:24:21.291 INFO  CalibrateDragstrModel - Requester pays: disabled
10:24:21.291 INFO  CalibrateDragstrModel - Initializing engine
10:24:21.937 INFO  FeatureManager - Using codec BEDCodec to read file file:///kyukon/scratch/gent/vo/000/gvo00082/vsc44804/nxf.4BNO5qL7DM/fasta_bed.bed
10:24:21.962 INFO  IntervalArgumentCollection - Processing 3217346917 bp from intervals
10:24:22.008 INFO  CalibrateDragstrModel - Running in parallel using the requested number of threads: 12
10:24:22.008 INFO  CalibrateDragstrModel - Done initializing engine
10:24:22.008 INFO  ProgressMeter - Starting traversal
10:24:22.008 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Records Processed   Records/Minute
10:24:32.859 INFO  ProgressMeter -        chr1:26000000              0.2                 59038         326477.4
10:24:42.867 INFO  ProgressMeter -        chr1:83000000              0.3                184245         529998.1
10:24:52.965 INFO  ProgressMeter -       chr1:137193529              0.5                306766         594565.4
10:25:03.307 INFO  ProgressMeter -       chr1:193193529              0.7                428759         622924.6
10:25:13.318 INFO  ProgressMeter -         chr2:3237107              0.9                564835         660497.0
10:25:23.358 INFO  ProgressMeter -        chr2:57237107              1.0                681209         666219.1
10:25:33.392 INFO  ProgressMeter -       chr2:109237107              1.2                799610         672091.8
10:25:44.527 INFO  ProgressMeter -       chr2:177512416              1.4                930822         676805.6
10:25:54.821 INFO  ProgressMeter -       chr2:237512416              1.5               1069999         691712.8
10:26:04.863 INFO  ProgressMeter -        chr3:54999378              1.7               1197525         698570.8
10:26:09.642 INFO  CalibrateDragstrModel - Shutting down engine
[January 2, 2023 at 10:26:09 AM GMT] org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel done. Elapsed time: 1.81 minutes.
Runtime.totalMemory()=47647293440
java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Requested start 8613 is beyond the sequence length HLA-DRB1*04:03:01
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
    at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:600)
    at java.base/java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1006)
    at org.broadinstitute.hellbender.utils.Utils.runInParallel(Utils.java:1479)
    at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.collectCaseStatsParallel(CalibrateDragstrModel.java:551)
    at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.traverse(CalibrateDragstrModel.java:202)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1095)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Requested start 8613 is beyond the sequence length HLA-DRB1*04:03:01
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
    at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:600)
    at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:678)
    at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:737)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:919)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
    at java.base/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:558)
    at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.lambda$collectCaseStatsParallel$14(CalibrateDragstrModel.java:568)
    at java.base/java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1448)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
    at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: java.lang.IllegalArgumentException: Requested start 8613 is beyond the sequence length HLA-DRB1*04:03:01
    at htsjdk.samtools.cram.ref.ReferenceSource.getReferenceBasesByRegion(ReferenceSource.java:207)
    at htsjdk.samtools.cram.build.CRAMReferenceRegion.fetchReferenceBasesByRegion(CRAMReferenceRegion.java:169)
    at htsjdk.samtools.cram.structure.Slice.normalizeCRAMRecords(Slice.java:502)
    at htsjdk.samtools.cram.structure.Container.getSAMRecords(Container.java:322)
    at htsjdk.samtools.CRAMIterator.nextContainer(CRAMIterator.java:112)
    at htsjdk.samtools.CRAMIterator.hasNext(CRAMIterator.java:204)
    at htsjdk.samtools.CRAMFileReader$CRAMIntervalIteratorBase.getNextRecord(CRAMFileReader.java:589)
    at htsjdk.samtools.CRAMFileReader$CRAMIntervalIteratorBase.initializeIterator(CRAMFileReader.java:562)
    at htsjdk.samtools.CRAMFileReader$CRAMIntervalIterator.<init>(CRAMFileReader.java:620)
    at htsjdk.samtools.CRAMFileReader$CRAMIntervalIterator.<init>(CRAMFileReader.java:615)
    at htsjdk.samtools.CRAMFileReader.query(CRAMFileReader.java:487)
    at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:550)
    at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.queryOverlapping(SamReader.java:417)
    at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.loadNextIterator(SamReaderQueryingIterator.java:130)
    at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.<init>(SamReaderQueryingIterator.java:69)
    at org.broadinstitute.hellbender.engine.ReadsPathDataSource.prepareIteratorsForTraversal(ReadsPathDataSource.java:412)
    at org.broadinstitute.hellbender.engine.ReadsPathDataSource.prepareIteratorsForTraversal(ReadsPathDataSource.java:389)
    at org.broadinstitute.hellbender.engine.ReadsPathDataSource.query(ReadsPathDataSource.java:352)
    at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.readStream(CalibrateDragstrModel.java:915)
    at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.lambda$null$11(CalibrateDragstrModel.java:556)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at org.broadinstitute.hellbender.tools.dragstr.InterleavingListSpliterator.forEachRemaining(InterleavingListSpliterator.java:87)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952)
    at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926)
    at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
    at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
    ... 5 more

However it does work when running the tool single threaded with the exact same options.

Steps to reproduce

I've sadly been unable to create a reproducible example. I've only encountered this with non-public data which I can't share here. I'd be happy to run tests for you though.

@droazen
Copy link
Contributor

droazen commented Jan 3, 2023

@vruano Your thoughts on this one?

@droazen
Copy link
Contributor

droazen commented Jan 3, 2023

@nvnieuwk Can you check the sequence dictionaries for your reference and cram to see what the length of the "HLA-DRB1*04:03:01" contig is reported as? It should be 15246 for the hg38 reference.

You can check the reference sequence dictionary by searching the .dict file for the contig name, and you can check the cram dictionary by inspecting the entry for that contig in the cram header.

@nvnieuwk
Copy link
Author

nvnieuwk commented Jan 4, 2023

In the .dict file it shows the correct length:

@SQ	SN:HLA-DRB1*04:03:01	LN:15246	UR:ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa	AS:GRCh38	M5:ce0de8afd561fb1fb0d3acce94386a27	SP:Human

And in the cram header it shows the same length:

@SQ	SN:HLA-DRB1*04:03:01	LN:15246	AH:*	M5:ce0de8afd561fb1fb0d3acce94386a27	UR:/kyukon/data/gent/shared/000/gvo00082/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa

@vruano
Copy link
Contributor

vruano commented Jan 9, 2023

@droazen make sense to check the .fai file and the fasta itself just in case or is .dict guaratee t be the only source for contig lengths at this point.

@vruano
Copy link
Contributor

vruano commented Jan 9, 2023

Looking at the htsjdk code responsible for the original throw (as far as I can see in the stack enclosed in the description) there is a few "smells" in the way synchronized is used or not use ReferenceSource.java. It is likely to be the reason behind the error considering that is failing in multi-thread.

Probably adding synchronized to getReferenceBasesByRegion would fix that. Is a htsjdk issue and not a GATK one. Do you want to add a workaround in GATK or press for a fix and update of the htsjdk dependency. @droazen?

@vruano
Copy link
Contributor

vruano commented Jan 9, 2023

more concretely the private method getReferenceBases(SAMSeqRecord) should be syncronized or avoid it calling directly to the syncronized getReferenceBases(SSR, boolean) and getReferenceBasesByRegions should not update the cache fields.

@nvnieuwk
Copy link
Author

nvnieuwk commented Mar 2, 2023

Hi, any news on this? :)

@drtconway
Copy link

drtconway commented Dec 30, 2024

We're seeing a variation on this:

...
        10:39:41.111 INFO  ProgressMeter - chr19_KI270920v1_alt:87622              5.7               6851956        1196183.1
        10:39:51.631 INFO  ProgressMeter - chr6_GL000254v2_alt:3963588              5.9               6890231        1167140.1
        10:39:59.269 INFO  CalibrateDragstrModel - Shutting down engine
        [December 19, 2024 at 10:39:59 AM AEDT] org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel done. Elapsed time: 6.04 minutes.
        Runtime.totalMemory()=4053794816
        java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Invalid interval. Contig:chrUn_JTFH01000629v1_decoy start:-158 end:-158
                at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
                at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
                at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
                at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
                at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:600)
                at java.base/java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1006)
                at org.broadinstitute.hellbender.utils.Utils.runInParallel(Utils.java:1479)
                at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.collectCaseStatsParallel(CalibrateDragstrModel.java:551)
                at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.traverse(CalibrateDragstrModel.java:202)
                at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
                at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
                at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
                at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
                at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
                at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
                at org.broadinstitute.hellbender.Main.main(Main.java:289)
        Caused by: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Invalid interval. Contig:chrUn_JTFH01000629v1_decoy start:-158 end:-158
                at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
                at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
                at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
                at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
                at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:600)
                at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:678)
                at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:737)
                at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:919)
                at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
                at java.base/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:558)
                at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.lambda$collectCaseStatsParallel$14(CalibrateDragstrModel.java:568)
                at java.base/java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1448)
                at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
                at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
                at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
                at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
                at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
        Caused by: java.lang.IllegalArgumentException: Invalid interval. Contig:chrUn_JTFH01000629v1_decoy start:-158 end:-158
                at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:804)
                at org.broadinstitute.hellbender.utils.SimpleInterval.validatePositions(SimpleInterval.java:59)
                at org.broadinstitute.hellbender.utils.SimpleInterval.<init>(SimpleInterval.java:35)
                at org.broadinstitute.hellbender.utils.reference.AbsoluteCoordinates.lambda$toSimpleInterval$1(AbsoluteCoordinates.java:175)
                at org.broadinstitute.hellbender.utils.reference.AbsoluteCoordinates.searchRelative(AbsoluteCoordinates.java:225)
                at org.broadinstitute.hellbender.utils.reference.AbsoluteCoordinates.searchRelative(AbsoluteCoordinates.java:205)
                at org.broadinstitute.hellbender.utils.reference.AbsoluteCoordinates.toRelative(AbsoluteCoordinates.java:191)
                at org.broadinstitute.hellbender.utils.reference.AbsoluteCoordinates.toSimpleInterval(AbsoluteCoordinates.java:175)
                at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.lambda$null$11(CalibrateDragstrModel.java:561)
                at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
                at org.broadinstitute.hellbender.tools.dragstr.InterleavingListSpliterator.forEachRemaining(InterleavingListSpliterator.java:87)
                at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
                at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
                at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952)
                at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926)
                at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
                at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
                ... 5 more

However in this case, the interval start (and end) is -ve, which should be obviously wrong, I'd have thought.

Update:

Running with no --threads 16 which the original invocation has, succeeds, so it does appear that it may be an instance of the same concurrency problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants