BaseRecalibratorSpark fails on a cluster due to system classloader issue #5979

Merged
tomwhite merged 2 commits into master from tw_resource_classloader_bug on Jun 5, 2019

Conversation

tomwhite (Contributor) commented Jun 4, 2019

The problem is that Spark executors can't rely on the system classloader to load resources. This change falls back to the current classloader if the resource can't be loaded from the system classloader. I've tested it successfully on a cluster.
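A minimal sketch of the fallback described above, assuming it lives in a helper near Resource.getResourceContentsAsFile (the method name getResourceStream and the surrounding structure are illustrative, not the merged code):

    import java.io.InputStream;
    import org.broadinstitute.hellbender.exceptions.GATKException;

    // ... inside org.broadinstitute.hellbender.utils.io.Resource ...

    // Sketch only: resolve a bundled resource, trying the system classloader first and
    // falling back to the classloader that loaded Resource. The fallback matters on Spark
    // executors, where the application jar is not visible to the system classloader.
    private static InputStream getResourceStream(final String resourcePath) {
        InputStream resourceAsStream = ClassLoader.getSystemResourceAsStream(resourcePath);
        if (resourceAsStream == null) {
            // Fall back to the current classloader.
            resourceAsStream = Resource.class.getClassLoader().getResourceAsStream(resourcePath);
        }
        if (resourceAsStream == null) {
            throw new GATKException("Null value when trying to read system resource.  Cannot find: " + resourcePath);
        }
        return resourceAsStream;
    }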

tomwhite (Contributor, Author) commented Jun 4, 2019

@LeeTL1220 would you mind taking a look at this, please? (It came about from the changes in #5941.)

tomwhite (Contributor, Author) commented Jun 4, 2019

Stack trace from failed job:

Caused by: org.broadinstitute.hellbender.exceptions.GATKException: Null value when trying to read system resource.  Cannot find: org/broadinstitute/hellbender/tools/copynumber/utils/annotatedinterval/annotated_region_default.config
	at org.broadinstitute.hellbender.utils.io.Resource.getResourceContentsAsFile(Resource.java:90)
	at org.broadinstitute.hellbender.utils.codecs.AnnotatedIntervalCodec.<init>(AnnotatedIntervalCodec.java:55)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.lang.Class.newInstance(Class.java:442)
	at org.broadinstitute.hellbender.engine.FeatureManager.getCandidateCodecsForFile(FeatureManager.java:511)
	at org.broadinstitute.hellbender.engine.FeatureManager.getCodecForFile(FeatureManager.java:464)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.getCodecForFeatureInput(FeatureDataSource.java:324)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:304)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:256)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:230)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:214)
	at org.broadinstitute.hellbender.utils.spark.JoinReadsWithVariants.openFeatureSource(JoinReadsWithVariants.java:63)
	at org.broadinstitute.hellbender.utils.spark.JoinReadsWithVariants.lambda$null$0(JoinReadsWithVariants.java:44)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
	at org.broadinstitute.hellbender.utils.spark.JoinReadsWithVariants.lambda$join$60e5b476$1(JoinReadsWithVariants.java:44)
	at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
	at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

codecov bot commented Jun 4, 2019

Codecov Report

Merging #5979 into master will not change coverage.
The diff coverage is 50%.

@@             Coverage Diff             @@
##              master     #5979   +/-   ##
===========================================
  Coverage     86.929%   86.929%           
  Complexity     32721     32721           
===========================================
  Files           2013      2013           
  Lines         151306    151306           
  Branches       16610     16610           
===========================================
  Hits          131529    131529           
  Misses         13720     13720           
  Partials        6057      6057
Impacted Files                                          Coverage Δ          Complexity Δ
...g/broadinstitute/hellbender/utils/io/Resource.java   62.963% <50%> (ø)   7 <0> (ø) ⬇️

LeeTL1220 (Contributor) left a comment

Simple comments, please address.

No need to have me re-review unless you want it, so I have approved the review.

    if (systemResourceAsStream == null) {
        throw new GATKException("Null value when trying to read system resource. Cannot find: " + resourcePath);
    if (resourceAsStream == null) {
        resourceAsStream = Resource.class.getClassLoader().getResourceAsStream(resourcePath);
Review comment (Contributor):

Can't we just use this statement above and eliminate the if statement?

I.e. replace InputStream resourceAsStream = ClassLoader.getSystemResourceAsStream(resourcePath); with
InputStream resourceAsStream = Resource.class.getClassLoader().getResourceAsStream(resourcePath);

I would think that the system ClassLoader would be equivalent to Resource.class.getClassLoader().

Apologies if I am forgetting something....
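For reference, a small illustration of why the two lookups can differ on a Spark executor (this is not code from the PR; resourcePath is assumed to be in scope, and the classloader layout described is the usual Spark executor setup):

    // On a plain `java -cp gatk.jar ...` run, the system classloader usually holds the
    // application classes, so both lookups behave the same.
    // On a Spark executor, the application jar is typically added to a child classloader,
    // so only the lookup through Resource's own classloader can see bundled resources.
    ClassLoader systemLoader  = ClassLoader.getSystemClassLoader();
    ClassLoader currentLoader = Resource.class.getClassLoader();

    InputStream viaSystem  = ClassLoader.getSystemResourceAsStream(resourcePath);  // may be null on executors
    InputStream viaCurrent = currentLoader.getResourceAsStream(resourcePath);      // finds the bundled .config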

Reply (Contributor Author):

I think you're right. A test seems to be failing now, but I'm not sure it's related. I'll investigate.


    if (systemResourceAsStream == null) {
        throw new GATKException("Null value when trying to read system resource. Cannot find: " + resourcePath);
    if (resourceAsStream == null) {
Review comment (Contributor):

Can you add a one-line comment explaining why we need this? "For Spark, etc."

Reply (Contributor Author):

Done

tomwhite (Contributor, Author) commented Jun 4, 2019

Thank you for taking a look @LeeTL1220.

tomwhite merged commit d0d4ca7 into master on Jun 5, 2019
tomwhite deleted the tw_resource_classloader_bug branch on June 5, 2019 at 07:50