[SPARK-32804][Launcher] Fix run-example command builder bug #29653
Conversation
Are you sure? This already adds the example JAR. You have to specify the name of the example to run. Why would you give the JAR as an arg?
Currently, when you run `run-example`, the launcher builds the submit args in spark/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java (Lines 215 to 217 in f5360e7).
However, the examples app jar is never added there. So when it (in standalone cluster mode) reaches spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala (Lines 695 to 696 in 32d87c2), we get the error described in [SPARK-32804].
(Client mode does not have this problem, as it uses different logic.) I would say it's kind of tricky...
You shouldn't have to pass the app jar if you are specifying the class name.
I updated my patch and maybe the new version can explain my point better. The following snippet shows that the first unrecognized arg is treated as the primaryResource (see spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala, Lines 450 to 470 in f5360e7).
Yes, when you run `run-example` you don't pass the app jar yourself, but the app-jar arg is still always needed in the backend. The bug is that the original backend code forgot to add the app jar, and so the first appArg (e.g. the `100` you pass to the SparkPi example) is treated as the app jar / primaryResource.
In the original code, the examples app jar was never added.
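To make the ordering issue concrete, here is a small standalone sketch of the rule quoted above (a simplified model, not Spark's actual `SparkSubmitArguments` parser; all names are made up for illustration): the first non-option argument becomes the primary resource, and everything after it becomes app args, so when the launcher omits the examples jar, the example's own argument slides into the primary-resource slot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of "the first unrecognized arg is the primaryResource".
// Not Spark's real parser; for illustration only.
public class PrimaryResourceModel {

  static String describe(List<String> args) {
    String primaryResource = null;
    List<String> appArgs = new ArrayList<>();
    for (int i = 0; i < args.size(); i++) {
      String arg = args.get(i);
      if (arg.startsWith("--")) {
        i++;  // skip the option's value, e.g. "--class <name>"
      } else if (primaryResource == null) {
        primaryResource = arg;  // first unrecognized arg wins
      } else {
        appArgs.add(arg);
      }
    }
    return "primaryResource=" + primaryResource + ", appArgs=" + appArgs;
  }

  public static void main(String[] args) {
    // Buggy case: the examples jar is missing, so SparkPi's "100" is
    // parsed as the primary resource.
    System.out.println(describe(Arrays.asList(
        "--master", "spark://host:7077", "--deploy-mode", "cluster",
        "--class", "org.apache.spark.examples.SparkPi", "100")));

    // Fixed case: the examples jar comes before the app args, so "100"
    // stays an app arg.
    System.out.println(describe(Arrays.asList(
        "--master", "spark://host:7077", "--deploy-mode", "cluster",
        "--class", "org.apache.spark.examples.SparkPi",
        "/path/to/spark-examples.jar", "100")));
  }
}
```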
Yes, I'm talking about how `run-example` works.
Are you talking about another high-level design issue, not this [SPARK-32804]-specific bug?
I think you're probably right about this, but then how does this work at all? Is the argument bogus and ignored in most deployment scenarios? That would make sense.
I think it only affects run-example in standalone-cluster mode, as standalone-client mode and yarn-cluster/k8s-cluster/... modes have different logic for how args are added and later parsed when doing the real launch: spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala (Lines 295 to 306 in de44e9c).
And for standalone-cluster mode, the primary resource is handed over as the driver's jar URL and validated in ClientArguments, which is where the error shows up.
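As a rough illustration of why only the standalone-cluster path blows up (a hedged sketch; the real validation lives in ClientArguments.scala and is not reproduced here): on that path the primary resource is treated as the driver's jar URL, so a leftover app arg like `100` fails the check.

```java
// Toy stand-in for the jar-URL expectation on the standalone-cluster path.
// Class name, method name, and message are illustrative only, not Spark's real code.
public class JarUrlCheckSketch {

  static void requireJarUrl(String primaryResource) {
    // Loose approximation: a usable jar URL should at least end in ".jar".
    if (primaryResource == null || !primaryResource.endsWith(".jar")) {
      throw new IllegalArgumentException(
          "Expected a jar URL as the primary resource, got: " + primaryResource);
    }
  }

  public static void main(String[] args) {
    requireJarUrl("/path/to/spark-examples.jar");  // passes
    requireJarUrl("100");  // throws: a stray app arg is not a jar URL
  }
}
```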
Jenkins test this please
Test build #128410 has finished for PR 29653 at commit
Jenkins retest this please |
Test build #128419 has finished for PR 29653 at commit
Yes, it's unrelated. I can try again.
Jenkins retest this please |
Test build #128451 has finished for PR 29653 at commit
It seems that the failure is an unrelated flaky test, something like #29388? @HyukjinKwon, can you help let Jenkins retest this?
Yes, these must be unrelated. Don't worry, either the test will get fixed and/or it will happen to pass soon. I'll keep an eye on it.
Jenkins retest this please |
Test build #128569 has finished for PR 29653 at commit
Merged to master
@srowen Thanks!
I suspect this pull request causes Spark master to fail for me. I see failures in the launcher tests (SparkSubmitCommandBuilderSuite).
      return exampleJar;
    }
  }
  throw new IllegalStateException("Failed to find examples' main app jar.");
@koertkuipers I think you can fix it like:
if (isTesting) {
  return SparkLauncher.NO_RESOURCE;
} else {
  throw new IllegalStateException("Failed to find examples' main app jar.");
}
Are you interested in fixing it as a followup PR, testing it the way you did?
Yeah, makes sense. In CI we build first, so the jars will always be available. If we run the tests with just compilation, like you did, the jars might not be available.
Oh, thanks!
@KevinSmile, can you make a followup PR with the reproducible steps @koertkuipers provided above?
Yes, I'm on it. @HyukjinKwon @koertkuipers Thanks again! Followup fix at #29769
…test failure without jars

### What changes were proposed in this pull request?
It's a followup of #29653. Tests in `SparkSubmitCommandBuilderSuite` may fail if you didn't build first and have jars before test, so if `isTesting` we should set a dummy `SparkLauncher.NO_RESOURCE`.

### Why are the changes needed?
Fix tests failure.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
mvn clean test (test without jars built first).

Closes #29769 from KevinSmile/bug-fix-master.

Authored-by: KevinSmile <kevinwang013@hotmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
What changes were proposed in this pull request?
Bug fix in the run-example command builder (as described in [SPARK-32804], run-example failed in standalone-cluster mode).
The missing examples app jar affects SparkSubmit in standalone-cluster mode (spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala, Lines 695 to 696 in 32d87c2)
and leads to an error in spark/core/src/main/scala/org/apache/spark/deploy/ClientArguments.scala (Lines 74 to 89 in f556946).
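For readers without the diff open, here is a rough, self-contained sketch of the shape of the launcher-side change (hedged: the class name, the hard-coded jar list, and the helper names used here are illustrative paraphrases, not the PR's exact code): the launcher now locates the examples' main app jar and emits it as the primary resource before the example's own arguments.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Paraphrased sketch of the launcher-side fix; for illustration only.
public class RunExampleArgsSketch {

  // Stand-in for the launcher's scan of the distribution's examples jars.
  static List<String> findExamplesJars() {
    return Arrays.asList(
        "/opt/spark/examples/jars/scopt_2.12-3.7.1.jar",
        "/opt/spark/examples/jars/spark-examples_2.12-3.1.0.jar");
  }

  // The gist of the fix: find the examples' main app jar so it can be used
  // as the primary resource (the review snippet above shows the real method
  // returning exampleJar or throwing this same exception).
  static String findExamplesAppJar() {
    for (String exampleJar : findExamplesJars()) {
      if (new File(exampleJar).getName().startsWith("spark-examples")) {
        return exampleJar;
      }
    }
    throw new IllegalStateException("Failed to find examples' main app jar.");
  }

  public static void main(String[] args) {
    List<String> submitArgs = new ArrayList<>();
    submitArgs.add("--class");
    submitArgs.add("org.apache.spark.examples.SparkPi");
    submitArgs.add(findExamplesAppJar());  // app jar first...
    submitArgs.add("100");                 // ...then the example's own args
    System.out.println(submitArgs);
  }
}
```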
Why are the changes needed?
Bug: run-example failed in standalone-cluster mode
Does this PR introduce any user-facing change?
Yes. Users can run-example in standalone-cluster mode now.
How was this patch tested?
New unit test added.
Also, since it's a user-facing bug, it's better to re-check the real case described in [SPARK-32804].