[SPARK-4298][Core] - The spark-submit cannot read Main-Class from Manifest. #3561
Can one of the admins verify this patch? |
Jenkins, this is ok to test. |
Test build #24150 has started for PR 3561 at commit
This sounds good to me, although there's one edge case that I'm curious about: what if my main JAR is hosted in HDFS and has a URI like I wonder how hard it would be to support this for JARs hosted in HDFS, since that might be useful when submitting jobs under cluster deploy mode. I suppose we could try downloading the JAR to read its manifest. @andrewor14, any thoughts here? |
EDIT: mistakenly had |
Test build #24150 has finished for PR 3561 at commit
Test FAILed. |
@JoshRosen I'm pretty sure we can definitely support the Also, can you help me understand why the tests failed? I'm seeing:
But that isn't really that helpful and, as with all the talk on the dev distro, I'm just wondering if it's the patch that fails or if it's a timing / sync issue ( |
It looks like that failure is due to a (known) flaky Spark Streaming test:
I could have Jenkins retest this, but to avoid spam I'll just wait until you push a new commit to handle the |
Hmm, it looks like there's already a JIRA for that particular test's flakiness: SPARK-1600. |
Hey it seems that we can only support this feature when the jar is local (i.e. with the |
Given that we're only supporting the |
We could... but we'll need to re-run the tests anyway after you make your changes. Retest this please |
Test build #24272 has started for PR 3561 at commit
Even if we're only going to support the (I think that this was implicit in my comments (and Andrew's), but I just wanted to say it a bit more explicitly to avoid any confusion). |
Test build #24272 has finished for PR 3561 at commit
Test PASSed. |
…e' URI to load the main-class
Test build #24380 has started for PR 3561 at commit
@JoshRosen and @andrewor14, updated to add a match statement for the URI supporting the |
    return
  }
A little confused: what is the return type of this block? It seems to me that it's going to be `Unit`. Does this actually compile?
…aces, removed { } from case statements, removed return values
@andrewor14 updated per all the relevant requests and retested everything! :) Any other issues? |
Test build #24382 has started for PR 3561 at commit
Test build #24380 has finished for PR 3561 at commit
Maybe SparkSubmit threw an exception somewhere but it's swallowed in the tests? |
Aha, I found the problem: it looks like tests in SparkSubmitSuite end up calling SparkSubmit.main(), which sets system properties. I added some debug logging:
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -328,6 +328,7 @@ object SparkSubmit {
     }
     for ((key, value) <- sysProps) {
+      println(s"Setting system property $key $value")
       System.setProperty(key, value)
     }
Now, when I run SparkSubmitSuite:
A general fix for this issue is to use test fixtures that ensure that system properties which are modified in tests are restored to their old values after the tests finish. I have some code to do this, so I can see about submitting a separate PR to add that fixture. |
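A fixture along those lines might look like the Scala sketch below. It is only an illustration of the snapshot-and-restore idea, not the code that went into #3739; the names `SystemPropertiesFixture` and `withRestoredSystemProperties` are made up for this example.

```scala
import java.util.Properties

// Hypothetical helper (not part of Spark): snapshot the JVM system
// properties, run a block of test code, then restore the snapshot.
object SystemPropertiesFixture {
  def withRestoredSystemProperties[T](body: => T): T = {
    // Copy every string-valued property into a fresh Properties object so
    // that mutations made by `body` cannot leak into the snapshot.
    val snapshot = new Properties()
    val names = System.getProperties.stringPropertyNames().iterator()
    while (names.hasNext) {
      val key = names.next()
      // Guard against a property disappearing between listing and reading.
      Option(System.getProperty(key)).foreach(v => snapshot.setProperty(key, v))
    }
    try body
    finally System.setProperties(snapshot) // runs even if `body` throws
  }
}
```

A test that calls something like `SparkSubmit.main()` could be wrapped in `withRestoredSystemProperties { ... }` so that any properties it sets are gone by the time the next test runs. Note that this sketch only preserves string-valued properties.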
I've opened #3739 to try to systematically clean up our explicit usages of |
@JoshRosen great catch! Sounds like this can't be accepted until #3739 is completed, but glad we have a resolution. |
I've merged #3739 into |
Jenkins, retest this please. |
Test build #24924 has started for PR 3561 at commit
Test build #24924 has finished for PR 3561 at commit
Test FAILed. |
Is |
I'm still investigating; it might be caused by my PR, but it's not failing deterministically in all builds so I'm not sure. I can dig in, but I'm sure it's not caused by this PR. |
Jenkins, retest this please. |
Test build #24938 has started for PR 3561 at commit
Test build #24938 has finished for PR 3561 at commit
Test PASSed. |
Looks like that's some unrelated test flakiness, since all tests are now passing. Since it doesn't seem like my PR broke any tests, let me go ahead and finish backporting my PR into the maintenance branches. After that, I'll loop back here, perform the minor edits that Andrew and I suggested, and merge this in. (If you have time to do those edits yourself, it'd save me a little bit of time, but not a big deal if you don't get around to it before I merge). |
…original code to above its necessary code segment
@JoshRosen took care of the minor edits for ya! |
Test build #24961 has started for PR 3561 at commit
Test build #24961 has finished for PR 3561 at commit
Test PASSed. |
I finished my backports of the other patch, so I'm going to merge this now. Thanks! |
…ifest.

Resolves a bug where the `Main-Class` from a .jar file wasn't being read in properly. This was caused by the fact that the `primaryResource` object was a URI and needed to be normalized through a call to `.getPath` before it could be passed into the `JarFile` object.

Author: Brennon York <brennon.york@capitalone.com>

Closes #3561 from brennonyork/SPARK-4298 and squashes the following commits:

5e0fce1 [Brennon York] Use string interpolation for error messages, moved comment line from original code to above its necessary code segment
14daa20 [Brennon York] pushed mainClass assignment into match statement, removed spurious spaces, removed { } from case statements, removed return values
c6dad68 [Brennon York] Set case statement to support multiple jar URI's and enabled the 'file' URI to load the main-class
8d20936 [Brennon York] updated to reset the error message back to the default
a043039 [Brennon York] updated to split the uri and jar vals
8da7cbf [Brennon York] fixes SPARK-4298

(cherry picked from commit 8e14c5e)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
…ifest.

Resolves a bug where the `Main-Class` from a .jar file wasn't being read in properly. This was caused by the fact that the `primaryResource` object was a URI and needed to be normalized through a call to `.getPath` before it could be passed into the `JarFile` object.

Author: Brennon York <brennon.york@capitalone.com>

Closes #3561 from brennonyork/SPARK-4298 and squashes the following commits:

5e0fce1 [Brennon York] Use string interpolation for error messages, moved comment line from original code to above its necessary code segment
14daa20 [Brennon York] pushed mainClass assignment into match statement, removed spurious spaces, removed { } from case statements, removed return values
c6dad68 [Brennon York] Set case statement to support multiple jar URI's and enabled the 'file' URI to load the main-class
8d20936 [Brennon York] updated to reset the error message back to the default
a043039 [Brennon York] updated to split the uri and jar vals
8da7cbf [Brennon York] fixes SPARK-4298

(cherry picked from commit 8e14c5e)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>

Conflicts:
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
Alright, I've merged this to |
Resolves a bug where the `Main-Class` from a .jar file wasn't being read in properly. This was caused by the fact that the `primaryResource` object was a URI and needed to be normalized through a call to `.getPath` before it could be passed into the `JarFile` object.
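
To make the description concrete, here is a minimal Scala sketch of the idea: parse the primary resource as a URI, match on its scheme, and only open the jar when it is local. This is an illustration, not the merged patch; the object name `MainClassFromManifest` and the method `mainClassOf` are hypothetical, and returning `None` for non-local schemes is an assumption made for the example.

```scala
import java.net.URI
import java.util.jar.JarFile

// Hypothetical helper (names are illustrative, not from SparkSubmit).
object MainClassFromManifest {
  /** Returns the Main-Class manifest attribute of a local jar, if present. */
  def mainClassOf(primaryResource: String): Option[String] = {
    val uri = new URI(primaryResource)
    uri.getScheme match {
      // A bare path has no scheme (null); "file" URIs are local as well.
      case null | "file" =>
        // JarFile expects a filesystem path, so pass uri.getPath rather
        // than the raw URI string.
        val jar = new JarFile(uri.getPath)
        try {
          // getManifest returns null when the jar has no manifest.
          Option(jar.getManifest)
            .flatMap(m => Option(m.getMainAttributes.getValue("Main-Class")))
        } finally {
          jar.close()
        }
      // Assumption for this sketch: jars behind other schemes (for example
      // hdfs) are not downloaded, so no main class is inferred.
      case _ => None
    }
  }
}
```

With this shape, `mainClassOf("file:/path/to/app.jar")` and `mainClassOf("/path/to/app.jar")` both open the same local file, while a remote URI is left for the caller to handle, for instance by requiring an explicit main class.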