[SPARK-32804][Launcher] Fix run-example command builder bug #29653
Conversation
Are you sure? This already adds the example JAR. You have to specify the name of the example to run. Why would you give the JAR as an arg?
Currently, when you run `run-example`, the launcher builds the submit args in spark/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java (Lines 215 to 217 in f5360e7).
However, the examples app jar is never added there. So when it (in standalone cluster mode) reaches spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala (Lines 695 to 696 in 32d87c2), we get the error described in [SPARK-32804].
(Client mode does not have this problem, as it uses different logic.) I would say it's kind of tricky...
You shouldn't have to pass the app jar if you are specifying the class name.
I updated my patch and maybe the new version can explain my point better. The following snippet shows that the first unrecognized arg is treated as the primaryResource (see spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala, Lines 450 to 470 in f5360e7).
Yes, when you run `run-example` you don't pass the app jar yourself, but the app-jar arg is still always needed in the backend. The bug is that the original backend code forgot to add the app jar, and so the first appArg (e.g. the `100` you pass to the SparkPi example) is treated as the app jar / primaryResource.
In the original code, the examples app jar was never added.
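To make the ordering issue concrete, here is a small standalone sketch of the rule quoted above (a simplified model, not Spark's actual `SparkSubmitArguments` parser; all names are made up for illustration): the first non-option argument becomes the primary resource, and everything after it becomes app args, so when the launcher omits the examples jar, the example's own argument slides into the primary-resource slot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of "the first unrecognized arg is the primaryResource".
// Not Spark's real parser; for illustration only.
public class PrimaryResourceModel {

  static String describe(List<String> args) {
    String primaryResource = null;
    List<String> appArgs = new ArrayList<>();
    for (int i = 0; i < args.size(); i++) {
      String arg = args.get(i);
      if (arg.startsWith("--")) {
        i++;  // skip the option's value, e.g. "--class <name>"
      } else if (primaryResource == null) {
        primaryResource = arg;  // first unrecognized arg wins
      } else {
        appArgs.add(arg);
      }
    }
    return "primaryResource=" + primaryResource + ", appArgs=" + appArgs;
  }

  public static void main(String[] args) {
    // Buggy case: the examples jar is missing, so SparkPi's "100" is
    // parsed as the primary resource.
    System.out.println(describe(Arrays.asList(
        "--master", "spark://host:7077", "--deploy-mode", "cluster",
        "--class", "org.apache.spark.examples.SparkPi", "100")));

    // Fixed case: the examples jar comes before the app args, so "100"
    // stays an app arg.
    System.out.println(describe(Arrays.asList(
        "--master", "spark://host:7077", "--deploy-mode", "cluster",
        "--class", "org.apache.spark.examples.SparkPi",
        "/path/to/spark-examples.jar", "100")));
  }
}
```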
Yes, I'm talking about how `run-example` works.
Are you talking about another high-level design issue, not this [SPARK-32804]-specific bug?
I think you're probably right about this, but then how does this work at all? Is the argument bogus and ignored in most deployment scenarios? That would make sense.
I think it only affects run-example in standalone-cluster mode, as standalone-client mode and yarn-cluster/k8s-cluster/... modes have different logic for how args are added and later parsed when doing the real launch: spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala (Lines 295 to 306 in de44e9c).
And for standalone-cluster mode, the primary resource is handed over as the driver's jar URL and validated in ClientArguments, which is where the error shows up.
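As a rough illustration of why only the standalone-cluster path blows up (a hedged sketch; the real validation lives in ClientArguments.scala and is not reproduced here): on that path the primary resource is treated as the driver's jar URL, so a leftover app arg like `100` fails the check.

```java
// Toy stand-in for the jar-URL expectation on the standalone-cluster path.
// Class name, method name, and message are illustrative only, not Spark's real code.
public class JarUrlCheckSketch {

  static void requireJarUrl(String primaryResource) {
    // Loose approximation: a usable jar URL should at least end in ".jar".
    if (primaryResource == null || !primaryResource.endsWith(".jar")) {
      throw new IllegalArgumentException(
          "Expected a jar URL as the primary resource, got: " + primaryResource);
    }
  }

  public static void main(String[] args) {
    requireJarUrl("/path/to/spark-examples.jar");  // passes
    requireJarUrl("100");  // throws: a stray app arg is not a jar URL
  }
}
```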
Jenkins test this please
Test build #128410 has finished for PR 29653 at commit
Jenkins retest this please |
Test build #128419 has finished for PR 29653 at commit
Yes, it's unrelated. I can try again.
Jenkins retest this please |
Test build #128451 has finished for PR 29653 at commit
It seems that the failure is an unrelated flaky test, something like #29388? @HyukjinKwon, can you help let Jenkins retest this?
Yes, these must be unrelated. Don't worry, either the test will get fixed and/or it will happen to pass soon. I'll keep an eye on it.
Jenkins retest this please |
Test build #128569 has finished for PR 29653 at commit
Merged to master
@srowen Thanks!
I suspect this pull request causes Spark master to fail for me. I see failures in the launcher tests (SparkSubmitCommandBuilderSuite).
      return exampleJar;
    }
  }
  throw new IllegalStateException("Failed to find examples' main app jar.");
@koertkuipers I think you can fix it like:
if (isTesting) {
  return SparkLauncher.NO_RESOURCE;
} else {
  throw new IllegalStateException("Failed to find examples' main app jar.");
}
Are you interested in fixing it as a followup PR, testing it the way you did?
Yeah, makes sense. In CI we build first, so the jars will always be available. If we run the tests with just compilation, like you did, the jars might not be available.
Oh, thanks!
@KevinSmile, can you make a followup PR with the reproducible steps @koertkuipers provided above?
Yes, I'm on it. @HyukjinKwon @koertkuipers Thanks again! Followup fix at #29769
…test failure without jars

### What changes were proposed in this pull request?
It's a followup of #29653. Tests in `SparkSubmitCommandBuilderSuite` may fail if you didn't build first and have jars before test, so if `isTesting` we should set a dummy `SparkLauncher.NO_RESOURCE`.

### Why are the changes needed?
Fix tests failure.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
mvn clean test (test without jars built first).

Closes #29769 from KevinSmile/bug-fix-master.

Authored-by: KevinSmile <kevinwang013@hotmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
What changes were proposed in this pull request?
Bug fix in the run-example command builder (as described in [SPARK-32804], run-example failed in standalone-cluster mode).
The missing examples app jar affects SparkSubmit in standalone-cluster mode (spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala, Lines 695 to 696 in 32d87c2)
and leads to an error in spark/core/src/main/scala/org/apache/spark/deploy/ClientArguments.scala (Lines 74 to 89 in f556946).
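For readers without the diff open, here is a rough, self-contained sketch of the shape of the launcher-side change (hedged: the class name, the hard-coded jar list, and the helper names used here are illustrative paraphrases, not the PR's exact code): the launcher now locates the examples' main app jar and emits it as the primary resource before the example's own arguments.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Paraphrased sketch of the launcher-side fix; for illustration only.
public class RunExampleArgsSketch {

  // Stand-in for the launcher's scan of the distribution's examples jars.
  static List<String> findExamplesJars() {
    return Arrays.asList(
        "/opt/spark/examples/jars/scopt_2.12-3.7.1.jar",
        "/opt/spark/examples/jars/spark-examples_2.12-3.1.0.jar");
  }

  // The gist of the fix: find the examples' main app jar so it can be used
  // as the primary resource (the review snippet above shows the real method
  // returning exampleJar or throwing this same exception).
  static String findExamplesAppJar() {
    for (String exampleJar : findExamplesJars()) {
      if (new File(exampleJar).getName().startsWith("spark-examples")) {
        return exampleJar;
      }
    }
    throw new IllegalStateException("Failed to find examples' main app jar.");
  }

  public static void main(String[] args) {
    List<String> submitArgs = new ArrayList<>();
    submitArgs.add("--class");
    submitArgs.add("org.apache.spark.examples.SparkPi");
    submitArgs.add(findExamplesAppJar());  // app jar first...
    submitArgs.add("100");                 // ...then the example's own args
    System.out.println(submitArgs);
  }
}
```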
Why are the changes needed?
Bug: run-example failed in standalone-cluster mode
Does this PR introduce any user-facing change?
Yes. Users can run-example in standalone-cluster mode now.
How was this patch tested?
New unit test added.
Also, since it's a user-facing bug, it's better to re-check the real case described in [SPARK-32804].