[SPARK-21403][Mesos] fix --packages for mesos #18587
Conversation
printErrorAndExit("Cluster deploy mode is not applicable to Spark Thrift server.")
      case _ =>
    }
This is for failing faster; there is no need to resolve dependencies if we are going to fail anyway.
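The fail-fast ordering described here can be sketched as follows. This is a hypothetical, simplified sketch (the object and method names are invented, not the actual SparkSubmit code): validate cheap constraints like the deploy-mode/main-class combination first, and only run the expensive `--packages` resolution when validation passes.

```scala
// Hypothetical sketch of the fail-fast ordering discussed above:
// check cheap constraints first, resolve --packages only afterwards.
object SubmitSketch {
  // Reject invalid deploy-mode/class combinations before doing any work.
  def validate(deployMode: String, mainClass: String): Either[String, Unit] =
    (deployMode, mainClass) match {
      case ("cluster", "org.apache.spark.sql.hive.thriftserver.HiveThriftServer2") =>
        Left("Cluster deploy mode is not applicable to Spark Thrift server.")
      case _ => Right(())
    }

  // Only reach the (expensive) resolution step when validation succeeds.
  def prepare(deployMode: String,
              mainClass: String,
              packages: Seq[String]): Either[String, Seq[String]] =
    validate(deployMode, mainClass) match {
      case Left(err) => Left(err)                               // fail fast: skip resolution
      case Right(_)  => Right(packages.map(p => s"resolved:$p")) // stand-in for Ivy resolution
    }
}
```

The point of the ordering is simply that `validate` is O(1) while real dependency resolution hits the network, so the cheap check runs first.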
Test build #79462 has finished for PR 18587 at commit
I'm 99% sure there's nothing to do for YARN. This line takes care of it:
YARN cluster mode will distribute all jars in

As for the change, it seems to work because the Mesos backend starts the driver using

If that's the case it seems fine, although it kind of loses the ability to use the Ivy cache on the machine launching the job... Also, I'd be more comfortable if someone more familiar with the Mesos backend could take a look. Not sure who that person is.
Correct... and this is my understanding:

spark/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
Line 365 in 1cad31f
The distributed cache manager does the work, utilizing Hadoop's distributed cache for jars used on the executor side. https://github.com/apache/spark/blob/ab9872db1f9c0f289541ec5756d1a142d85545ce/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientDistributedCacheManager.scala
That is the idea. The second time submit is called, in order to launch the driver, the submission is done in client mode, which is the default. The call for launching the driver in client mode is here:

Lines 411 to 442 in 8da3f70
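The two-step submission described above can be sketched roughly like this. This is a hypothetical, simplified sketch (names invented for illustration, not the actual Mesos dispatcher code): the cluster-mode dispatcher builds a second, client-mode spark-submit invocation for the driver and forwards `--packages`, so resolution happens inside the driver's container rather than on the submitting machine.

```scala
// Hypothetical sketch: a cluster-mode dispatcher re-invoking spark-submit
// in client mode for the driver, forwarding --packages so that dependency
// resolution happens where the driver actually runs.
object DriverCommandSketch {
  def buildSubmitArgs(mainJar: String,
                      packages: Seq[String],
                      appArgs: Seq[String]): Seq[String] = {
    // Carry --packages through to the inner, client-mode submission.
    val pkgOpt =
      if (packages.isEmpty) Seq.empty
      else Seq("--packages", packages.mkString(","))
    Seq("--deploy-mode", "client") ++ pkgOpt ++ Seq(mainJar) ++ appArgs
  }
}
```

The key detail is only that the inner submission runs in client mode with the `--packages` option preserved; everything else about the real command line is elided here.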
That is correct. If you launch jobs from a machine within the same network, the cache makes sense; otherwise it is just copying over the internet. On restarts it is also important, when you re-launch something, to access it from the cache. In Spark there is no functionality for the Mesos code to exploit any type of cache, from what I see. I would prefer a unified cluster layer for certain things, as in other frameworks: https://issues.apache.org/jira/browse/FLINK-6177 For now, I see this PR as a first option to fix things, because customers want it. As an iteration, it would make more sense to have distributed cache support for the Mesos branch as well. As for the standalone case, I am not sure right now; probably the same approach with the cache would apply.
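The cache trade-off discussed here can be illustrated with a minimal sketch, assuming a simple Ivy-style local cache layout (all names here are hypothetical, not Spark or Ivy APIs): a cache hit avoids the copy over the network, which only pays off when the submitting machine is reused or sits close to the repository.

```scala
// Hypothetical sketch of a local artifact cache: hit -> read from disk,
// miss -> fetch remotely once and store for later re-launches.
import java.nio.file.{Files, Path}

object CacheSketch {
  def locate(cacheDir: Path,
             coordinate: String,
             fetchRemote: String => Array[Byte]): Array[Byte] = {
    // Ivy-like layout: group:artifact:version -> group/artifact/version.jar
    val cached = cacheDir.resolve(coordinate.replace(':', '/') + ".jar")
    if (Files.exists(cached)) {
      Files.readAllBytes(cached)          // cache hit: no network copy
    } else {
      val bytes = fetchRemote(coordinate) // cache miss: copy over the network
      Files.createDirectories(cached.getParent)
      Files.write(cached, bytes)          // keep it for the next (re)launch
      bytes
    }
  }
}
```

Under this sketch a second launch of the same job never touches the network, which is exactly the property the Mesos path currently lacks.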
Sounds good. Merging to master.
thnx @vanzin I am working on some code for the standalone case.
Fixes the --packages flag for Mesos in cluster mode. Probably I will handle standalone and YARN in another commit; I need to investigate those cases as they are different. Tested with a community 1.9 DC/OS cluster: packages were successfully resolved in cluster mode within a container. andrewor14 susanxhuynh ArtRand srowen pls review. Author: Stavros Kontopoulos <st.kontopoulos@gmail.com> Closes apache#18587 from skonto/fix_packages_mesos_cluster.