Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-11818][REPL] Fix ExecutorClassLoader to lookup resources from … #9812

Closed
wants to merge 3 commits into from

Conversation

HeartSaVioR
Copy link
Contributor

…parent class loader

Without patch, two additional tests of ExecutorClassLoaderSuite fails.

  • "resource from parent"
  • "resources from parent"

Detailed explanation is here, https://issues.apache.org/jira/browse/SPARK-11818?focusedCommentId=15011202&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15011202

…parent class loader

* Without patch, some tests of ExecutorClassLoaderSuite fails
@tedyu
Copy link
Contributor

tedyu commented Nov 18, 2015

Jenkins, please test this

@@ -55,6 +57,14 @@ class ExecutorClassLoader(conf: SparkConf, classUri: String, parent: ClassLoader
}
}

override def getResource(name: String): URL = {
parentLoader.getResource(name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@holdenk maybe you have some thoughts on this?
Also summoning @vanzin

Doesn't this need to check userClassPathFirst?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srowen
It doesn't need to check userClassPathFirst since this implementation implies that REPL never provides resources dynamically so there's no need to lookup resource from ExecutorClassLoader itself.

Btw, could precondition be broken? I couldn't imagine REPL generating resources.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be surprised if the repl generated resources, but at the same time, what if someone tries to load the generated class file as a resource? It's an unusual but valid use case.

The code to support that is not complicated; it already exists in MutableURLClassLoader.scala (class ChildFirstURLClassLoader for, example). If you're up for it you could even create a helper trait or a utility method somewhere for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vanzin
I didn't see use case you mentioned but it could make sense.

In order to achieve, we have to implement findResource() and findResources() for ExecutorClassLoader since ExecutorClassLoader cannot rely on superclass (ClassLoader) to find resource.
It is easy to provide resource URL which points to origin scheme (http, https, ftp, hdfs), but since I'm new to classloader, so I'm wondering it is safe to return URL from findResource() and findResources() which doesn't point to local file.

If it is not safe to return non local file as URL, what's recommended way to do?
I can only think about downloading files to local temp directory per every call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering it is safe to return URL from findResource() and findResources() which doesn't point to local file

That should be perfectly fine. That's how URLClassLoader works, after all. The only potential odd thing would be getResourceAsStream, which returns an InputStream, and my guess is if the JDK's URL class doesn't support the protocol, would cause an error unless you overrode it here.

So perhaps it's too much to worry about and we can just assume that no one will do that, and fix it if someone ever needs that feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vanzin
To clarify about "feature", do you want me to change implementation of findResource() and findResources() for pointing origin scheme, and forget about potential odd? Or forget about finding resources from REPL uri and leave as this PR is?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving the PR as is should be fine for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vanzin
OK, Thanks for clarification! :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That all seams reasonable to me (sorry for the slow reply I'm in Tokyo right now).

@vanzin
Copy link
Contributor

vanzin commented Nov 19, 2015

ok to test

@SparkQA
Copy link

SparkQA commented Nov 19, 2015

Test build #46272 has finished for PR 9812 at commit f2dd621.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -54,6 +60,8 @@ class ExecutorClassLoaderSuite
url1 = "file://" + tempDir1
urls2 = List(tempDir2.toURI.toURL).toArray
childClassNames.foreach(TestUtils.createCompiledClass(_, tempDir1, "1"))
parentResourceNames.foreach(x =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: .foreach { x =>

@HeartSaVioR
Copy link
Contributor Author

@vanzin Thanks for reviewing, I addressed your comment. Please take a look again.

@SparkQA
Copy link

SparkQA commented Nov 20, 2015

Test build #46410 has finished for PR 9812 at commit ac32668.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR
Copy link
Contributor Author

Failed tests seems not related.

@vanzin
Copy link
Contributor

vanzin commented Nov 20, 2015

yeah, pyspark tests are super flaky lately. but it never hurts: retest this please

@SparkQA
Copy link

SparkQA commented Nov 20, 2015

Test build #46427 has finished for PR 9812 at commit ac32668.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Nov 24, 2015

@vanzin are you OK with this one?

@vanzin
Copy link
Contributor

vanzin commented Nov 24, 2015

Ah, yeah, fell through the cracks. Merging to master / 1.6.

asfgit pushed a commit that referenced this pull request Nov 24, 2015
…parent class loader

Without patch, two additional tests of ExecutorClassLoaderSuite fails.

- "resource from parent"
- "resources from parent"

Detailed explanation is here, https://issues.apache.org/jira/browse/SPARK-11818?focusedCommentId=15011202&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15011202

Author: Jungtaek Lim <kabhwan@gmail.com>

Closes #9812 from HeartSaVioR/SPARK-11818.

(cherry picked from commit be9dd15)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
@asfgit asfgit closed this in be9dd15 Nov 24, 2015
@HeartSaVioR
Copy link
Contributor Author

@srowen @vanzin @holdenk Thanks for reviewing and merging!

@HeartSaVioR HeartSaVioR deleted the SPARK-11818 branch December 20, 2015 10:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants