Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-18646][REPL] Set parent classloader as null for ExecutorClassLoader #17074

Closed
wants to merge 7 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -38,13 +38,15 @@ import org.apache.spark.util.{ParentClassLoader, Utils}
* Allows the user to specify if user class path should be first.
* This class loader delegates getting/finding resources to parent loader,
* which makes sense until REPL never provide resource dynamically.
* This class does not set parent classloader since this class loader
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about rephrase to:

[[ClassLoader]] will preferentially load class from parent. Only when parent is null or the load failed, that it will call the overridden `findClass` function. To avoid the potential issue caused by loading class using inappropriate class loader, we should set the parent of ClassLoader to null, so that we can fully control which class loader is used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add a empty line before this statement.

* has higher precedence over its parent class loader.
*/
class ExecutorClassLoader(
conf: SparkConf,
env: SparkEnv,
classUri: String,
parent: ClassLoader,
userClassPathFirst: Boolean) extends ClassLoader with Logging {
userClassPathFirst: Boolean) extends ClassLoader(null) with Logging {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add comment over this to explain why we are setting a null value to parent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added to the description of this class

val uri = new URI(classUri)
val directory = uri.getPath

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,8 @@ import java.nio.channels.{FileChannel, ReadableByteChannel}
import java.nio.charset.StandardCharsets
import java.nio.file.{Paths, StandardOpenOption}
import java.util
import java.util.Collections
import javax.tools.{JavaFileObject, SimpleJavaFileObject, ToolProvider}

import scala.io.Source
import scala.language.implicitConversions
Expand Down Expand Up @@ -77,6 +79,50 @@ class ExecutorClassLoaderSuite
}
}

test("child over system classloader") {
// JavaFileObject for scala.Option class
val scalaOptionFile = new SimpleJavaFileObject(
URI.create(s"string:///scala/Option.java"),
JavaFileObject.Kind.SOURCE) {

override def getCharContent(ignoreEncodingErrors: Boolean): CharSequence = {
"package scala; class Option {}"
}
}
// compile fake scala.Option class
ToolProvider
.getSystemJavaCompiler
.getTask(null, null, null, null, null, Collections.singletonList(scalaOptionFile)).call()

// create 'scala' dir in tempDir1
val scalaDir = new File(tempDir1, "scala")
assert(scalaDir.mkdir(), s"Failed to create 'scala' directory in $tempDir1")

// move the generated class into scala dir
val filename = "Option.class"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use createCompiledClass() here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's because createCompiledClass doesn't handle package name, it needs to put the generated class under 'scala' package and also it needs to move the generated class to 'scala' folder.

Since createCompiledClass is widely used, I didn't want to touch it just for this very unique situation.

val result = new File(filename)
assert(result.exists(), "Compiled file not found: " + result.getAbsolutePath)

val out = new File(scalaDir, filename)
Files.move(result, out)
assert(out.exists(), "Destination file not moved: " + out.getAbsolutePath)

// construct class loader tree
val parentLoader = new URLClassLoader(urls2, null)
val classLoader = new ExecutorClassLoader(
new SparkConf(), null, url1, parentLoader, true)

// load 'scala.Option', using ClassforName to do the exact same behavior as
// what JavaDeserializationStream does

// scalastyle:off classforname
val optionClass = Class.forName("scala.Option", false, classLoader)
// scalastyle:on classforname

assert(optionClass.getClassLoader == classLoader,
"scala.Option didn't come from ExecutorClassLoader")
}

test("child first") {
val parentLoader = new URLClassLoader(urls2, null)
val classLoader = new ExecutorClassLoader(new SparkConf(), null, url1, parentLoader, true)
Expand Down