-
Notifications
You must be signed in to change notification settings - Fork 118
Check for user jars/files existence before creating the driver pod. #86
Check for user jars/files existence before creating the driver pod. #86
Conversation
7c47ac6
to
9914821
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting this, it indeed does make sense to push as much precondition checking as early as possible. I've left some comments.
@@ -661,6 +665,33 @@ private[spark] class Client( | |||
}).toMap | |||
}).getOrElse(Map.empty[String, String]) | |||
} | |||
|
|||
private def checkForUserJarsFilesExistence: Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the trend has been for these kinds of precondition methods to pass the appropriate things to inspect as parameters, as opposed to inspecting the instance variables directly. We also have been running the validation inside of run()
as opposed to in the constructor. See parseCustomLabels()
for example.
private def checkForUserJarsFilesExistence: Unit = { | ||
val conf = new Configuration | ||
|
||
def existsOrAbort(uri: URI): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather avoid nesting methods here. We can just inline this logic in the case statement of checkExistence
.
|
||
def existsOrAbort(uri: URI): Unit = { | ||
val path = new Path(uri) | ||
if (!path.getFileSystem(conf).isFile(path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we inline this in the case
statement, it becomes clear that the scheme is of type file
. Then instead of using getFileSystem
we can just use an instance of java.io.File
instance from the path component of the URI. This is the pattern we follow elsewhere, see buildSubmissionRequest()
.
} | ||
} | ||
|
||
def checkExistence(maybePaths: Option[String]): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, avoid nested methods. We can make this be the outermost method, and then run()
can just call this three times, once for each of the appropriate fields to inspect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support failing fast!
@@ -64,6 +68,7 @@ private[spark] class Client( | |||
private val uploadedJars = sparkConf.get(KUBERNETES_DRIVER_UPLOAD_JARS).filter(_.nonEmpty) | |||
private val uploadedFiles = sparkConf.get(KUBERNETES_DRIVER_UPLOAD_FILES).filter(_.nonEmpty) | |||
uploadedFiles.foreach(validateNoDuplicateUploadFileNames) | |||
checkForUserJarsFilesExistence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put parentheses around zero-param methods like checkForUserJarsFilesExistence()
-- I think this is Spark style
@aash anything else to add? Otherwise feel free to merge when the build passes. |
Thanks @lins05 ! |
) * Check for user jars/files existence before creating the driver pod. Close #85 * CR
) * Check for user jars/files existence before creating the driver pod. Close #85 * CR
…pache-spark-on-k8s#86) * Check for user jars/files existence before creating the driver pod. Close apache-spark-on-k8s#85 * CR
…pache-spark-on-k8s#86) * Check for user jars/files existence before creating the driver pod. Close apache-spark-on-k8s#85 * CR
Closes #85