-
Notifications
You must be signed in to change notification settings - Fork 34
Spark dcos CLI breaks with multiple spaces #193
Conversation
@susanxhuynh @ArtRand pls review. |
// clean up any instances of shell-style escaped newlines: "arg1\\narg2" => "arg1 arg2" | ||
argsCleaned := strings.TrimSpace(backslashNewlinePattern.ReplaceAllLiteralString(argsStr, " ")) | ||
argsCleaned := strings.TrimSpace(backslashNewlinePattern.ReplaceAllLiteralString(argsCompacted, " ")) | ||
// HACK: spark-submit uses '--arg val' by convention, while kingpin only supports '--arg=val'. | ||
// translate the former into the latter for kingpin to parse. | ||
args := strings.Split(argsCleaned, " ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option would be to split argsCleaned
using a Regexp.Split
instead of strings.Split
:
package main
import (
"fmt"
"regexp"
)
func main() {
argString := "one two three four"
splitter := regexp.MustCompile(`\s+`)
for _, s := range splitter.Split(argString, -1) {
fmt.Println(s)
}
}
It would also be nice to add submit_builder_test.go
with some unit tests for the various cases here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thnx @elezar .Yes there are multiple options to fix this I guess. I agree we need tests since this is not the only issue found so far (probably will need to address it in a separate PR). @susanxhuynh are you running any tests for the cli internally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know, there are no additional tests for this kind of behaviour, and adding a simple unit tests here would add great value. If you don't get to it, I will see if I can add something this weekend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@elezar ok I will check what tests I could add here. As I said though missing tests could be addressed with another PR... @susanxhuynh are there any cli tests for this part, should we add tests here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CLI is "tested" by the integration tests, they will fail fast if it's not working. Simple unit tests are fine, but IMO opinion we could just add to the CI tests given that the unit tests would be very simple (akin to getter/setter basically).
@susanxhuynh @elezar I added a basic unit test. @susanxhuynh Let me know if I need to trigger testing in more places. Is the setup ok for jenkins? |
cli/bin/test.sh
Outdated
|
||
cd "${BIN_DIR}/.." | ||
|
||
if [ -z "$GOPATH" -o -z "$(which go)" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@susanxhuynh This part is also found in build-go.sh and it is repeated here. I don't like bash include files but maybe we should add it in one. Let's verify that this works for the jenkins stuff first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skonto How about remove this file, and just add go test
to the build script, build-go.sh? (https://github.com/mesosphere/spark-build/blob/master/cli/bin/build-go.sh) That's how we have it in dcos-commons: we run unit tests inside the build script: https://github.com/mesosphere/dcos-commons/blob/master/tools/build_go_exe.sh#L88
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would almost prefer that we don't add it to build-go.sh
so that you can build the CLI without invoking the tests (or commenting out lines then forgetting to uncomment them). FWIW I usually build the CLI then just copy it to my .dcos/
dir for testing the CLI.
@susanxhuynh could you assist with failures and how to integrate with the jenkins setup.
Test coverage for the submit code (submit_builder.go) is 23.9% which is misleading because although that part of the code was triggered not many conditions were checked in the single unit test anyway. |
Yeah, I'll check why the jenkins build failed, and figure out how to run the test in CI. |
retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still debugging the Jenkins failure ... seems like an unrelated failure with Hadoop S3 library.
cli/dcos-spark/cli_test.go
Outdated
|
||
import "testing" | ||
|
||
// verify kingpin gets --arg=val |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment doesn't seem to match. How about mentioning testing the multiple spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it also be hard to add a test for scopt
args?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok will add a test sure.
cli/bin/test.sh
Outdated
|
||
cd "${BIN_DIR}/.." | ||
|
||
if [ -z "$GOPATH" -o -z "$(which go)" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skonto How about remove this file, and just add go test
to the build script, build-go.sh? (https://github.com/mesosphere/spark-build/blob/master/cli/bin/build-go.sh) That's how we have it in dcos-commons: we run unit tests inside the build script: https://github.com/mesosphere/dcos-commons/blob/master/tools/build_go_exe.sh#L88
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave it to a casual vote, CLI unit tests or add to the CI tests (by adding edge cases to the commands invoked). Either way, this change is good, just add the appropriate test.
cli/bin/test.sh
Outdated
|
||
cd "${BIN_DIR}/.." | ||
|
||
if [ -z "$GOPATH" -o -z "$(which go)" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would almost prefer that we don't add it to build-go.sh
so that you can build the CLI without invoking the tests (or commenting out lines then forgetting to uncomment them). FWIW I usually build the CLI then just copy it to my .dcos/
dir for testing the CLI.
cli/dcos-spark/cli_test.go
Outdated
|
||
import "testing" | ||
|
||
// verify kingpin gets --arg=val |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it also be hard to add a test for scopt
args?
// clean up any instances of shell-style escaped newlines: "arg1\\narg2" => "arg1 arg2" | ||
argsCleaned := strings.TrimSpace(backslashNewlinePattern.ReplaceAllLiteralString(argsStr, " ")) | ||
argsCleaned := strings.TrimSpace(backslashNewlinePattern.ReplaceAllLiteralString(argsCompacted, " ")) | ||
// HACK: spark-submit uses '--arg val' by convention, while kingpin only supports '--arg=val'. | ||
// translate the former into the latter for kingpin to parse. | ||
args := strings.Split(argsCleaned, " ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CLI is "tested" by the integration tests, they will fail fast if it's not working. Simple unit tests are fine, but IMO opinion we could just add to the CI tests given that the unit tests would be very simple (akin to getter/setter basically).
@susanxhuynh I updated the PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cli/bin/build-go.sh
Outdated
@@ -57,3 +57,5 @@ case "$OSTYPE" in | |||
linux*) strip "${EXE_NAME}${SUFFIX}" | |||
esac | |||
print_file "${EXE_NAME}${SUFFIX}" | |||
|
|||
go test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skonto Could you add this earlier, like around L. 37? So the logic would be: if the unit tests do not pass, we do not bother building the (Windows, OSX, Linux) executables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I thought you build first and then test, sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@susanxhuynh I dont need rebasing I think, I touched different files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skonto My recent fix is needed to get one of the CI tests to pass (something changed in hdfs and I had to add a workaround). So, I think you have to include that fix in order for the Jenkins tests to pass for this PR.
@skonto Only thing missing is rebasing to pull in my CI fix. Otherwise, CI test ("test:shakedown") will not pass in this PR. |
@susanxhuynh done |
Spaces between arguments break the spark cli:
https://github.com/mesosphere/spark-build/blob/master/cli/dcos-spark/submit_builder.go#L302
This will fail when the job is run:
Error is not informative on mesos:
I1016 10:31:50.852361 7563 fetcher.cpp:285] Fetching directly into the sandbox directory
I1016 10:31:50.852404 7563 fetcher.cpp:222] Fetching URI ''
E1016 10:31:50.852429 7563 fetcher.cpp:579] EXIT with status 1: Failed to fetch '': A relative path was passed for the resource but the Mesos framework home was not specified. Please either provide this config option or avoid using a relative path
There are two spaces between those args: --conf spark.streaming.fileStream.minRememberDuration=1200s --driver-cores 1 --conf. I am using underscore to show that.
Another case is when we have:
--conf spark.app.name=kerberosStreaming
Two spaces again.
This will make dcos cli fail immediately:
2017/10/16 13:33:40 Error when parsing --submit-args: expected KEY=VALUE got ''