-
Notifications
You must be signed in to change notification settings - Fork 34
Update Spark CLI for extraJavaOptions
and Fix SubmitBuilder tests
#370
Conversation
Changes are from the dcos-sdk-spark-service repo.
Delete cli_test.go and port over this PR's tests so that we are in a good state.
extraJavaOptions
and Fix SubmitBuilder tests
@samvantran Besides the unit test, did you test it by running a Spark job? |
@@ -43,12 +43,7 @@ type SparkCommand struct { | |||
} | |||
|
|||
func (cmd *SparkCommand) runSubmit(a *kingpin.Application, e *kingpin.ParseElement, c *kingpin.ParseContext) error { | |||
marathonConfig, err := fetchMarathonConfig() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW it may be necessary to make dispatcher additions in order for this to be removable? I don't remember for sure.
Effectively the dispatcher could grab these values from its environment and include them in the task automatically, instead of requiring the CLI to pass them in from the outside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this up again @samvantran.
I have given it a quick look as suggested and have some points to note:
- Break the "Fix SubmitBuilder" tests into a separate PR instead?
- Do the unit tests pass?:
# _/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark
./main.go:50: cannot use ([]byte)(jsonPayload) (type []byte) as type string in argument to client.HTTPServicePostJSON
./main.go:188: cannot use jsonPayload (type []byte) as type string in argument to client.CreateHTTPRawRequest
FAIL _/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark [build failed]
- Run
go fmt
(and set up your editor to do this on save):
$ gofmt -l $(git diff master --name-only | grep '\.go$')
cli/dcos-spark/main.go
cli/dcos-spark/security.go
cli/dcos-spark/submit_builder.go
cli/dcos-spark/submit_builder_test.go
In addition, the code for parsing the args is getting a little difficult to follow and this makes reviewing and debugging failures hard.
I would suggest:
- Breaking out as many of the pieces as possible to smaller functions.
- Investigate using the argument parsing already provided by Kingpin to handle these arguments. Handling quoted and multi-line arguments is not a problem unique to us.
- Ensure that the unit tests pass locally (Including examples that contain escaped strings).
- Ensure that at least the modified files are
go fmt
compliant (Note: this could be considered a nit in other languages, but it really is the standard in the Go community). - Consider more error conditions and how to alert the user of these.
Note, I could be missing the point here, so please let me know if I'm assuming too general a behaviour.
Makefile
Outdated
@@ -23,6 +23,8 @@ HADOOP_VERSION ?= $(shell jq ".default_spark_dist.hadoop_version" "$(ROOT_DIR)/m | |||
SPARK_DIR ?= $(ROOT_DIR)/spark | |||
$(SPARK_DIR): | |||
git clone $(SPARK_REPO_URL) $(SPARK_DIR) | |||
cd $(SPARK_DIR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to use something like git clone --single-branch -b $(SPARK_BRANCH) $(SPARK_REPO_URL) $(SPARK_DIR)
here and make the branch that we are checking out configurable if it needs to be.
The line following this one is going to cause CI to fail if this gets merged and the source branch gets deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooh sorry this was an accident! Should not have been there.
i += 2 | ||
} else if strings.HasSuffix(arg, "'") { // attach the final arg to the string of args without the quote |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about escaped quotes? \'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just added the test:
java_options = []string{"-Djava.thirdConfig=\\'thirdSetting\\'"}
inputArgs = "--driver-java-option='-Djava.thirdConfig=\\'thirdSetting\\'' --conf spark.cores.max=8"
suite.testLongArgInternal(inputArgs, java_options)
and get the following stacktrace:
panic: runtime error: index out of range [recovered]
panic: runtime error: index out of range
goroutine 52 [running]:
testing.tRunner.func1(0xc4200780d0)
/usr/local/go/src/testing/testing.go:622 +0x29d
panic(0x1423780, 0x169ff80)
/usr/local/go/src/runtime/panic.go:489 +0x2cf
_/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark.cleanUpSubmitArgs(0x14a744a, 0x53, 0xc4204a6b28, 0x1, 0x1, 0xc420055958, 0x10b70be, 0x16a0710, 0x0, 0x0, ...)
/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark/submit_builder.go:367 +0x3d8
_/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark.(*CliTestSuite).testLongArgInternal(0xc420195260, 0x14a744a, 0x53, 0xc420055a78, 0x1, 0x1)
/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark/submit_builder_test.go:80 +0x7e
_/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark.(*CliTestSuite).TestStringLongArgs(0xc420195260)
/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark/submit_builder_test.go:113 +0x333
reflect.Value.call(0xc4204b4720, 0xc4204a6260, 0x13, 0x148fb5d, 0x4, 0xc420045f80, 0x1, 0x1, 0xc420032ec0, 0x148e620, ...)
/usr/local/go/src/reflect/value.go:434 +0x91f
reflect.Value.Call(0xc4204b4720, 0xc4204a6260, 0x13, 0xc420032f80, 0x1, 0x1, 0x13d46c6, 0x12, 0x0)
/usr/local/go/src/reflect/value.go:302 +0xa4
github.com/stretchr/testify/suite.Run.func2(0xc4200780d0)
/Users/elezar/go/src/github.com/stretchr/testify/suite/suite.go:102 +0x25f
testing.tRunner(0xc4200780d0, 0xc420380e00)
/usr/local/go/src/testing/testing.go:657 +0x96
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:697 +0x2ca
--driver-java-option='-Djava.thirdConfig=\'thirdSetting\'' --conf spark.cores.max=8exit status 2
FAIL _/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark 0.029s
Error: Tests failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding the test:
inputArgs := "--driver-java-option='-Djava.thirdConfig=assetting --conf spark.cores.max=8"
java_options := []string{"-Djava.thirdConfig=thirdSetting"}
suite.testLongArgInternal(inputArgs, java_options)
Yields:
--- FAIL: TestCliTestSuite (0.00s)
--- FAIL: TestCliTestSuite/TestIncorrectStrings (0.00s)
submit_builder_test.go:87: Expected to find -Djava.thirdConfig=thirdSetting at index 0. Found --driver-java-option=-Djava.thirdConfig=assetting instead
I would expect a relevant error code instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following test case:
inputArgs := "--driver-java-option='-Djava.thirdConfig=\"a setting with a space\"' --conf spark.cores.max=8"
javaOptions := []string{"-Djava.thirdConfig=\"a setting with a space\""}
suite.testLongArgInternal(inputArgs, javaOptions)
Yields:
--- FAIL: TestCliTestSuite (0.00s)
--- FAIL: TestCliTestSuite/TestIncorrectStrings (0.00s)
submit_builder_test.go:82: Failed to parse --driver-java-option='-Djava.thirdConfig="a setting with a space"' --conf spark.cores.max=8, should have 2 args, got 5
submit_builder_test.go:87: Expected to find -Djava.thirdConfig="a setting with a space" at index 0. Found --driver-java-option=-Djava.thirdConfig="a instead
I would expect the spaces in the argument to be maintained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The option here is misspelled ("--driver-java-option"), missing an "s" at the end. So, this may be a different issue.
// other named configuration | ||
// e.g.: next = spark.driver.extraJavaOptions='-Djava.something=somethingelse | ||
// arg = --conf | ||
arg = strings.TrimPrefix(arg, "'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite follow why we need to trim the prefix here and then check why next
also has the prefix.
@elezar I'm no go expert so this PR may actually be beyond me right now. I made an assumption that the approved PR #207 was fine as is and I was just combining it with some of Nick's changes when he migrated Spark to SDK to fix the CLI. fwiw, when I ran
But wrt to @susanxhuynh's question about this actually working when submitting spark jobs, I found it does not. I'm seeing the 2nd, 3rd, n extraJavaOption be passed as application arguments or the URI fetcher failing. So in short:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing to check: if the CLI unit tests fail, will the CI job fail?
val = newSparkVal("executor-java-options", "spark.executor.extraJavaOptions", "Extra Java options to pass to the executors.") | ||
val.flag(submit).StringVar(&val.s) | ||
args.stringVals = append(args.stringVals, val) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we need to add this new option. The user can specify this with "--conf spark.executor.extraJavaOptions".
i += 2 | ||
} else if strings.HasSuffix(arg, "'") { // attach the final arg to the string of args without the quote |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The option here is misspelled ("--driver-java-option"), missing an "s" at the end. So, this may be a different issue.
// join this arg to the next arg if...: | ||
// 1. we're not at the last arg in the array | ||
// 2. we start with "--" | ||
// 3. we don't already contain "=" (already joined) | ||
// 4. we aren't a boolean value (no val to join) | ||
|
||
|
||
// if this is a configuration flag like --conf or --driver-driver-options that doesn't have a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "--driver-driver-options" should be "--driver-java-options"
} | ||
// already joined or at the end, pass through | ||
argsEquals = append(argsEquals, cleanedArg) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can document somewhere that we expect single quotes to be used when a value has spaces in it.
} else { | ||
args.properties["spark.mesos.executor.docker.image"] = cmd.submitDockerImage | ||
imageSource = "flag: --docker-image" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think removing this section will change the existing behavior: if a special Docker image was provided when installing the Dispatcher, that same image will also be used for Drivers and/or Executors (see the logic below). Unless we make some changes in the Dispatcher.
@@ -71,6 +75,39 @@ func (suite *CliTestSuite) TestScoptAppArgs() { | |||
} | |||
} | |||
|
|||
func (suite *CliTestSuite) testLongArgInternal(inputArgs string, expectedArgs []string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "Helper" is a better name for this function rather than "Internal".
suite.testLongArgInternal(inputArgs, java_options) | ||
|
||
java_options = append(java_options, "-Djava.thirdConfig=thirdSetting") | ||
inputArgs = "--driver-java-option='-Djava.firstConfig=firstSetting -Djava.secondConfig=secondSetting -Djava.thirdConfig=thirdSetting' --conf spark.cores.max=8" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in "--driver-java-option" (missing an "s")
951535a
to
e4c0a42
Compare
Closing this PR to separate fixing tests and adding functionality |
Resolves DCOS-37965 and DCOS-33686
Readdresses PR #207
This PR fixes CLI tests and allows Spark CLI to accept multiple extraJavaOptions when submitting a Spark job. It incorporates changes Nick made when he ported Spark onto the SDK as well as old commits from Art for long strings of Spark arguments.