Skip to content
This repository has been archived by the owner on Dec 4, 2024. It is now read-only.

Update Spark CLI for extraJavaOptions and Fix SubmitBuilder tests #370

Closed
wants to merge 8 commits into from

Conversation

samvantran
Copy link
Contributor

@samvantran samvantran commented Jun 27, 2018

Resolves DCOS-37965 and DCOS-33686
Readdresses PR #207

This PR fixes CLI tests and allows Spark CLI to accept multiple extraJavaOptions when submitting a Spark job. It incorporates changes Nick made when he ported Spark onto the SDK as well as old commits from Art for long strings of Spark arguments.

@samvantran samvantran added the wip label Jun 27, 2018
@samvantran samvantran requested review from susanxhuynh, nickbp and elezar and removed request for susanxhuynh June 28, 2018 19:58
@samvantran samvantran changed the title [WIP] Update Spark CLI for spark.driver.extraJavaOptions Update Spark CLI for spark.driver.extraJavaOptions and Fix SubmitBuilder tests Jun 28, 2018
@samvantran samvantran changed the title Update Spark CLI for spark.driver.extraJavaOptions and Fix SubmitBuilder tests Update Spark CLI for extraJavaOptions and Fix SubmitBuilder tests Jun 28, 2018
@susanxhuynh
Copy link
Contributor

@samvantran Besides the unit test, did you test it by running a Spark job?

@@ -43,12 +43,7 @@ type SparkCommand struct {
}

func (cmd *SparkCommand) runSubmit(a *kingpin.Application, e *kingpin.ParseElement, c *kingpin.ParseContext) error {
marathonConfig, err := fetchMarathonConfig()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW it may be necessary to make dispatcher additions in order for this to be removable? I don't remember for sure.

Effectively the dispatcher could grab these values from its environment and include them in the task automatically, instead of requiring the CLI to pass them in from the outside.

Copy link
Contributor

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up again @samvantran.

I have given it a quick look as suggested and have some points to note:

  • Break the "Fix SubmitBuilder" tests into a separate PR instead?
  • Do the unit tests pass?:
# _/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark
./main.go:50: cannot use ([]byte)(jsonPayload) (type []byte) as type string in argument to client.HTTPServicePostJSON
./main.go:188: cannot use jsonPayload (type []byte) as type string in argument to client.CreateHTTPRawRequest
FAIL	_/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark [build failed]
  • Run go fmt (and set up your editor to do this on save):
$ gofmt -l $(git diff master --name-only | grep '\.go$')
cli/dcos-spark/main.go
cli/dcos-spark/security.go
cli/dcos-spark/submit_builder.go
cli/dcos-spark/submit_builder_test.go

In addition, the code for parsing the args is getting a little difficult to follow and this makes reviewing and debugging failures hard.

I would suggest:

  • Breaking out as many of the pieces as possible to smaller functions.
  • Investigate using the argument parsing already provided by Kingpin to handle these arguments. Handling quoted and multi-line arguments is not a problem unique to us.
  • Ensure that the unit tests pass locally (Including examples that contain escaped strings).
  • Ensure that at least the modified files are go fmt compliant (Note: this could be considered a nit in other languages, but it really is the standard in the Go community).
  • Consider more error conditions and how to alert the user of these.

Note, I could be missing the point here, so please let me know if I'm assuming too general a behaviour.

Makefile Outdated
@@ -23,6 +23,8 @@ HADOOP_VERSION ?= $(shell jq ".default_spark_dist.hadoop_version" "$(ROOT_DIR)/m
SPARK_DIR ?= $(ROOT_DIR)/spark
$(SPARK_DIR):
git clone $(SPARK_REPO_URL) $(SPARK_DIR)
cd $(SPARK_DIR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to use something like git clone --single-branch -b $(SPARK_BRANCH) $(SPARK_REPO_URL) $(SPARK_DIR) here and make the branch that we are checking out configurable if it needs to be.

The line following this one is going to cause CI to fail if this gets merged and the source branch gets deleted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh sorry this was an accident! Should not have been there.

i += 2
} else if strings.HasSuffix(arg, "'") { // attach the final arg to the string of args without the quote
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about escaped quotes? \'?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added the test:

	   java_options = []string{"-Djava.thirdConfig=\\'thirdSetting\\'"}
	   inputArgs = "--driver-java-option='-Djava.thirdConfig=\\'thirdSetting\\'' --conf spark.cores.max=8"
	   suite.testLongArgInternal(inputArgs, java_options)

and get the following stacktrace:

panic: runtime error: index out of range [recovered]
	panic: runtime error: index out of range

goroutine 52 [running]:
testing.tRunner.func1(0xc4200780d0)
	/usr/local/go/src/testing/testing.go:622 +0x29d
panic(0x1423780, 0x169ff80)
	/usr/local/go/src/runtime/panic.go:489 +0x2cf
_/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark.cleanUpSubmitArgs(0x14a744a, 0x53, 0xc4204a6b28, 0x1, 0x1, 0xc420055958, 0x10b70be, 0x16a0710, 0x0, 0x0, ...)
	/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark/submit_builder.go:367 +0x3d8
_/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark.(*CliTestSuite).testLongArgInternal(0xc420195260, 0x14a744a, 0x53, 0xc420055a78, 0x1, 0x1)
	/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark/submit_builder_test.go:80 +0x7e
_/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark.(*CliTestSuite).TestStringLongArgs(0xc420195260)
	/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark/submit_builder_test.go:113 +0x333
reflect.Value.call(0xc4204b4720, 0xc4204a6260, 0x13, 0x148fb5d, 0x4, 0xc420045f80, 0x1, 0x1, 0xc420032ec0, 0x148e620, ...)
	/usr/local/go/src/reflect/value.go:434 +0x91f
reflect.Value.Call(0xc4204b4720, 0xc4204a6260, 0x13, 0xc420032f80, 0x1, 0x1, 0x13d46c6, 0x12, 0x0)
	/usr/local/go/src/reflect/value.go:302 +0xa4
github.com/stretchr/testify/suite.Run.func2(0xc4200780d0)
	/Users/elezar/go/src/github.com/stretchr/testify/suite/suite.go:102 +0x25f
testing.tRunner(0xc4200780d0, 0xc420380e00)
	/usr/local/go/src/testing/testing.go:657 +0x96
created by testing.(*T).Run
	/usr/local/go/src/testing/testing.go:697 +0x2ca
--driver-java-option='-Djava.thirdConfig=\'thirdSetting\'' --conf spark.cores.max=8exit status 2
FAIL	_/Users/elezar/src/mesosphere/spark-build/cli/dcos-spark	0.029s
Error: Tests failed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the test:

inputArgs := "--driver-java-option='-Djava.thirdConfig=assetting --conf spark.cores.max=8"
java_options := []string{"-Djava.thirdConfig=thirdSetting"}
suite.testLongArgInternal(inputArgs, java_options)

Yields:

--- FAIL: TestCliTestSuite (0.00s)
    --- FAIL: TestCliTestSuite/TestIncorrectStrings (0.00s)
    	submit_builder_test.go:87: Expected to find -Djava.thirdConfig=thirdSetting at index 0. Found --driver-java-option=-Djava.thirdConfig=assetting instead

I would expect a relevant error code instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following test case:

inputArgs := "--driver-java-option='-Djava.thirdConfig=\"a setting with a space\"' --conf spark.cores.max=8"
javaOptions := []string{"-Djava.thirdConfig=\"a setting with a space\""}
suite.testLongArgInternal(inputArgs, javaOptions)

Yields:

--- FAIL: TestCliTestSuite (0.00s)
    --- FAIL: TestCliTestSuite/TestIncorrectStrings (0.00s)
    	submit_builder_test.go:82: Failed to parse --driver-java-option='-Djava.thirdConfig="a setting with a space"' --conf spark.cores.max=8, should have 2 args, got 5
    	submit_builder_test.go:87: Expected to find -Djava.thirdConfig="a setting with a space" at index 0. Found --driver-java-option=-Djava.thirdConfig="a instead

I would expect the spaces in the argument to be maintained.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The option here is misspelled ("--driver-java-option"), missing an "s" at the end. So, this may be a different issue.

// other named configuration
// e.g.: next = spark.driver.extraJavaOptions='-Djava.something=somethingelse
// arg = --conf
arg = strings.TrimPrefix(arg, "'")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow why we need to trim the prefix here and then check why next also has the prefix.

@samvantran
Copy link
Contributor Author

@elezar I'm no go expert so this PR may actually be beyond me right now. I made an assumption that the approved PR #207 was fine as is and I was just combining it with some of Nick's changes when he migrated Spark to SDK to fix the CLI.

fwiw, when I ran go test I did see a pass

╭─  ~/go/src/github.com/mesosphere/spark-build/cli/dcos-spark                                                                                                                                           27.89    master   3   09:32:48
╰ go test -v
=== RUN   TestCliTestSuite
=== RUN   TestCliTestSuite/TestCleanUpSubmitArgs
=== RUN   TestCliTestSuite/TestPayloadCustomImageNoExecutor
=== RUN   TestCliTestSuite/TestPayloadSimple
Enabling forcePullImage by default. To disable this, set spark.mesos.executor.docker.forcePullImage=false
=== RUN   TestCliTestSuite/TestPayloadWithSecret
Enabling forcePullImage by default. To disable this, set spark.mesos.executor.docker.forcePullImage=false
Using Kerberos principal 'client@local'
Enabling forcePullImage by default. To disable this, set spark.mesos.executor.docker.forcePullImage=false
Using Kerberos principal 'client@local'
=== RUN   TestCliTestSuite/TestSaslSecret
Enabling forcePullImage by default. To disable this, set spark.mesos.executor.docker.forcePullImage=false
=== RUN   TestCliTestSuite/TestScoptAppArgs
=== RUN   TestCliTestSuite/TestStringLongArgs
--- PASS: TestCliTestSuite (0.00s)
    --- PASS: TestCliTestSuite/TestCleanUpSubmitArgs (0.00s)
    --- PASS: TestCliTestSuite/TestPayloadCustomImageNoExecutor (0.00s)
    --- PASS: TestCliTestSuite/TestPayloadSimple (0.00s)
    --- PASS: TestCliTestSuite/TestPayloadWithSecret (0.00s)
    --- PASS: TestCliTestSuite/TestSaslSecret (0.00s)
    --- PASS: TestCliTestSuite/TestScoptAppArgs (0.00s)
    --- PASS: TestCliTestSuite/TestStringLongArgs (0.00s)
PASS
ok  	github.com/mesosphere/spark-build/cli/dcos-spark	0.023s

But wrt to @susanxhuynh's question about this actually working when submitting spark jobs, I found it does not. I'm seeing the 2nd, 3rd, n extraJavaOption be passed as application arguments or the URI fetcher failing.

So in short:

Copy link
Contributor

@susanxhuynh susanxhuynh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing to check: if the CLI unit tests fail, will the CI job fail?

val = newSparkVal("executor-java-options", "spark.executor.extraJavaOptions", "Extra Java options to pass to the executors.")
val.flag(submit).StringVar(&val.s)
args.stringVals = append(args.stringVals, val)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we need to add this new option. The user can specify this with "--conf spark.executor.extraJavaOptions".

i += 2
} else if strings.HasSuffix(arg, "'") { // attach the final arg to the string of args without the quote
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The option here is misspelled ("--driver-java-option"), missing an "s" at the end. So, this may be a different issue.

// join this arg to the next arg if...:
// 1. we're not at the last arg in the array
// 2. we start with "--"
// 3. we don't already contain "=" (already joined)
// 4. we aren't a boolean value (no val to join)


// if this is a configuration flag like --conf or --driver-driver-options that doesn't have a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "--driver-driver-options" should be "--driver-java-options"

}
// already joined or at the end, pass through
argsEquals = append(argsEquals, cleanedArg)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can document somewhere that we expect single quotes to be used when a value has spaces in it.

} else {
args.properties["spark.mesos.executor.docker.image"] = cmd.submitDockerImage
imageSource = "flag: --docker-image"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think removing this section will change the existing behavior: if a special Docker image was provided when installing the Dispatcher, that same image will also be used for Drivers and/or Executors (see the logic below). Unless we make some changes in the Dispatcher.

@@ -71,6 +75,39 @@ func (suite *CliTestSuite) TestScoptAppArgs() {
}
}

func (suite *CliTestSuite) testLongArgInternal(inputArgs string, expectedArgs []string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "Helper" is a better name for this function rather than "Internal".

suite.testLongArgInternal(inputArgs, java_options)

java_options = append(java_options, "-Djava.thirdConfig=thirdSetting")
inputArgs = "--driver-java-option='-Djava.firstConfig=firstSetting -Djava.secondConfig=secondSetting -Djava.thirdConfig=thirdSetting' --conf spark.cores.max=8"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in "--driver-java-option" (missing an "s")

@samvantran samvantran force-pushed the DCOS-33686-extraJavaOptions branch from 951535a to e4c0a42 Compare July 3, 2018 01:25
@samvantran
Copy link
Contributor Author

Closing this PR to separate fixing tests and adding functionality

@samvantran samvantran closed this Jul 19, 2018
@samvantran samvantran deleted the DCOS-33686-extraJavaOptions branch September 4, 2018 16:45
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants