Support MetricNameSpace in executors and include shuffle to shuffleservice metrics #532
The unit test build for this PR seems to have been stuck for almost 3 days.
@ssuchter @kimoonkim, should the unit test build time out eventually?
Some tests failed due to OOM. Can someone who has admin access to the Jenkins instance kill the build?
@foxish I think we can time out the Jenkins jobs, say, after 8 hours. I'll see if I can make this change to the Jenkins jobs. @liyinan926 Yes, let me find and kill the hanging build.
I just checked. The build was killed already by Jenkins. The build log console says:
Hmm, also the unit test had set up timeout already. The config page says:
I guess it doesn't always work :-(
@liyinan926 @kimoonkim @foxish Sorry, forgot to update, but I killed the test earlier and added a 60 minute timeout to the configuration around noon today.
rerun all tests please
I'm a little confused as to what this property actually does. I see the configuration of |
@@ -201,7 +202,8 @@ private[spark] object CoarseGrainedExecutorBackend extends Logging {
         clientMode = true)
       val driver = fetcher.setupEndpointRefByURI(driverUrl)
       val cfg = driver.askSync[SparkAppConfig](RetrieveSparkAppConfig(executorId))
-      val props = cfg.sparkProperties ++ Seq[(String, String)](("spark.app.id", appId))
+      val props = cfg.sparkProperties ++ Seq[(String, String)](("spark.app.id", appId),
+        ("spark.metrics.namespace", metricsNamespace))
Why can't the driver provide this as it would with any other Spark property? i.e. if I set spark.metrics.namespace
in my SparkConf then shouldn't that be what's used here?
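The question above can be illustrated with a small, Spark-free sketch. It assumes the driver's properties reach the executor as plain key/value pairs, the way `cfg.sparkProperties` is used in the diff. `driverProps`, `executorProps`, and all values are hypothetical and chosen only for illustration.

```scala
object NamespacePropagationSketch {
  // Hypothetical stand-in for cfg.sparkProperties: the properties the
  // driver hands to each executor. The values are made up.
  val driverProps: Seq[(String, String)] = Seq(
    "spark.app.name" -> "SparkPi",
    "spark.metrics.namespace" -> "my_namespace" // set by the user in SparkConf
  )

  // Mirrors the pre-change line in the diff: only spark.app.id is appended
  // on the executor side; everything else comes from the driver.
  def executorProps(appId: String): Seq[(String, String)] =
    driverProps ++ Seq[(String, String)](("spark.app.id", appId))

  def main(args: Array[String]): Unit = {
    val props = executorProps("app-123").toMap
    // The namespace is already present without any extra tuple.
    println(props.get("spark.metrics.namespace"))
  }
}
```

Under that assumption, a namespace set in the driver's configuration is already visible to the executor, which is the behavior the reviewer expects.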
…le as that is pulled from driver spark conf
Hello @mccheah - thanks for the feedback and review. The starting base for the PR was an earlier version, and the custom metrics namespace in executors was missing back then. I have reverted the changes and updated the PR text to reflect the latest commit.
Is the idea here to backport the Spark support for custom metric namespaces to this fork? I ask because I'd expect this to be resolved either via the upstreaming process or via our next rebase.
@erikerlandson After rebasing with latest branch-2.2-kubernetes the original PR has reduced the scope to support custom metrics namespace for
@matyix Oh I see - I was referring to the next time we rebase against an upstream release (spark-2.2.1 or spark-2.3, etc.)
@erikerlandson Shall I push this PR upstream as well?
Closing this as it's redundant; PR apache#19775 fixes this one as well.
apache-spark-on-k8s#532) * [SPARK-25299] Use the shuffle writer plugin for the SortShuffleWriter. * Remove unused * Handle empty partitions properly. * Adjust formatting * Don't close streams twice. Because compressed output streams don't like it. * Clarify comment
What changes were proposed in this pull request?
Support MetricNameSpace in shuffle to shuffleservice metrics. Pass the spark.metrics.namespace configuration to shuffle so that the shuffleservice can publish metrics with a custom namespace. In the case of the shuffleservice, the shuffle prefix is used in the metrics.
How was this patch tested?
Executed all unit and integration tests.
Manual testing by deploying a Spark cluster, a Prometheus server, and a Pushgateway, running SparkPi, and checking for the shuffle metrics in Pushgateway/Grafana.
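The naming scheme described in this pull request can be sketched roughly as follows. This is a simplified, Spark-free model, not the code from the patch. `registryName` is a hypothetical helper, the source names are illustrative, and the fallback to `spark.app.id` when `spark.metrics.namespace` is unset is an assumption based on how upstream Spark documents the property.

```scala
object MetricNamespaceSketch {
  // Hypothetical helper: builds a dotted metric name from
  //   <namespace>.<instance id or prefix>.<source name>
  // The namespace is spark.metrics.namespace if set; falling back to
  // spark.app.id is an assumption, not something this PR states.
  def registryName(
      props: Map[String, String],
      instanceId: String,
      sourceName: String): String = {
    val namespace = props.get("spark.metrics.namespace")
      .orElse(props.get("spark.app.id"))
    // Drop missing or empty parts, then join the rest with dots.
    (namespace.toSeq ++ Seq(instanceId, sourceName))
      .filter(_.nonEmpty)
      .mkString(".")
  }

  def main(args: Array[String]): Unit = {
    val props = Map("spark.metrics.namespace" -> "my_namespace")
    // Executor metrics use the executor id; the shuffle service uses
    // the fixed "shuffle" prefix, as described in the PR text.
    println(registryName(props, "1", "executor"))
    println(registryName(props, "shuffle", "shuffleService"))
  }
}
```

In this model the shuffle service's metrics share the same root namespace as the executors' metrics, so they can be grouped together in Pushgateway/Grafana.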