-
Notifications
You must be signed in to change notification settings - Fork 118
Add support for publishing Spark metrics into Prometheus #531
Add support for publishing Spark metrics into Prometheus #531
Conversation
This looks like should be submitted to upstream apache git repo instead. |
Hello @tnachen and @foxish - reason I submitted here is that Prometheus is usually the Additionally this is more convenient for me as I am using the Spark-on-K8S fork only and all the tests, etc I did was on K8S (as Prometheus, Pushgateway, etc). Is there any chance it can land here? Ultimately this fork (hopefully) will be merged upstream so the Prometheus based monitoring will be available for all Spark users. |
I'm okay with letting it rest here for the time being, as long as we have a plan to upstream it along with the other parts. |
Yes @foxish I totally agree with you, makes only sense if this gets upstreamed. Meanwhile will fix the checkstyle violation which causes the integration test to fail. |
Yep, since this is orthogonal to the k8s changes made to Spark this should ideally go upstream. @matyix can you please open an issue upstream at https://issues.apache.org/jira/browse/SPARK so we can make prometheus available to all Spark users, not just k8s users? |
Hello @ash211 - I've created the issue in the Apache Jira - https://issues.apache.org/jira/browse/SPARK-22343 - please let me know what are the next steps. |
Hello @matyix, this is a great work! I wanted to use it, but our pushgateway use Thanks in advance. |
FYI: I did "monkey patch" by simply replacing the two hardcoded occurrence of the "http" string with "https" and it works! |
Hello @aloiscochard - I have updated the PR with support for changing protocols. @ash211 @foxish @tnachen - the check-style issue is fixed as well, so now all checks are passing. Let me know if anything else needed to have the PR merged. |
I'm a bit confused from the thread above - I agree with the philosophy that this ought to go upstream independent from spark-on-kube. If so, I'd suggest that this PR be transferred to be directly against apache/spark instead, referencing SPARK-22343 |
@erikerlandson @foxish I've pushed the same PR (without K8S specifics upstream), please find it here. |
What changes were proposed in this pull request?
Publishing Spark metrics into Prometheus - as discussed earlier in #384. Implemented a metrics sink that publishes Spark metrics into Prometheus via Prometheus Pushgateway. Metrics data published by Spark is based on Dropwizard. The format of Spark metrics is not supported natively by Prometheus thus these are converted using DropwizardExports prior pushing metrics to the pushgateway.
Also the default Prometheus pushgateway client API implementation does not support metrics timestamp thus the client API has been ehanced to enrich metrics data with timestamp.
How was this patch tested?
This PR is not affecting the existing code base and not altering the functionality. Nevertheless, I have executed all
unit and integration
tests. Also this setup has been deployed with Helm and running for a few weeks now for several long and short-running workloads on a K8S cluster (on AWS) and been monitored via Prometheus (Prometheus 1.7.1 + Pushgateway 0.3.1).Manual
testing through deploying a Spark cluster, Prometheus server, Pushgateway and ran SparkPi.