Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[docker] disable mesos_task and chronos_job tags because of high cardinality #3414

Merged
merged 1 commit into from
Jul 6, 2017

Conversation

xvello
Copy link
Contributor

@xvello xvello commented Jul 5, 2017

What does this PR do?

These tags have a container cardinality.

  • Kept them commented out with a message to explain why
  • Updated unit tests
  • Only marathon_app remains (similar to a kube_deployment)

After 5.15 settles in, and if requested, we could patch BaseUtil to support two cardinality tiers for tags and only use the high card for docker metrics.

@xvello xvello added this to the 5.15 milestone Jul 5, 2017
@xvello xvello requested a review from hkaj July 5, 2017 13:12
@xvello xvello force-pushed the xvello/mesosutil_cardinality branch from 61eafc3 to 1ad44d5 Compare July 5, 2017 13:41
Kept them commented out with a message to explain why
@xvello xvello force-pushed the xvello/mesosutil_cardinality branch from 1ad44d5 to 80aab0b Compare July 6, 2017 09:37
Copy link
Member

@hkaj hkaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

@xvello xvello merged commit 28ea87a into master Jul 6, 2017
@xvello xvello deleted the xvello/mesosutil_cardinality branch July 6, 2017 10:08
@maciekm
Copy link

maciekm commented Jul 6, 2017

@xvello thanks for the great job on the mesos integration, waiting for the 5.15 to be released 😃

However could you please reconsider the chronos_job tag? its cardinality is the same as of marathon_app.

sample graph using the chronos_job tag grouped by mesos_task or container_name:
screen shot 2017-07-06 at 12 39 47

BTW: mesos_task cardinality is the same as of the already collected container_name.

@xvello
Copy link
Contributor Author

xvello commented Jul 6, 2017

Hi @maciekm and thank you for your input.

  • could you please send me docker inspect output for a few chronos jobs you are running? My understanding was that this unique for every run, like MESOS_TASK_ID. If not, I'll add it back.
  • container_name is currently collected in metrics from the docker_daemon check, but not for autodiscovery checks, as this would create a lot of contexts. As I stated in the PR, we'd need to patch BaseUtil to allow docker_daemon to query additional high card tags, but as feature freeze is tomorrow, this unfortunately won't make the 5.15 cut

@maciekm
Copy link

maciekm commented Jul 6, 2017

we run chronos 2.4 (not the newest) but even in master:
https://github.com/mesos/chronos/blob/378cbfc393a7348374350e28d4d8c33d50989c85/src/main/scala/org/apache/mesos/chronos/scheduler/mesos/MesosTaskBuilder.scala#L143

"CHRONOS_JOB_NAME" -> job.name,

Its a static value without any dynamic task IDs or timestamps that stays the same across runs.

Redacted sample from docker inspect, changed app name to app1:

            "Env": [
                "CHRONOS_JOB_NAME=app1_process-orders",
                "CHRONOS_JOB_OWNER=",
                "CHRONOS_RESOURCE_CPU=0.5",
                "CHRONOS_RESOURCE_DISK=256.0",
                "CHRONOS_RESOURCE_MEM=1024.0",
                "HOST=mesosslave-private-04ca0786edce2e630",
                "MESOS_CONTAINER_NAME=mesos-88004032-23ef-492f-a8c5-bc3d4e26edb1-S3.b28995b3-7071-47b2-b065-e8e1945be3ce",
                "MESOS_SANDBOX=/mnt/mesos/sandbox",
                "mesos_task_id=ct:1499321640000:0:app1_process-orders:",
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ]

Hope the chronos_job tag can get into 5.15 so we could remove our own hack from docker_daemon.py check 😄

@xvello
Copy link
Contributor Author

xvello commented Jul 6, 2017

Thanks @maciekm, I'll add back chronos_job and will add chronos_job_owner (if not empty) to make it up to you :)

@maciekm
Copy link

maciekm commented Jul 6, 2017

😹 that is awesome news, thanks @xvello

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants