add cf task metrics #56

Merged
merged 2 commits into from
Jul 12, 2023

Conversation

risicle
Contributor

@risicle risicle commented Jul 7, 2022

Hi,

We noticed cf_exporter was missing metrics for cf tasks. This information would have proved useful when we had a tenant spinning up nightly-but-never-ending tasks that used ever-accumulating amounts of resources.

Firstly, this required a bump to the vendored go-cfclient to include the Relationships field in Task structs, added in cloudfoundry/go-cfclient#309.

Secondly, note that this collector is a little different from the others because it performs some aggregation up-front. This is because all the potential labels that would allow individual tasks to be identified are likely to be high-cardinality, ephemeral identifiers, which could lead to the dreaded "cardinality explosion".

Of the information we can access about a task, the guid is unique for each task launched, so an hourly task running 24/7 would produce a lot of these. sequence_id on its own is low-ish cardinality, but when multiplied against the application guids it would cause similar problems. A new droplet_guid is generated every time an application is re-staged (correct me if I'm wrong), so a team doing a release a day would again start to generate a lot of unique combinations. Though a task's name could be heavily reused, in practice I suspect most users fall back to the default, which is randomly generated.

So that just leaves us with application_id and state as labels.
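
To illustrate the up-front aggregation described above, here is a minimal sketch in Go. It is not the code in this PR: the `Task` struct, `groupKey` and `countTasks` are hypothetical names made up for the example, and the sample tasks are invented. It only shows how per-task records can be collapsed into one count per (application_id, state) pair so that no per-task identifier ever becomes a label value.

```go
package main

import "fmt"

// Task is a hypothetical, trimmed-down stand-in for a v3 task record.
// It is NOT the go-cfclient or cli type; the field names are assumptions.
type Task struct {
	GUID    string // unique per task, deliberately never used as a label
	AppGUID string
	State   string
}

// groupKey holds the only two labels that survive aggregation.
type groupKey struct {
	AppGUID string
	State   string
}

// countTasks collapses individual tasks into one count per
// (application, state) pair.
func countTasks(tasks []Task) map[groupKey]int {
	counts := make(map[groupKey]int)
	for _, t := range tasks {
		counts[groupKey{AppGUID: t.AppGUID, State: t.State}]++
	}
	return counts
}

func main() {
	// Invented sample data, for illustration only.
	tasks := []Task{
		{GUID: "t1", AppGUID: "app-a", State: "RUNNING"},
		{GUID: "t2", AppGUID: "app-a", State: "RUNNING"},
		{GUID: "t3", AppGUID: "app-a", State: "SUCCEEDED"},
		{GUID: "t4", AppGUID: "app-b", State: "FAILED"},
	}
	// Map iteration order is not defined, so the lines may print in any order.
	for k, n := range countTasks(tasks) {
		fmt.Printf("application_id=%s state=%s count=%d\n", k.AppGUID, k.State, n)
	}
}
```

With this shape, the number of series is bounded by the number of applications times the number of task states, however many tasks are launched.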

This uses the v3 task interface, so it is hidden behind the "v3 enabled" flag, much like IsolationSegmentsCollector.

@ArthurHlt

Hello, those metrics are available in firehose_exporter. For the use case you mentioned (never-ending tasks), you can use the metric firehose_value_metric_cc_tasks_running_count, for example.

You can also try the replacement for firehose_exporter (close to a drop-in) in the PR cloudfoundry/firehose_exporter#63 if you have a big cluster (we use it in production).

@risicle
Contributor Author

risicle commented Jul 7, 2022

I may be missing something, but I don't think those firehose metrics would have allowed us to break it down by application, and thus ultimately discover its owner.

@risicle
Contributor Author

risicle commented Jul 15, 2022

☝️ Have added another metric, the created_at of the oldest task per group. This allows long-running tasks to be spotted while still avoiding the high-cardinality labels that would be needed to identify individual tasks.
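
As a rough sketch of how such a per-group "oldest created_at" value can be derived (again not the PR's actual code: `Task`, `groupKey` and `oldestCreatedAt` are hypothetical names and the timestamps are invented), the earliest creation time is kept for each (application_id, state) group:

```go
package main

import (
	"fmt"
	"time"
)

// Task is a hypothetical, trimmed-down stand-in for a v3 task record.
// It is NOT the go-cfclient or cli type; the field names are assumptions.
type Task struct {
	AppGUID   string
	State     string
	CreatedAt time.Time
}

// groupKey holds the two labels retained after aggregation.
type groupKey struct {
	AppGUID string
	State   string
}

// oldestCreatedAt returns, for each (application, state) group, the
// creation time of the oldest task in that group.
func oldestCreatedAt(tasks []Task) map[groupKey]time.Time {
	oldest := make(map[groupKey]time.Time)
	for _, t := range tasks {
		k := groupKey{AppGUID: t.AppGUID, State: t.State}
		if cur, ok := oldest[k]; !ok || t.CreatedAt.Before(cur) {
			oldest[k] = t.CreatedAt
		}
	}
	return oldest
}

func main() {
	// Invented sample data, for illustration only.
	base := time.Date(2022, time.July, 1, 0, 0, 0, 0, time.UTC)
	tasks := []Task{
		{AppGUID: "app-a", State: "RUNNING", CreatedAt: base.Add(48 * time.Hour)},
		{AppGUID: "app-a", State: "RUNNING", CreatedAt: base},
		{AppGUID: "app-b", State: "RUNNING", CreatedAt: base.Add(24 * time.Hour)},
	}
	// Map iteration order is not defined, so the lines may print in any order.
	for k, ts := range oldestCreatedAt(tasks) {
		fmt.Printf("application_id=%s state=%s oldest_created_at=%d\n",
			k.AppGUID, k.State, ts.Unix())
	}
}
```

Exposed as a Unix timestamp, a value like this can be compared against the current time on the query side to flag groups whose oldest RUNNING task exceeds some age.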

@risicle
Contributor Author

risicle commented Oct 31, 2022

Have rebased following the merge of #57.

Most of what I've written above is still applicable.

This now provisionally depends on cloudfoundry/cli#2343 to allow me to make calls to the /tasks endpoint using MakeListRequest. As a temporary measure to ease review, I've manually hacked the change into the vendored dependency, but we would need to bump our cli dependency to a version that includes it before merging.

@psycofdj
Contributor

psycofdj commented Nov 9, 2022

Thanks @risicle for this very useful contribution!

@risicle risicle force-pushed the ris-cf-tasks branch 2 times, most recently from 2b9e2e6 to 176d954 Compare November 10, 2022 13:36
Member

@a-b a-b left a comment

LGTM

@risicle
Contributor Author

risicle commented Jan 19, 2023

I think I've thoroughly failed to get cloudfoundry/cli#2343 merged, so if anyone ever wants to see this functionality in cf exporter they're probably going to have to take on the task (pun intended) of getting that merged themselves.

@github-actions

This PR is stale because it has been open 60 days with no activity. Comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Apr 26, 2023
@github-actions

github-actions bot commented May 1, 2023

This PR was automatically closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this May 1, 2023
@benjaminguttmann-avtq
Contributor

@risicle it happened, your CLI PR is merged :D

@risicle
Contributor Author

risicle commented Jun 9, 2023

I noticed - I'm still in shock.

Have to rebase this now...

risicle added 2 commits June 9, 2023 11:10
this is a version which is GetTasksRequest-aware
pre-aggregated to avoid exposing any transient and therefore
high cardinality labels

oldest_created_at allows long-running tasks to be spotted while
still avoiding the high-cardinality labels that would be needed
to identify individual tasks

disable by default for its initial introduction to avoid
nasty surprises
@risicle risicle marked this pull request as ready for review June 9, 2023 12:30
@risicle
Contributor Author

risicle commented Jun 9, 2023

Tested again running as a cf app, works fine.

@github-actions github-actions bot removed the Stale label Jun 10, 2023
@risicle risicle requested a review from psycofdj June 12, 2023 10:30
@psycofdj psycofdj requested a review from mdimiceli June 15, 2023 08:14
@psycofdj psycofdj requested review from gmllt and removed request for psycofdj June 15, 2023 08:14
@psycofdj psycofdj dismissed their stale review June 15, 2023 08:22

Outdated; requested review from gmllt and/or mdimiceli.

Contributor

@mdimiceli mdimiceli left a comment

Looks good to me, thanks @risicle !

Member

@gmllt gmllt left a comment

Good for me, thanks @risicle for the contribution

@risicle
Contributor Author

risicle commented Jul 7, 2023

🎂 Happy anniversary! 🎂

@mdimiceli mdimiceli merged commit ddb3176 into cloudfoundry:master Jul 12, 2023