-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
store: add metrics for VerticalPodAutoscaler objects #791
store: add metrics for VerticalPodAutoscaler objects #791
Conversation
Welcome @milesbxf! |
4eac749
to
ef2e5f4
Compare
There is an available and DefaultCollectors. For it not to be enabled by default, you want to only add it to the available, but not the default map. |
@brancz yep, I did that; however not having the
Is this expected? Should I raise another issue? |
ef2e5f4
to
0f4ed6e
Compare
I've run into another obstacle - for the |
I'm ok to not have them be part of e2e for now, especially as the metrics are marked as experimental. This would out of other things be a criteria for graduating them to stable metrics. You're right about the error, that's a problem, and this code is wrong: kube-state-metrics/pkg/options/types.go Lines 87 to 91 in 1d5b808
It shouldn't be checking on the kube-state-metrics/internal/store/builder.go Line 115 in 4ee79c5
Feel free to introduce a function or export or move code as needed to make this work. Important is that the model of available collectors vs default collectors ends up the way we initially expected 🙂 . |
internal/store/vpa.go
Outdated
) | ||
|
||
var ( | ||
descVerticalPodAutoscalerLabelsName = "kube_vpa_labels" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write out vpa, this is done for all metrics in kube-state-metrics (and those that are not are targeted to be changed for a breaking release)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear - you mean rename kube_vpa
-> kube_verticalpodautoscaler
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
internal/store/vpa.go
Outdated
min := c.MinAllowed | ||
if cpu, ok := min[v1.ResourceCPU]; ok { | ||
ms = append(ms, &metric.Metric{ | ||
LabelKeys: []string{"container_name"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be container
, same for all usages below
internal/store/vpa.go
Outdated
ms = append(ms, &metric.Metric{ | ||
LabelKeys: []string{"container_name"}, | ||
LabelValues: []string{c.ContainerName}, | ||
Value: float64(cpu.MilliValue()) / 1000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the resource type can do the conversion internally, I don't think we have to do the math here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I take the float64 conversion off I get:
cannot use val.MilliValue() / 1000 (type int64) as type float64 in field value
internal/store/vpa.go
Outdated
}), | ||
}, | ||
{ | ||
Name: "kube_vpa_container_status_recommendation_lower_bound_memory_bytes", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for resource requests/limits on containers/pods we made this generic, and deprecated the non generic metrics, we should have just the generic one here
internal/store/vpa.go
Outdated
}), | ||
}, | ||
{ | ||
Name: "kube_vpa_container_status_recommendation_target_cpu_cores", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as lower bound, this should be generic for resource types
internal/store/vpa.go
Outdated
}), | ||
}, | ||
{ | ||
Name: "kube_vpa_container_status_recommendation_uncapped_target_cpu_cores", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
internal/store/vpa.go
Outdated
vpa := obj.(*autoscaling.VerticalPodAutoscaler) | ||
|
||
metricFamily := f(vpa) | ||
targetRef := fmt.Sprintf("%s/%s/%s", vpa.Spec.TargetRef.APIVersion, vpa.Spec.TargetRef.Kind, vpa.Spec.TargetRef.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not have this be separate labels?
internal/store/vpa.go
Outdated
} | ||
|
||
func createVPAListWatchFunc(kubeCfg *rest.Config) func(kubeClient clientset.Interface, ns string) cache.ListerWatcher { | ||
vpaClient, err := vpaclientset.NewForConfig(kubeCfg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this creation should be on the same level as we create the kubeclient, so we can actually properly handle the resulting error
internal/store/vpa_test.go
Outdated
|
||
func TestVPAStore(t *testing.T) { | ||
|
||
const metadata = ` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the metadata should be tested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've put in the metadata, but I'm not sure where to reference it in the tests - in most of the other collectors we just define it as a const
but don't actually do anything with it
main.go
Outdated
} | ||
klog.Infof("Running with Kubernetes cluster version: v%s.%s. git version: %s. git tree state: %s. commit: %s. platform: %s", | ||
v.Major, v.Minor, v.GitVersion, v.GitTreeState, v.GitCommit, v.Platform) | ||
klog.Infof("Communication with server successful") | ||
|
||
return kubeClient, nil | ||
return kubeClient, config, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create and return both the kubeclient and vpaclient (if enabled) here I would say
3e8ef4c
to
6d9eb11
Compare
@brancz @tariq1890 I think I've addressed all your review comments 👍 |
Haven't had an in-depth look, but I think |
Hooray, finally got CI passing 🎉 @tariq1890 the As for e2e, there doesn't seem to be any check as to whether the resource is enabled or not (though perhaps that would be a smarter approach than I've got here) - it simply lists out files under the
|
{ | ||
Name: "kube_verticalpodautoscaler_update_mode", | ||
Type: metric.Gauge, | ||
Help: "Update mode of the VPA.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a preference for vertical pod autoscaler
over vpa
in the metrics help text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Also, need to squash commit history. |
5b1a7e5
to
ef90e80
Compare
Squashed, so it's ready for @brancz to look at 🙂 |
Can this be merged @brancz? |
Still reviewing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going in the right direction, some naming scheme and docs things. Overall the approach looks sane and in line with the rest of kube-state-metrics. Thanks a lot for taking care of this!
| kube_verticalpodautoscaler_container_resource_policy_max | Gauge | `container`=<container name> <br> `namespace`=<namespace> <br> `resource`=<cpu | memory> <br> `target_api_version`=<api version> <br> `target_kind`=<target kind> <br> `target_name`=<target name> <br> `unit`=<core | byte> <br> `verticalpodautoscaler`=<vertical pod autoscaler name> | EXPERIMENTAL | | ||
| kube_verticalpodautoscaler_container_resource_policy_min | Gauge | `container`=<container name> <br> `namespace`=<namespace> <br> `resource`=<cpu | memory> <br> `target_api_version`=<api version> <br> `target_kind`=<target kind> <br> `target_name`=<target name> <br> `unit`=<core | byte> <br> `verticalpodautoscaler`=<vertical pod autoscaler name> | EXPERIMENTAL | | ||
| kube_verticalpodautoscaler_container_resource_policy_min | Gauge | `container`=<container name> <br> `namespace`=<namespace> <br> `resource`=<cpu | memory> <br> `target_api_version`=<api version> <br> `target_kind`=<target kind> <br> `target_name`=<target name> <br> `unit`=<core | byte> <br> `verticalpodautoscaler`=<vertical pod autoscaler name> | EXPERIMENTAL | | ||
| kube_verticalpodautoscaler_container_status_recommendation_lower_bound | Gauge | `container`=<container name> <br> `namespace`=<namespace> <br> `resource`=<cpu | memory> <br> `target_api_version`=<api version> <br> `target_kind`=<target kind> <br> `target_name`=<target name> <br> `unit`=<core | byte> <br> `verticalpodautoscaler`=<vertical pod autoscaler name> | EXPERIMENTAL | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all kube_verticalpodautoscaler_container_status_.*
metric docs are duplicate
|
||
| Metric name | Metric type | Labels/tags | Status | | ||
| -------------------------------- | ----------- | ------------------------------------------------------------- | ------ | | ||
| kube_verticalpodautoscaler_container_resource_policy_max | Gauge | `container`=<container name> <br> `namespace`=<namespace> <br> `resource`=<cpu | memory> <br> `target_api_version`=<api version> <br> `target_kind`=<target kind> <br> `target_name`=<target name> <br> `unit`=<core | byte> <br> `verticalpodautoscaler`=<vertical pod autoscaler name> | EXPERIMENTAL | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all kube_verticalpodautoscaler_container_resource_policy.*
metrics are duplicate
| kube_verticalpodautoscaler_container_status_recommendation_upper_bound | Gauge | `container`=<container name> <br> `namespace`=<namespace> <br> `resource`=<cpu | memory> <br> `target_api_version`=<api version> <br> `target_kind`=<target kind> <br> `target_name`=<target name> <br> `unit`=<core | byte> <br> `verticalpodautoscaler`=<vertical pod autoscaler name> | EXPERIMENTAL | | ||
| kube_verticalpodautoscaler_container_status_recommendation_upper_bound | Gauge | `container`=<container name> <br> `namespace`=<namespace> <br> `resource`=<cpu | memory> <br> `target_api_version`=<api version> <br> `target_kind`=<target kind> <br> `target_name`=<target name> <br> `unit`=<core | byte> <br> `verticalpodautoscaler`=<vertical pod autoscaler name> | EXPERIMENTAL | | ||
| kube_verticalpodautoscaler_labels | Gauge | `label_app`=<foo> <br> `namespace`=<namespace> <br> `target_api_version`=<api version> <br> `target_kind`=<target kind> <br> `target_name`=<target name> <br> `verticalpodautoscaler`=<vertical pod autoscaler name> | EXPERIMENTAL | | ||
| kube_verticalpodautoscaler_update_mode | Gauge | `namespace`=<namespace> <br> `target_api_version`=<api version> <br> `target_kind`=<target kind> <br> `target_name`=<target name> <br> `update_mode`=<foo> <br> `verticalpodautoscaler`=<vertical pod autoscaler name> | EXPERIMENTAL | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this metric is documented 4 times
}), | ||
}, | ||
{ | ||
Name: "kube_verticalpodautoscaler_container_resource_policy_min", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should stay as close as possible to the original naming, so this should be kube_verticalpodautoscaler_resourcepolicy_containerpolicies
.
}), | ||
}, | ||
{ | ||
Name: "kube_verticalpodautoscaler_container_resource_policy_max", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming same scheme as kube_verticalpodautoscaler_container_resource_policy_min
}), | ||
}, | ||
{ | ||
Name: "kube_verticalpodautoscaler_container_status_recommendation_lower_bound", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming same scheme as kube_verticalpodautoscaler_container_resource_policy_min
, although unfortunately due to the API definition, this is gonna end up a bit weird, but I'd still prefer to stick to systematic naming, so something like: kube_verticalpodautoscaler_status_recommendation_containerrecommendations_lower_bound
LabelValues: []string{containerName, sanitizeLabelName(string(resourceName)), string(constant.UnitCore)}, | ||
Value: float64(val.MilliValue()) / 1000, | ||
}) | ||
case v1.ResourceStorage: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it really necessary to whitelist each type here? can't we just normalize to base units in any case and put the resource name in the unit
label?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brancz I'm a little conflicted about this - I'm actually tempted to remove the Storage
cases (if I'm honest, I copied and pasted this from the kube_pod_container_resource_requests
).
At the moment the VPA only operates on CPU and memory, though there's a discussion about adding ephemeral storage (kubernetes/autoscaler#1751). Those are the only metrics exposed by the Metrics Server AFAIK, so it's unlikely we'd need to whitelist every type of resource.
I'm also not sure how we'd generically normalize to base units - I'd consider the base unit of CPU to be a core
, and the resource.Quantity
only provides an int
Value for that 😢
Personally I'm in favour of keeping this consistent with how kube_pod_container_resource_requests
works, though I don't feel strongly about this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, I was just thinking if we can find a default case similar to kube_pod_container_resource_requests
. But I'm also happy to just add it when we're actually at a point where this is a reality.
ef90e80
to
67ef5ec
Compare
Thanks. I've updated the metric names to match the api spec path as closely as possible 👍 |
Wonderful. Thanks for bearding with me :) /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: brancz, milesbxf The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
Adds support for metrics on
VerticalPodAutoscaler
objects. It was discussed in #781 whether this should be in the vertical-pod-autoscaler project, or a separate controller, but @brancz pointed out that the VPA is core Kubernetes, even if it isn't installed in every cluster.I've based this off the
v1beta2
of the VPA API -v1
has just been added to master, so can update to that once a new release of VPA has been cut.One tricky thing I ran into is that I needed to use the custom VPA clientset, which requires a
*rest.Config
object to create. I've put forward the cleanest way of exposing this I can think of, but please let me know if an alternative is preferred.Sample metric output
Which issue(s) this PR fixes
Fixes #781
TODO
verticalpodautoscalers
collector not a default