Component(s)
receiver/k8scluster
Is your feature request related to a problem? Please describe.
The k8scluster receiver currently provides metrics for the resource requests and limits of containers. For most use cases, the effective pod resource requirements end up being equal to the sum of the requests/limits of all main containers in the pod. But k8s components like the scheduler and kubelet use a more involved calculation for the effective resource requirements of a pod. For k8s versions without the sidecar and in-place resize features, the effective request/limit for a resource is calculated as

max( max(init containers), sum(containers) ) + pod_overhead

The full algorithm, which also accounts for the nuances of the additional features in the latest k8s versions, can be found here.

For example, the pod in the screenshot has an initContainer whose cpu request (150m) is greater than the cpu request of the main container (100m), and the `kubectl describe node` output shows that the cpu reserved on the node for that pod is 150m. An admin might want to monitor patterns like this, where a pod ends up reserving resources for initialization that are not used during the life of the pod. Being able to track the effective pod request/limit is also useful when trying to track the capacity of the node as seen by the scheduler.
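To make the calculation concrete, here is a minimal Go sketch of that scheduler-style aggregation using the `k8s.io/api` types. It deliberately ignores the sidecar (restartable init container) and in-place resize features and is only an illustration of the formula above, not the scheduler's actual implementation.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// effectiveRequest computes max(max(init containers), sum(containers)) + pod_overhead
// for a single resource, ignoring sidecar containers and in-place resize.
func effectiveRequest(pod *corev1.Pod, name corev1.ResourceName) resource.Quantity {
	var sum resource.Quantity
	for _, c := range pod.Spec.Containers {
		if q, ok := c.Resources.Requests[name]; ok {
			sum.Add(q)
		}
	}
	var maxInit resource.Quantity
	for _, c := range pod.Spec.InitContainers {
		if q, ok := c.Resources.Requests[name]; ok && q.Cmp(maxInit) > 0 {
			maxInit = q
		}
	}
	if maxInit.Cmp(sum) > 0 {
		sum = maxInit
	}
	if overhead, ok := pod.Spec.Overhead[name]; ok {
		sum.Add(overhead)
	}
	return sum
}

func main() {
	// A pod like the one in the screenshot: the init container requests 150m cpu,
	// the single main container requests 100m cpu, and there is no pod overhead.
	pod := &corev1.Pod{Spec: corev1.PodSpec{
		InitContainers: []corev1.Container{{Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("150m")},
		}}},
		Containers: []corev1.Container{{Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("100m")},
		}}},
	}}
	eff := effectiveRequest(pod, corev1.ResourceCPU)
	fmt.Println(eff.String()) // 150m, matching what kubectl describe node reports
}
```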
Describe the solution you'd like
The easiest and most accurate way to get the effective pod request/limit is to scrape the `kube_pod_resource_request` and `kube_pod_resource_limit` metrics from kube-scheduler, but this might not be an option for users with managed clusters.

The receiver should have the option to collect the requests/limits of initContainers and the pod overhead.

We could additionally discuss the feasibility of computing the effective pod request/limit in the receiver the same way the scheduler does. This might be difficult to implement and maintain, since the receiver won't have access to the enabled k8s feature gates the way the scheduler does, and we would need to keep the receiver's computation in sync with changes to k8s.
Proposed new metrics for pod overhead -
For the requests/limits of init containers, I think it makes sense to differentiate these metrics from those of main containers, since users might want to filter out init containers. We could either use a metric name that reflects the different types of container in a pod, e.g. `k8s.initcontainer.*`, or add an attribute like `k8s.container.type`. Having a separate metric name seems better for users, because such metrics can easily be enabled/disabled through the receiver's config interface.

An additional consideration when naming the metrics is the new sidecar-type initContainer metrics being discussed in this issue.
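To illustrate the two naming options, here is a small sketch using the collector's pdata API. The metric name `k8s.initcontainer.cpu_request` and the attribute `k8s.container.type` are the hypothetical options proposed above, not metrics or attributes the receiver emits today; `k8s.container.cpu_request` is the existing metric name.

```go
package main

import (
	"fmt"

	"go.opentelemetry.io/collector/pdata/pmetric"
)

func main() {
	md := pmetric.NewMetrics()
	sm := md.ResourceMetrics().AppendEmpty().ScopeMetrics().AppendEmpty()

	// Option A: a dedicated metric name for init containers
	// (k8s.initcontainer.* is hypothetical, per this proposal).
	a := sm.Metrics().AppendEmpty()
	a.SetName("k8s.initcontainer.cpu_request")
	a.SetEmptyGauge().DataPoints().AppendEmpty().SetDoubleValue(0.15)

	// Option B: reuse the existing k8s.container.cpu_request metric and add a
	// hypothetical k8s.container.type attribute to distinguish init containers.
	b := sm.Metrics().AppendEmpty()
	b.SetName("k8s.container.cpu_request")
	dp := b.SetEmptyGauge().DataPoints().AppendEmpty()
	dp.SetDoubleValue(0.15)
	dp.Attributes().PutStr("k8s.container.type", "init")

	fmt.Println(md.MetricCount()) // 2
}
```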
Describe alternatives you've considered
No response
Additional context
No response