Kubernetes Control-Plane Metrics Stability

Summary

Currently, metrics emitted from the Kubernetes control plane do not offer any stability guarantees. This Kubernetes Enhancement Proposal (KEP) proposes a strategy and framework for programmatically expressing how stable a metric is, i.e. whether a metric's name, type, and labels (i.e. dimensions) are liable to change. Since this document will likely evolve with ongoing discussion around metric stability, it will be updated accordingly.

Motivation

Metrics stability has been an ongoing community concern. Oftentimes, cluster monitoring infrastructure assumes the stability of at least some control-plane metrics, so it would be prudent to offer some sort of guarantees around control-plane metrics, treating them more properly as an API. Since the metrics overhaul is nearing completion, there should be less reason to introduce breaking changes to metrics, making it an opportune time to introduce metric stability rules. Specifically, this KEP intends to address metric stability from an ingestion point of view.

Guarantees around metrics have been proposed previously and there are ongoing community discussions around this issue. Some suggested solutions include:

  1. Having a ‘stable’ metrics endpoint, i.e. ‘/metrics/v1’
  2. Leaving metrics as is and documenting the ones which have a stability guarantee

This KEP suggests another alternative but is very much in line with the spirit of the other proposed solutions.

Goals

  • Describe the various stability guarantees for the consumption of control-plane metrics.
  • Define a uniform mechanism for expressing metric stability.

Non-Goals

  • We are not defining which specific control-plane metrics are actually stable.
  • We are not providing guarantees around specific values in metrics; as such, breakages in alerting based off of assumptions about specific values in metrics are out of scope.
  • Defining the precise mechanism by which we will validate metric rule violations (i.e. static analysis) is also not in scope.

Background

Kubernetes control-plane binaries (i.e. scheduler, kubelet, controller-manager, apiserver) use a Prometheus client to export binary-specific metrics to a '/metrics' endpoint in Prometheus format. Metrics are first defined, then instantiated, and later registered to a metrics registry. The HTTP handler for the metrics endpoint then delegates responses to the underlying registry.

For the remainder of this document, I will refer to the following terms by these definitions:

  • metric definition - defining a metric. In Kubernetes, we use the standard Prometheus pattern of an options struct that defines the name, type, and description of a metric.
  • metric instantiation - creating an instance of a metric. A metric definition is passed into a metric constructor which, in Kubernetes, is a Prometheus metric constructor.
  • metric enrollment - after being defined and created, individual metrics are officially enrolled in a metrics registry (currently a global one).
  • metric registration process - I use this to refer to the entire lifecycle of a metric, from definition, to instantiation, to enrollment.

The fact that the metric registration process always involves these steps is significant because it allows for the possibility of injecting custom behavior in and around these steps.

Proposal

This KEP proposes a programmatic mechanism to express the stability of a given control-plane metric. Individual metrics would be quasi-versioned, i.e. they would carry additional bits of metadata indicating whether the metric is alpha (not stable), stable, or deprecated. Metric stability guarantees would depend on the values of those additional bits. This information would be surfaced to users in two primary ways: (1) metric stability information would be automatically appended to individual metrics' help descriptions, and (2) in the case of deprecated metrics, warning logs would be emitted on metric registration.

Specifically, this would involve injecting custom behavior into the metric registration process by wrapping metric definition, instantiation, and enrollment.

Metric Definition Phase

Currently, the metric definition phase looks like this:

var someMetricDefinition = prometheus.CounterOpts{
    Name: "some_metric",
    Help: "some description",
}

Since we are using the Prometheus provided struct, we are constrained to Prometheus provided fields. However, using a custom struct affords us the following:

var deprecatedMetricDefinition = kubemetrics.CounterOpts{
    Name: "some_deprecated_metric",
    Help: "some description",
    StabilityLevel: kubemetrics.STABLE, // this is a custom metadata field
    DeprecatedVersion: "1.15", // this is also a custom metadata field
}

var alphaMetricDefinition = kubemetrics.CounterOpts{
    Name: "some_alpha_metric",
    Help: "some description",
    StabilityLevel: kubemetrics.ALPHA, // this is a custom metadata field
    DeprecatedVersion: "1.15", // optional on alpha metrics; it signals future intent but does not change the (lack of) stability guarantees
}
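A minimal, self-contained sketch of what such a custom options struct might look like follows. The type and field names mirror the usage above but are illustrative; the real field set would be settled during implementation:

```go
package main

import "fmt"

// StabilityLevel classifies a metric's API contract.
type StabilityLevel string

const (
	ALPHA  StabilityLevel = "ALPHA"
	STABLE StabilityLevel = "STABLE"
)

// CounterOpts mirrors prometheus.CounterOpts but adds the stability
// metadata fields proposed above.
type CounterOpts struct {
	Name              string
	Help              string
	StabilityLevel    StabilityLevel
	DeprecatedVersion string // e.g. "1.15"; empty means not deprecated
}

func main() {
	opts := CounterOpts{
		Name:              "some_deprecated_metric",
		Help:              "some description",
		StabilityLevel:    STABLE,
		DeprecatedVersion: "1.15",
	}
	fmt.Println(opts.Name, opts.StabilityLevel, opts.DeprecatedVersion)
}
```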

Metric Instantiation Phase

Currently, the metric instantiation phase looks like this:

var someCounterVecMetric = prometheus.NewCounterVec(
    someMetricDefinition,
    []string{"some-label", "other-label"},
)

Wrapping the Prometheus constructors would allow us to take, as inputs, the modified metric definitions defined above, returning a custom Kubernetes metric object which contains the metric which would have been instantiated as well as the custom metadata:

var deprecatedMetric = kubemetrics.NewCounterVec( // this is a wrapped constructor, which takes in our custom metric definitions
    deprecatedMetricDefinition, // this is our custom wrapped metric definition from above
    []string{"some-label", "other-label"},
)
var alphaMetric = kubemetrics.NewCounterVec(
    alphaMetricDefinition, // this is also our custom wrapped metric definition from above
    []string{"some-label", "other-label"},
)
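The wrapper constructor can be sketched as a thin function that records the stability metadata alongside the metric it would build. In this self-contained illustration the actual Prometheus instantiation is elided (noted in a comment), and all type names are assumptions mirroring the usage above:

```go
package main

import "fmt"

// CounterOpts is the custom definition struct from the previous section.
type CounterOpts struct {
	Name              string
	Help              string
	StabilityLevel    string // "ALPHA" or "STABLE"
	DeprecatedVersion string
}

// CounterVec wraps what prometheus.NewCounterVec would have produced,
// plus the stability metadata the custom registry needs later.
type CounterVec struct {
	Opts   CounterOpts
	Labels []string
	// realMetric *prometheus.CounterVec would live here in practice.
}

// NewCounterVec mirrors prometheus.NewCounterVec but keeps the custom
// metadata, so the registry can act on it at enrollment time.
func NewCounterVec(opts CounterOpts, labels []string) *CounterVec {
	return &CounterVec{Opts: opts, Labels: labels}
}

func main() {
	m := NewCounterVec(
		CounterOpts{Name: "some_alpha_metric", StabilityLevel: "ALPHA"},
		[]string{"some-label", "other-label"},
	)
	fmt.Println(m.Opts.Name, len(m.Labels))
}
```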

Metric Enrollment Phase

Currently, metric enrollment involves calls to a Prometheus function which enrolls the metric in a global registry, like so:

prometheus.MustRegister(someCounterVecMetric)

Wrapping a Prometheus registry with a Kubernetes-specific one would allow us to take our custom metrics from the instantiation phase and execute custom logic based on our custom metadata. Our custom registry would hold a reference to a Prometheus registry and defer metric enrollment unless preconditions were met:

import "k8s.io/apimachinery/pkg/version"

type Registry struct {
    promregistry *prometheus.Registry
    KubeVersion  version.Info
}

// inject custom registration behavior into our registry wrapper
func (r *Registry) MustRegister(metric kubemetrics.Metric) {
    // pretend we have a version comparison utility library
    if metricutils.Compare(metric.DeprecatedVersion).IsLessThan(r.KubeVersion) {
        // the metric is hidden: unless the binary has explicitly
        // enabled deprecated metrics, no-op the registration
        return
    } else if metricutils.Compare(metric.DeprecatedVersion).IsEqual(r.KubeVersion) {
        // append deprecated text to the description
        // emit a warning in the logs
        // continue to actual registration
    }
    // append alpha text to the metric description if metric.isAlpha
    // fall back to original prometheus behavior
    r.promregistry.MustRegister(metric.realMetric)
}

Which we would invoke, like so:

kubemetrics.MustRegister(deprecatedMetric)
kubemetrics.MustRegister(alphaMetric)

Stability Classes

This proposal introduces two stability classes for metrics: (1) Alpha, (2) Stable. These classes are intended to make explicit the API contract between the control-plane and the consumer of control-plane metrics.

Alpha metrics have no stability guarantees; as such they can be modified or deleted at any time. At this time, all Kubernetes metrics implicitly fall into this category.

Stable metrics are guaranteed not to change, except that a metric may become marked deprecated for a future Kubernetes version. By 'not change', we mean three things:

  1. the metric itself will not be deleted (or renamed)
  2. the type of metric will not be modified
  3. no labels can be added or removed from this metric

From an ingestion point of view, it is backwards-compatible to add or remove possible values of labels which already exist (but not the labels themselves); adding or removing values of an existing label is therefore permissible. Stable metrics can also be marked as deprecated for a future Kubernetes version, since deprecation is a metadata field and does not actually change the metric itself.

Adding or removing labels on stable metrics is not permissible. In order to add or remove a label on an existing stable metric, one would have to introduce a new metric and deprecate the stable one; anything else would violate compatibility agreements.

As an aside, all metrics should be able to be individually disabled by the cluster administrator, regardless of stability class. By default, all non-deprecated metrics will be automatically registered to the metrics endpoint unless explicitly blacklisted via a command line flag (i.e. '--disable-metrics=somebrokenmetric,anothermetric').
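The blacklist flag described above can be sketched as a simple lookup set consulted at registration time. This is an illustrative sketch only; the flag name and parsing helper (`disabledSet`) are hypothetical, not an existing Kubernetes API:

```go
package main

import (
	"fmt"
	"strings"
)

// disabledSet parses the hypothetical --disable-metrics flag value
// (a comma-separated list of metric names) into a lookup set that a
// registry could consult before enrolling each metric.
func disabledSet(flagValue string) map[string]bool {
	set := map[string]bool{}
	for _, name := range strings.Split(flagValue, ",") {
		if name = strings.TrimSpace(name); name != "" {
			set[name] = true
		}
	}
	return set
}

func main() {
	disabled := disabledSet("somebrokenmetric,anothermetric")
	for _, m := range []string{"somebrokenmetric", "healthy_metric"} {
		if disabled[m] {
			fmt.Println(m, "=> skipped")
		} else {
			fmt.Println(m, "=> registered")
		}
	}
}
```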

API Review

Graduating a metric to a stable state is a contractual API agreement; as such, it would be desirable to require an api-review (by sig-instrumentation) for graduating or deprecating a metric, in line with current Kubernetes api-review processes. However, initiating or mandating such API review has historically been problematic for sig-instrumentation since, while a horizontal effort, it is not automatically added as a reviewer for metrics-related changes.

One possible solution is something similar to the existing Kubernetes conformance test gates (thanks @liggitt for pointing this one out). We will have a script which can generate a list of the current stable metrics (via static analysis). This list will be checked in. During the CI/CD flow, a verify script will run to generate a new list of stable metrics. If there is a diff, the verify script fails, since the file should have been updated and checked in. Because the checked-in file will live in a directory owned by sig-instrumentation, sig-instrumentation approval on that PR will be required.
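The verify step described above might look roughly like the following sketch. The function name, file contents, and stand-in files are all illustrative; in CI the checked-in list would come from the repo and the generated list from the static-analysis generator:

```shell
# Fail the build when the generated stable-metrics list differs from
# the checked-in copy (names and contents are illustrative).
verify_stable_metrics() {
  checked_in="$1"
  generated="$2"
  if diff -u "$checked_in" "$generated" > /dev/null; then
    echo "stable metrics list up to date"
  else
    echo "stable metrics list is out of date; regenerate and check it in" >&2
    return 1
  fi
}

# Demonstration with stand-in temp files:
checked_in=$(mktemp)
generated=$(mktemp)
printf 'apiserver_request_total\n' > "$checked_in"
printf 'apiserver_request_total\n' > "$generated"
verify_stable_metrics "$checked_in" "$generated"
```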

Deprecation Lifecycle

This proposal introduces deprecation metadata for metrics, to be used to define a deprecation lifecycle. Metrics can be annotated with a Kubernetes version, from which point that metric will be considered deprecated. This allows us to indicate that a metric is slated for future removal and provides the consumer a reasonable window in which they can make changes to their monitoring infrastructure which depends on this metric.

While deprecation policies only actually change stability guarantees for stable metrics (and not alpha ones), deprecation information may optionally be provided on alpha metrics to help component owners inform users of future intent and ease transition plans (this change was made at the request of @dashpole, who helpfully pointed out that it would be nice to be able to signal future intent even for alpha metrics).

When a stable metric undergoes the deprecation process, we are signaling that the metric will eventually be deleted. The lifecycle looks roughly like this (each stage represents a Kubernetes release):

Stable metric -> Deprecated metric -> Hidden metric -> Deletion

Deprecated metrics have the same stability guarantees as their non-deprecated counterparts: if a stable metric is deprecated, the deprecated stable metric is still guaranteed not to change. When deprecating a stable metric, a future Kubernetes release is specified as the point from which the metric will be considered deprecated.

var someCounter = kubemetrics.CounterOpts{
    Name: "some_counter",
    Help: "this counts things",
    StabilityLevel: kubemetrics.STABLE,
    DeprecatedVersion: "1.15", // this metric is deprecated when the Kubernetes version == 1.15
}

Deprecated metrics will have their description text prefixed with a deprecation notice string '(Deprecated from x.y)' and a warning log will be emitted during metric registration (in the spirit of the official Kubernetes deprecation policy).

Before deprecation:

# HELP some_counter this counts things
# TYPE some_counter counter
some_counter 0

During deprecation:

# HELP some_counter (Deprecated from 1.15) this counts things
# TYPE some_counter counter
some_counter 0

Like their stable metric counterparts, deprecated metrics will be automatically registered to the metrics endpoint.

On a subsequent release (when the metric's deprecatedVersion is equal to current_kubernetes_version - 1), a deprecated metric becomes a hidden metric. Unlike their deprecated counterparts, hidden metrics will no longer be automatically registered to the metrics endpoint (hence hidden). However, they can be explicitly enabled through a command line flag on the binary (i.e. '--show-hidden-metrics-for-version='). This provides cluster admins an escape hatch to properly migrate off of a deprecated metric if they were not able to react to the earlier deprecation warnings. Hidden metrics should be deleted after one release.
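The per-release progression above can be sketched as a small decision function. This is a simplified illustration, not the proposed implementation: versions are reduced to minor numbers (e.g. 15 for "1.15") and the function name is hypothetical:

```go
package main

import "fmt"

// metricState reports how a metric whose DeprecatedVersion has the
// given minor number behaves at the given current Kubernetes minor
// version, following the lifecycle described above.
func metricState(deprecatedMinor, currentMinor int) string {
	switch {
	case deprecatedMinor > currentMinor:
		return "registered" // deprecation point not yet reached
	case deprecatedMinor == currentMinor:
		return "deprecated" // still registered, with a deprecation notice
	case deprecatedMinor == currentMinor-1:
		return "hidden" // off by default; escape hatch flag available
	default:
		return "removed" // should be deleted from the codebase
	}
}

func main() {
	// A metric deprecated at 1.15, observed across successive releases:
	for _, v := range []int{14, 15, 16, 17} {
		fmt.Printf("1.%d: %s\n", v, metricState(15, v))
	}
}
```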

Show Hidden Metrics

As described above, admins can enable hidden metrics through a command-line flag on a specific binary. This intends to be used as an escape hatch for admins if they missed the migration of the metrics deprecated in the last release.

The flag show-hidden-metrics-for-version takes the version for which you want to show metrics deprecated in that release. The version is expressed as x.y, where x is the major version and y is the minor version. The patch version is not needed; even though a metric can be deprecated in a patch release, the metrics deprecation policy runs against minor releases.

The flag can only take the previous minor version as its value. If admins set it to the previous version, all metrics hidden in that release will be emitted. Older versions are not allowed, since that would violate the metrics deprecation policy.

Take metric A as an example; assume that A is deprecated in release 1.n. According to the metrics deprecation policy:

  • In release 1.n, the metric is deprecated and emitted by default.
  • In release 1.n+1, the metric is hidden by default and can be re-enabled via show-hidden-metrics-for-version=1.n.
  • In release 1.n+2, the metric is removed from the codebase. There is no escape hatch anymore.

So, if admins want to re-enable metric A in release 1.n+1, they should pass 1.n to the flag, i.e. show-hidden-metrics-for-version=1.n.
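The "previous minor version only" rule can be sketched as a small validation function. This is an illustrative sketch of the rule, not the actual flag-handling code; the function name is hypothetical and real parsing would be more defensive:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// validateShowHiddenVersion checks that the value passed to
// --show-hidden-metrics-for-version is exactly the previous minor
// release of the running binary. Versions are in "x.y" form.
func validateShowHiddenVersion(flagValue, currentVersion string) error {
	parse := func(v string) (int, int, error) {
		parts := strings.Split(v, ".")
		if len(parts) != 2 {
			return 0, 0, fmt.Errorf("version %q must be in x.y form", v)
		}
		major, err := strconv.Atoi(parts[0])
		if err != nil {
			return 0, 0, err
		}
		minor, err := strconv.Atoi(parts[1])
		if err != nil {
			return 0, 0, err
		}
		return major, minor, nil
	}
	fMajor, fMinor, err := parse(flagValue)
	if err != nil {
		return err
	}
	cMajor, cMinor, err := parse(currentVersion)
	if err != nil {
		return err
	}
	if fMajor != cMajor || fMinor != cMinor-1 {
		return fmt.Errorf("only the previous minor version %d.%d is allowed, got %q",
			cMajor, cMinor-1, flagValue)
	}
	return nil
}

func main() {
	fmt.Println(validateShowHiddenVersion("1.15", "1.16")) // accepted: previous minor
	fmt.Println(validateShowHiddenVersion("1.14", "1.16")) // non-nil error: too old
}
```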

Why Not Bool Flag

Alternatively, another solution which was previously suggested (refer to the discussion on the PR) was to provide a bool-like flag, show-hidden-metrics. It would work like this:

  • show-hidden-metrics=true: enable all hidden metrics deprecated in the previous minor version.
  • show-hidden-metrics=false: the default value; do nothing.

This approach has a side effect (thanks to @lavalamp for pointing it out) in the following scenario:

  1. in version X, turn this on to get back deprecated metric M
  2. in version Y, fail to turn this off, and therefore fail to notice that metric N is being deprecated
  3. in version Z, metric N is removed with (effectively) no warning

Design Details

Test Plan

Internal wrappers can be unit tested. There has been some discussion around providing APIs around metric definitions, hopefully with the end goal of being able to augment our test strategy with current and historical metric definition data.

Graduation Criteria

This feature enhancement will not require graduation criteria. We intend to isolate these changes into a sig-instrumentation repo, build out the framework and the static analysis validation piece (which will require its own KEP), and then have another KEP which details the migration strategy across the Kubernetes codebase. Since this can be built in place without affecting any surrounding code, it is safe.

Drawbacks

More generally, this proposal has the drawbacks which any proposal suggesting a more rigorous enforcement of an API is going to have. There is always a tradeoff between the ease at which a developer can make breaking changes to an API, with consumers' ability to reliably use that API.

Relative to a more hands-off approach, like one where we just document the metrics which the community has agreed to 'certify' as stable, this approach is definitely more heavyweight. This approach involves more code, and more code means more maintenance. However, most of the code will be centralized and the internal logic is easily unit-testable. We also do not have to worry much about changing internal API semantics, since our wrappers will be used internally only, which means it should be easy to modify for new use cases in the future. This sort of approach also enables static analysis tooling around metrics which we could run in precommit.

Also, we should note that this approach can be manufactured in-place; this framework could be rolled out without actually introducing any backwards-incompatible changes (unlike moving stable metrics to a '/metrics/v1' endpoint).

There is also some inflexibility in responding to the situation where code is re-architected in such a way that it's no longer feasible to provide a metric (e.g. there's no longer anything to measure). Generally, we would want to try to avoid this situation by not making a metric stable if there's any way for it to get refactored away. Currently, in this sort of case, the metrics stability proposal would only dictate that we continue to register the metric and undergo the normal metric deprecation policy, as it would be necessary for avoiding ingestion pipeline breakages (thanks @DirectXMan12 for pointing this out).

Alternatives

Using a more traditional versioned endpoint was one of the first suggested ideas. However, metrics basically form a single API group so making a change to a single (previously considered stable) metric would necessitate a version bump for all metrics. In the worst case, version bumps for metrics could occur with each release, which is undesirable from a consumption point of view.

It would also be possible to group metrics into distinct endpoints, in order to avoid global version bumps. However, this breaks the more common metrics ingestion patterns, i.e. as a consumer of metrics for a component, you would no longer be able to assume all of your relevant metrics come from one location. This is also a potentially confusing pattern for consumers of metrics, since you would have to manage a series of metrics endpoints for a given component and also be cognizant of the version of each. It would be easy to get wrong.

Alternatively, one lightweight solution which was previously suggested was documenting the metrics which have stability guarantees. However, this is prone to documentation rot and adds manual (and error-prone) overhead to the metrics process.

Unresolved Questions

Static Analysis for Validation

resolved

Beta Stability Level

discussed during stability-to-beta, and decided it wasn't necessary

Prometheus Labels vs OpenCensus-type Tags

Having this series of wrappers in place allows us to potentially provide a custom wrapper struct around Prometheus labels. This is particularly desirable because labels are shared across metrics and we may want to define uniform behavior for a given label (constraining values for a label, whitelisting values for a label). Prometheus labels are pretty primitive (i.e. lists of strings), but we may want an abstraction which more closely resembles OpenCensus tags.

Dynamically Registered Metrics

Metrics which are added dynamically after application boot, i.e. depending on things that occur at runtime, should probably not be allowed to be considered stable, since we cannot rely on them to exist reliably.

Implementation History

References

Metric Renaming

Metric renaming is tantamount to deleting a metric and introducing a new one. Accordingly, metric renaming will also be disallowed for stable metrics.