From 2ed424321b2fa80bcea998fc2226175ad46aa957 Mon Sep 17 00:00:00 2001
From: Tigran Najaryan
Date: Thu, 7 Jul 2022 15:35:56 -0400
Subject: [PATCH] Introduce "split" metric schema transformation

This is a new transformation type that allows describing a change where a metric is converted to several other metrics by eliminating an attribute.

An example of such a change that happened recently is this: https://github.com/open-telemetry/opentelemetry-specification/pull/2617

This PR contains:

- A new "split" transformation added to the schema file format.
- Schema file format incremented from 1.0.0 to 1.1.0. This is considered a backward compatible change and thus the minor version number is incremented.
- Deleted incorrect sentence that claimed that the addition of a new transformation type is not backward compatible and requires a major version number change. This is incorrect. Addition of new transformation types is backward compatible according to our current definition of backward compatibility, which is: "consumers can consume version X.Y and older versions X.Z provided that they are aware of version X.Y".
- Deleted the requirement that a full OTEP is always necessary to introduce new transformation types. This seems excessive. I think in simple cases like this a PR directly in the spec is sufficient.

This draft PR shows how the corresponding implementation of the schema file parser in Go can be done: https://github.com/open-telemetry/opentelemetry-go/pull/2999

This commit in my personal repo shows in more detail how the "split" transformation can be implemented: https://github.com/tigrannajaryan/telemetry-schema/commit/2566b448fc582f5e620653b3aeaf210c3cb66fd8
---
 CHANGELOG.md                                  |  3 +
 ...format_v1.0.0.md => file_format_v1.1.0.md} | 64 +++++++++++++++----
 specification/schemas/overview.md             |  6 +-
 specification/versioning-and-stability.md     |  2 +-
 4 files changed, 58 insertions(+), 17 deletions(-)
 rename specification/schemas/{file_format_v1.0.0.md => file_format_v1.1.0.md} (89%)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7752515b01b..2d46d2fcb13 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -56,6 +56,9 @@ release.
 
 ### Telemetry Schemas
 
+- Introduce "split" metric schema transformation
+  ([#2653](https://github.com/open-telemetry/opentelemetry-specification/pull/2653)).
+
 ### Common
 
 - Introduce Instrumentation Scope Attributes
diff --git a/specification/schemas/file_format_v1.0.0.md b/specification/schemas/file_format_v1.1.0.md
similarity index 89%
rename from specification/schemas/file_format_v1.0.0.md
rename to specification/schemas/file_format_v1.1.0.md
index c012886b364..7a58b7195e8 100644
--- a/specification/schemas/file_format_v1.0.0.md
+++ b/specification/schemas/file_format_v1.1.0.md
@@ -12,8 +12,8 @@ this schema version.
 Here is the structure of the Schema File:
 
 ```yaml
-# Defines the file format. MUST be set to 1.0.0.
-file_format: 1.0.0
+# Defines the file format. MUST be set to 1.1.0.
+file_format: 1.1.0
 
 # The Schema URL that this file is published at. The version number in the URL
 # MUST match the highest version number in the "versions" section below.
@@ -256,6 +256,27 @@ sections. Here is the structure:
               # specified below.
 ```
 
+#### split Transformation
+
+This transformation splits a metric into several metrics and eliminates an attribute.
+Here is the structure:
+
+```yaml
+    metrics:
+      changes:
+        - split:
+            # Name of old metric to split.
+            apply_to_metric:
+            # Name of attribute in the old metric to use for splitting. The attribute will be
+            # eliminated, the new metric will not have it.
+            by_attribute:
+            # Names of new metrics to create, one for each possible value of attribute.
+            attributes_to_metrics:
+              # map of key/values. The keys are the old attribute value used
+              # in the previous version, the values are the new metric name
+              # starting from this version.
+```
+
 ### logs Section
 
 "logs" section in the schema file defines transformations that are applicable
@@ -324,8 +345,8 @@ file. The format version follows the MAJOR.MINOR.PATCH format, similar to semver
 The "file_format" setting is used by consumers of the file to know if they are
 capable of interpreting the content of the file.
 
-The current value for this setting is "1.0.0". Any change to this number MUST
-follow the OTEP process and be published in the specification.
+The current value for this setting is "1.1.0". Any change to this number MUST
+be published in the specification.
 
 The current schema file format allows representing a limited set of
 transformations of telemetry data. We anticipate that in the future more types
@@ -353,19 +374,22 @@ change according to the following rules:
   in a backward compatible manner. "Backward compatible" in this context means
   that consumers that are aware of the new MINOR number can consume the file of
   a particular MINOR version number or of any MINOR version number lower than
-  that, provided that MAJOR version numbers are the same. Typically, this means
-  that the added setting in file format is optional and the default value of the
-  setting matches the behavior of the previous file format version.
+  that, provided that MAJOR version numbers are the same. This can happen for
+  example when:
+
+  - A new transformation type is added.
+
+  - A new setting is added to an existing transformation. The new setting is optional
+    and the default value of the setting matches the behavior of the previous file
+    format version.
 
   Note: there is no "forward compatibility" based on MINOR version number.
   Consumers which support reading up to a particular MINOR version number SHOULD
   NOT attempt to consume files with higher MINOR version numbers.
 
 - MAJOR number SHOULD be increased if the file format is changed in an
-  incompatible way. For example adding a new transformation type in the
-  "changes" section is an incompatible change because it cannot be ignored by
-  existing schema conversion logic, so such a change will require a new MAJOR
-  number.
+  incompatible way. This means the consumers of the file need to parse or interpret
+  the file differently compared to previous MAJOR version.
 
 Correspondingly:
 
@@ -426,8 +450,8 @@ To illustrate this with some examples:
 ## Appendix A. Example Schema File
 
 ```yaml
-# Defines the file format. MUST be set to 1.0.0.
-file_format: 1.0.0
+# Defines the file format. MUST be set to 1.1.0.
+file_format: 1.1.0
 
 # The Schema URL that this file is published at. The version number in the URL
 # MUST match the highest version number in the "versions" section below.
@@ -527,6 +551,20 @@ versions:
               - system.memory.utilization
               - system.paging.usage
 
+        - split:
+            # Example from the change done by https://github.com/open-telemetry/opentelemetry-specification/pull/2617
+            # Name of old metric to split.
+            apply_to_metric: system.paging.operations
+            # Name of attribute in the old metric to use for splitting. The attribute will be
+            # eliminated, the new metric will not have it.
+            by_attribute: direction
+            # Names of new metrics to create, one for each possible value of attribute.
+            attributes_to_metrics:
+              # If "direction" attribute equals "in" create a new metric called "system.paging.operations.in".
+              in: system.paging.operations.in
+              # If "direction" attribute equals "out" create a new metric called "system.paging.operations.out".
+              out: system.paging.operations.out
+
     logs:
       # Definitions that apply to LogRecord data type.
       changes:
diff --git a/specification/schemas/overview.md b/specification/schemas/overview.md
index 7ec3bebdf08..9896894889a 100644
--- a/specification/schemas/overview.md
+++ b/specification/schemas/overview.md
@@ -74,7 +74,7 @@ work together. Telemetry Schemas are central to how we make this possible.
 
 Here is a summary of how the schemas work:
 
-- OpenTelemetry defines a [file format](file_format_v1.0.0.md) for defining
+- OpenTelemetry defines a [file format](file_format_v1.1.0.md) for defining
   telemetry schemas.
 
 - Telemetry schemas are versioned. Over time the schema may evolve and telemetry
@@ -182,7 +182,7 @@ passes through the Collector is converted to that target schema:
 ## Schema URL
 
 Schema URL is an identifier of a Schema. The URL specifies a location of a
-[Schema File](file_format_v1.0.0.md) that can be retrieved (so it is a URL and
+[Schema File](file_format_v1.1.0.md) that can be retrieved (so it is a URL and
 not just a URI) using HTTP or HTTPS protocol.
 
 Fetching the specified URL may return an HTTP redirect status code. The fetcher
@@ -215,7 +215,7 @@ Version number follows the MAJOR.MINOR.PATCH format, similar to semver 2.0.
 
 Version numbers use the [ordering rules](https://semver.org/#spec-item-11)
 defined by semver 2.0 specification. See how ordering is used in the
-[Order of Transformations](file_format_v1.0.0.md#order-of-transformations). Other than the ordering
+[Order of Transformations](file_format_v1.1.0.md#order-of-transformations). Other than the ordering
 rules the schema version numbers do not carry any other semantic meaning.
 
 OpenTelemetry schema version numbers match OpenTelemetry specification version
diff --git a/specification/versioning-and-stability.md b/specification/versioning-and-stability.md
index 3a039c77f1c..a26c2ec3b5d 100644
--- a/specification/versioning-and-stability.md
+++ b/specification/versioning-and-stability.md
@@ -170,7 +170,7 @@ currently described and are allowed:
 - Renaming of span events.
 
 All such changes MUST be described in the OpenTelemetry
-[Schema File Format](schemas/file_format_v1.0.0.md) and published in this repository.
+[Schema File Format](schemas/file_format_v1.1.0.md) and published in this repository.
 For details see [how OpenTelemetry Schemas are published](schemas/overview.md#opentelemetry-schema).
 
 See the [Telemetry Stability](telemetry-stability.md) document for details on how