From dc8faca5e43bf8d8025aadf6dd86798ce2be069f Mon Sep 17 00:00:00 2001
From: Tigran Najaryan
Date: Thu, 13 Oct 2022 12:42:16 -0400
Subject: [PATCH] Rename Events to Categorized Logs

This is an alternate to https://github.com/open-telemetry/opentelemetry-specification/pull/2863

## Problem

"Event" is a confusing term that is understood differently by different people in different contexts. We would like to avoid using it if possible.

## Proposal

The OpenTelemetry Event is defined as a LogRecord that has specific attributes, namely event.name and event.domain.

The event.domain here is one of the important elements. It places the event definitions into isolated buckets (domains). We could use the term "Domenized Logs", but it sounds a bit weird, so I want to suggest renaming the concept of "domain" to "category", without changing the semantics.

The specially shaped Log Records can be called "Categorized Logs" and have attributes "log.category" (previously known as "event.domain") and "log.name" (previously known as "event.name").

We will refrain from using the term "event" as much as possible to avoid confusion.

I am open to other name suggestions for "Categorized Logs". We just want to make sure to avoid the word "events" and avoid inventing completely new terms, so some sort of adjective + "logs" seems to be the best approach.

## What Changes?

- "Event" is renamed to "Categorized LogRecord"
- "event.name" is renamed to "log.name"
- "event.domain" is renamed to "log.category"
- "log.category" is an attribute of LogRecord (instead of previously "event.domain" being a Scope attribute)

## What Did We Lose?

"event.domain" previously could be recorded as a Scope attribute and be used for efficient batch processing/routing of logrecords. This is no longer possible, but we can add another such attribute in the future that serves the same purpose (e.g.
see the [proposal to add "signal.type"](https://github.com/open-telemetry/opentelemetry-specification/pull/2863/files#r993794504)).
---
 specification/logs/README.md                  | 100 +++++++++++++++---
 specification/logs/api.md                     |  47 ++++----
 .../semantic_conventions/categorizedlogs.md   |  27 +++++
 .../logs/semantic_conventions/events.md       |  27 -----
 4 files changed, 137 insertions(+), 64 deletions(-)
 create mode 100644 specification/logs/semantic_conventions/categorizedlogs.md
 delete mode 100644 specification/logs/semantic_conventions/events.md

diff --git a/specification/logs/README.md b/specification/logs/README.md
index 8a24caa0e90..a0436b99cff 100644
--- a/specification/logs/README.md
+++ b/specification/logs/README.md
@@ -17,6 +17,8 @@ aliases: [/docs/reference/specification/logs/overview]
 - [OpenTelemetry Solution](#opentelemetry-solution)
 - [Log Correlation](#log-correlation)
 - [Events and Logs](#events-and-logs)
+  * [Categorized LogsRecords](#categorized-logsrecords)
+  * [FAQ](#faq)
 - [Legacy and Modern Log Sources](#legacy-and-modern-log-sources)
   * [System Logs](#system-logs)
   * [Infrastructure Logs](#infrastructure-logs)
@@ -124,11 +126,6 @@ languages have established standards for using particular logging libraries. For
 example in Java world there are several highly popular and widely used logging
 libraries, such as Log4j or Logback.
 
-OpenTelemetry defines [events](#events-and-logs) as a type of LogRecord with
-specific characteristics. This definition is not ubiquitous across existing
-libraries and languages. In some logging libraries, producing events aligned
-with the OpenTelemetry event definition is clunky or error-prone.
-
 There are also countless existing prebuilt applications or systems that emit
 logs in certain formats. Operators of such applications have no or limited
 control on how the logs are emitted. OpenTelemetry needs to support these logs.
@@ -148,6 +145,12 @@ Given the above state of the logging space we took the following approach:
   OpenTelemetry log data model. OpenTelemetry Collector can read such logs and
   translate them to OpenTelemetry log data model.
 
+- OpenTelemetry defines [Categorized Logs](#events-and-logs) as a type of LogRecord with
+  specific characteristics:
+  - They have a LogRecord attribute `event.name` (and possibly other LogRecord attributes).
+  - They have an InstrumentationScope with a non-empty `Name` and with an
+    InstrumentationScope attribute `event.domain` (and possibly other InstrumentationScope attributes).
+
 - OpenTelemetry defines an API for [emitting LogRecords](./api.md#emit-logrecord).
   Application developers are NOT encouraged to call this API directly. It is
   provided for library authors
@@ -157,7 +160,7 @@ Given the above state of the logging space we took the following approach:
   features than what is defined in OpenTelemetry. It is NOT a goal of
   OpenTelemetry to ship a feature-rich logging library.
 
-- OpenTelemetry defines an API for [emitting Events](./api.md#emit-event).
+- OpenTelemetry defines an API for [emitting Categorized Logs](./api.md#emit-event).
   Application developers are encouraged to call this API directly.
 
 - OpenTelemetry defines an [SDK](./sdk.md) implementation of the [API](./api.md),
@@ -208,15 +211,84 @@ Wikipedia’s [definition of log file](https://en.wikipedia.org/wiki/Log_file):
 >In computing, a log file is a file that records either events that occur in an
 >operating system or other software runs.
 
-From OpenTelemetry's perspective LogRecords and Events are both represented
-using the same [data model](./data-model.md).
+From OpenTelemetry's perspective logs and events conceptually are not different. Both
+are represented using the same [LogRecord data model](./data-model.md).
+
+### Categorized LogsRecords
+
+OpenTelemetry defines **Categorized LogRecords** as LogRecords that are shaped
+in a special way:
+
+- They have a LogRecord attribute `log.name` (and possibly other LogRecord attributes).
+- They have an InstrumentationScope with a non-empty `Name` and with an
+  InstrumentationScope attribute `log.category` (and possibly other InstrumentationScope attributes).
+
+Within a particular `log.category`, the `log.name` uniquely defines a particular class
+or type of Categorized LogRecords. Categorized LogRecords with the same `log.category` /
+`log.name` follow the same schema which assists in analysis in observability platforms.
+See also OpenTelemetry Log [semantic conventions](./semantic_conventions/events.md).
+
+### FAQ
+
+**What is a Categorized LogRecord?**
+
+It is a specially shaped LogRecord. See [Categorized LogRecords](#categorized-logsrecords).
+
+**How are events and logs different?**
+
+They are not. The words "events" and "logs" are synonyms. We prefer the word "logs"
+when referring to generic log and event data.
+
+**Who produces Categorized LogRecords?**
+
+Categorized LogRecords are produced using the OpenTelemetry Categorized Logs API or
+by the OpenTelemetry Collector.
+
+**Why do Categorized LogRecords exist as a concept?**
+
+Categorized LogRecords are a class of logs designed within the OpenTelemetry community
+or in compliance with OpenTelemetry recommendations. Categorized LogRecords have a
+particular shape of data that OpenTelemetry believes is beneficial for designers of
+structured logs and events to adopt.
+
+**What are the reasons Categorized LogRecords have a `log.category` attribute?**
+
+The `log.category` Scope attribute isolates groups (categorizes) of logs or events
+designed by different people. Any decisions about the choice of attribute names and other
+decisions about the shape of the LogRecord made by designers of logs in a particular
+domain have no impact on the design of logs in another domain.
+In other words, the `log.category` attribute allows different groups of people to
+independently make choices about log representation in their domain of expertise
+without worrying that their choices will impact people who design logs
+in some other domain of expertise.
+
+**I have a non-OpenTelemetry data source that produces logs/events (e.g. Windows Events).
+Should I make sure they are shaped like Categorized LogRecords when used with OpenTelemetry
+software (e.g. inside OpenTelemetry Collector)?**
+
+Not necessarily. Only do so if the semantics of the non-OpenTelemetry data source
+match the definition of Categorized LogRecords.
+
+**I have a non-OpenTelemetry data source that produces events that have a `name` and
+`category`. The semantics of the `name` and `category` in this data source are exactly the
+same as `log.name` and `log.category` at OpenTelemetry. What should I do when I bring
+these events to OpenTelemetry?**
+
+If there is an exact match in the semantics then it is reasonable to map them to
+OpenTelemetry's concepts. So, when the events from the external data source are converted
+to OpenTelemetry LogRecords (for example in OpenTelemetry Collector) it is reasonable
+to shape them like Categorized Logs. In the given example it is reasonable to map
+the `name` field from the data source to `log.name` and the `category` field to
+`log.category`.
+
+**I am designing a new library/application/system and want to produce structured logs/events
+using OpenTelemetry. Should my events be shaped like Categorized LogRecords?**
 
-However, OpenTelemetry does recognize a subtle semantic difference between
-LogRecords and Events: Events are LogRecords which have a `name` and `domain`.
-Within a particular `domain`, the `name` uniquely defines a particular class or
-type of event. Events with the same `domain` / `name` follow the same schema
-which assists in analysis in observability platforms. Events are described in
-more detail in the [semantic conventions](./semantic_conventions/events.md).
+Yes. For new designs we recommend shaping your data like Categorized LogRecords.
+Make sure to choose a good descriptive value for `log.category`. If the domain is common
+enough consider adding it as a well-known domain name in OpenTelemetry's [semantic conventions](
+https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/semantic_conventions/events.md)
+for the `log.category` attribute.
 
 ## Legacy and Modern Log Sources
 
diff --git a/specification/logs/api.md b/specification/logs/api.md
index a3cde5989ba..ac8c54010d0 100644
--- a/specification/logs/api.md
+++ b/specification/logs/api.md
@@ -1,4 +1,4 @@
-# Events and Logs API Interface
+# Logs API Interface
 
 **Status**: [Experimental](../document-status.md)
 
@@ -14,7 +14,7 @@
     + [Get a Logger](#get-a-logger)
 - [Logger](#logger)
   * [Logger operations](#logger-operations)
-    + [Emit Event](#emit-event)
+    + [Emit Categorized LogRecord](#emit-categorized-logrecord)
     + [Emit LogRecord](#emit-logrecord)
 - [LogRecord](#logrecord)
 - [Usage](#usage)
@@ -26,19 +26,19 @@
-The Events and Logs API consist of these main classes:
+The Logs API consist of these main classes:
 
 * LoggerProvider is the entry point of the API. It provides access to Loggers.
 * Logger is the class responsible for
-  creating [Events](./semantic_conventions/events.md)
-  and [Logs](./data-model.md#log-and-event-record-definition) as LogRecords.
+  creating [arbitrary LogRecords](#emit-logrecord) or
+  [Categorized Logs](#emit-categorized-logrecord).
 
 LoggerProvider/Logger are analogous to TracerProvider/Tracer.
 
 ```mermaid
 graph TD
     A[LoggerProvider] -->|Get| B(Logger)
-    B --> C(Event)
+    B --> C(Categorized Log)
     B --> D(Log)
 ```
 
@@ -91,10 +91,10 @@ produced by this library.
 the scope has a version (e.g. a library version). Example value: 1.0.0.
 - `schema_url` (optional): Specifies the Schema URL that should be recorded in
 the emitted telemetry.
-- `event_domain` (optional): Specifies the domain for the Events emitted, which
-should be added as `event.domain` attribute of the instrumentation scope.
+- `log_category` (optional): Specifies the category for the logs emitted, which
+should be added as `log.category` attribute of the LogRecords.
 - `include_trace_context` (optional): Specifies whether the Trace Context should
-automatically be passed on to the Events and Logs emitted by the Logger. This
+automatically be passed on to the Logs emitted by the Logger. This
 SHOULD be true by default.
 - `attributes` (optional): Specifies the instrumentation scope attributes to
 associate with emitted telemetry.
@@ -110,7 +110,7 @@ identifying fields are equal. The term *distinct* applied to Loggers describes
 instances where at least one identifying field has a different value.
 
 Implementations MUST NOT require users to repeatedly obtain a Logger again with
-the same name+version+schema_url+event_domain+include_trace_context+attributes
+the same name+version+schema_url+log_category+include_trace_context+attributes
 to pick up configuration changes. This can be achieved either by allowing to
 work with an outdated configuration or by ensuring that new configuration
 applies also to previously returned Loggers.
@@ -119,7 +119,7 @@ Note: This could, for example, be implemented by storing any mutable
 configuration in the `LoggerProvider` and having `Logger` implementation objects
 have a reference to the `LoggerProvider` from which they were obtained.
 If configuration must be stored per-Logger (such as disabling a certain `Logger`),
-the `Logger` could, for example, do a look-up with its name+version+schema_url+event_domain+include_trace_context+attributes
+the `Logger` could, for example, do a look-up with its name+version+schema_url+log_category+include_trace_context+attributes
 in a map in the `LoggerProvider`, or the `LoggerProvider` could maintain a registry
 of all returned `Logger`s and actively update their configuration if it changes.
 
@@ -129,7 +129,7 @@ the emitted data format is capable of representing such association.
 
 ## Logger
 
-The `Logger` is responsible for emitting Events and Logs.
+The `Logger` is responsible for emitting Logs.
 
 Note that `Logger`s should not be responsible for configuration. This should be
 the responsibility of the `LoggerProvider` instead.
@@ -138,22 +138,22 @@ the responsibility of the `LoggerProvider` instead.
 
 The Logger MUST provide functions to:
 
-#### Emit Event
+#### Emit Categorized LogRecord
 
-Emit a `LogRecord` representing an Event to the processing pipeline.
+Emit a `LogRecord` representing a Categorized LogRecord to the processing pipeline.
 
-This function MAY be named `logEvent`.
+This function MAY be named `logCategorized`.
 
 **Parameters:**
 
-* `name` - the Event name. This argument MUST be recorded as a `LogRecord`
-  attribute with the key `event.name`. Care MUST be taken by the implementation
-  to not override or delete this attribute while the Event is emitted to
+* `name` - the log name. This argument MUST be recorded as a `LogRecord`
+  attribute with the key `log.name`. Care MUST be taken by the implementation
+  to not override or delete this attribute while the log is emitted to
   preserve its identity.
-* `logRecord` - the [LogRecord](#logrecord) representing the Event.
+* `logRecord` - the [LogRecord](#logrecord) representing the log.
 
-Events require the `event.domain` attribute. The API MUST not allow creating an
-Event if the Logger instance doesn't have `event.domain` scope attribute.
+Categorized Logs require the `log.category` attribute. The API MUST NOT allow creating a
+Categorized Log if the Logger instance doesn't have the `log.category` attribute.
 
 #### Emit LogRecord
 
@@ -171,8 +171,9 @@ by end users or other instrumentation.
 
 ## LogRecord
 
-The API emits [Events](#emit-event) and [LogRecords](#emit-logrecord) using
-the `LogRecord` [data model](data-model.md).
+The API emits [arbitrary LogRecords](#emit-logrecord) or
+[Categorized LogRecords](#emit-categorized-logrecord) using the `LogRecord`
+[data model](data-model.md).
 
 A function receiving this as an argument MUST be able to set the following
 fields:
diff --git a/specification/logs/semantic_conventions/categorizedlogs.md b/specification/logs/semantic_conventions/categorizedlogs.md
new file mode 100644
index 00000000000..327f7e14f6b
--- /dev/null
+++ b/specification/logs/semantic_conventions/categorizedlogs.md
@@ -0,0 +1,27 @@
+# Semantic Convention for Categorized Logs
+
+**Status**: [Experimental](../../document-status.md)
+
+This document describes the attributes of Categorized Logs that are represented
+by `LogRecord`s. All Categorized Logs have a name and a category. The category
+is a namespace for names and is used as a mechanism to avoid conflicts of
+names.
+
+
+| Attribute | Type | Description | Examples | Requirement Level |
+|-----------------------|---|------------------------------------------------------------------------------------------------------------------------------|---|---|
+| `log.name` | string | The name identifies the log type. | `click`; `exception` | Required |
+| `log.category` | string | The category identifies the context in which the log is defined. A log name is unique only within a category. [1] | `browser` | Required |
+
+**[1]:** A `log.name` is supposed to be unique only in the context of a
+`log.category`, so this allows for two logs in different categories to
+have the same `log.name`, yet be unrelated logs.
+
+`log.category` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.
+
+| Value | Description |
+|---|--------------------------|
+| `browser` | Events from browser apps |
+| `device` | Events from mobile apps |
+| `k8s` | Events from Kubernetes |
+
\ No newline at end of file
diff --git a/specification/logs/semantic_conventions/events.md b/specification/logs/semantic_conventions/events.md
deleted file mode 100644
index 16c132923a4..00000000000
--- a/specification/logs/semantic_conventions/events.md
+++ /dev/null
@@ -1,27 +0,0 @@
-# Semantic Convention for event attributes
-
-**Status**: [Experimental](../../document-status.md)
-
-This document describes the attributes of standalone Events that are represented
-by `LogRecord`s. All standalone Events have a name and a domain. The Event domain
-is a namespace for event names and is used as a mechanism to avoid conflicts of
-event names.
-
-
-| Attribute | Type | Description | Examples | Requirement Level |
-|---|---|---|---|---|
-| `event.name` | string | The name identifies the event. | `click`; `exception` | Required |
-| `event.domain` | string | The domain identifies the context in which an event happened. An event name is unique only within a domain. [1] | `browser` | Required |
-
-**[1]:** An `event.name` is supposed to be unique only in the context of an
-`event.domain`, so this allows for two events in different domains to
-have same `event.name`, yet be unrelated events.
-
-`event.domain` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.
-
-| Value | Description |
-|---|---|
-| `browser` | Events from browser apps |
-| `device` | Events from mobile apps |
-| `k8s` | Events from Kubernetes |
-
\ No newline at end of file
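
For reviewers, the "Emit Categorized LogRecord" rules in the `api.md` hunk above can be sketched roughly as follows. This is only an illustration and not part of the patch: `SketchLogRecord`, `SketchLogger`, and `emit_categorized` are hypothetical names, not an existing OpenTelemetry API. The sketch assumes the semantics stated in the commit message, where `log.category` is recorded as a LogRecord attribute.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchLogRecord:
    """Hypothetical stand-in for the LogRecord data model (body and attributes only)."""
    body: Any = None
    attributes: Dict[str, Any] = field(default_factory=dict)


class SketchLogger:
    """Hypothetical Logger that follows the rules described in the api.md hunk."""

    def __init__(self, name: str, log_category: Optional[str] = None) -> None:
        if not name:
            raise ValueError("Logger name must be non-empty")
        self.name = name
        self.log_category = log_category
        # A plain list stands in for the processing pipeline.
        self.emitted: List[SketchLogRecord] = []

    def emit(self, record: SketchLogRecord) -> None:
        """Emit an arbitrary LogRecord without adding any attributes."""
        self.emitted.append(record)

    def emit_categorized(self, name: str, record: SketchLogRecord) -> None:
        """Emit a Categorized LogRecord; refuse if the Logger has no category."""
        if not self.log_category:
            raise ValueError("Categorized Logs require log.category on the Logger")
        attributes = dict(record.attributes)
        # Set these last so caller-supplied attributes cannot override them.
        attributes["log.name"] = name
        attributes["log.category"] = self.log_category
        self.emit(SketchLogRecord(body=record.body, attributes=attributes))
```

Copying the attributes and writing `log.name` last is one way to honor the requirement that the implementation not override or delete the attribute while the record is emitted. A Logger created without `log_category` can still call `emit`, but `emit_categorized` raises.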