
# Add metrics semantic conventions for timed operations #657 (Closed)
33 changes: 33 additions & 0 deletions specification/metrics/semantic_conventions/metric-general.md
@@ -0,0 +1,33 @@
# General

The conventions described in this section are generic; they are not specific to a
particular operation.

## Summarizing timed operations

Timed operations that may be more fully described by Spans can be summarized by a single [`ValueRecorder`](../api.md#valuerecorder) recording duration. The instrument name should be prefixed with the Span category and kind, yielding a name of the form `{category}.{span.kind}.duration`. See the [examples](#examples) below.

For each operation being recorded, the duration of the operation **in milliseconds** should be recorded onto the `duration` instrument.
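The recording described above can be sketched as follows. This is an illustrative stand-in, not the real OpenTelemetry API (which differs by language SDK): the `ValueRecorder` class here is a toy that just collects values.

```python
import time

# Illustrative stand-in for a ValueRecorder; the real OpenTelemetry API
# differs by language SDK, so treat this class as a sketch only.
class ValueRecorder:
    def __init__(self, name):
        self.name = name
        self.records = []

    def record(self, value, labels=None):
        self.records.append((value, labels or {}))

# The name follows the {category}.{span.kind}.duration convention.
duration = ValueRecorder("http.server.duration")

start = time.monotonic()
# ... the timed operation would run here ...
elapsed_ms = (time.monotonic() - start) * 1000.0  # duration in milliseconds
duration.record(elapsed_ms)
```

The key point is the unit: the value recorded is the elapsed wall-clock time converted to milliseconds.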
**Member:**

Are you suggesting that these metrics should be reported in addition to spans and thus duplicate data (since spans already report the operation's start and end timestamp and thus also wall clock duration) or instead of spans?

If in addition, is this meant to provide inexpensive 100% coverage for the duration only when spans would otherwise be sampled?

Either way, please state the intent more clearly.

**Contributor:**

I don't think it is actually duplicate data, since spans are, in general, sampled, as you point out.

**Member Author:**

This is an excellent point. As @jkwatson points out, this is not intended to be duplicate data because spans are sampled, and the metrics generated from this instrument would not be.

I had intended (but clearly not actually written) that every recorded span would contribute to the duration metric, regardless of whether it is sampled.
I had tried to avoid entangling this spec too much with Spans, but I'm thinking now that it would be clearer if I made this more explicit.

**Contributor:**

This sounds like a configuration question. If the user wants metrics instead of spans, they may turn off span reporting and leave metrics reporting enabled. The spans will still be seen by the SDK for the purpose of generating these metrics, but spans wouldn't be exported.

**Member Author:**

Yeah, we are drifting into configuration a bit here.
I started writing this up this morning, but I'm struggling with the wording. Here's my thought process:

> If `span.isRecorded` is true and the user has configured that metrics should be generated for the span's category and kind, then the duration must be recorded onto a `ValueRecorder` with the name `{span.category}.{span.kind}.duration` (regardless of the value of `span.sampled`).

Also, I'm feeling like we need a justification section here, to explain why we'd want metrics that in some cases carry redundant data.
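The rule quoted above could be sketched like this. Everything here is hypothetical: the span fields (`is_recorded`, `sampled`, `category`, `kind`) and the configuration shape are assumptions used only to make the logic concrete, not real SDK API.

```python
# Hypothetical sketch of the proposed rule: record the duration whenever
# the span is recorded AND metrics are enabled for its category/kind,
# regardless of the sampling decision.
def maybe_record_duration(span, config, recorders):
    if not span["is_recorded"]:
        return None
    key = (span["category"], span["kind"])
    if not config.get(key, False):
        return None  # metrics not enabled for this category/kind
    name = "{0}.{1}.duration".format(span["category"], span["kind"])
    # Note: span["sampled"] is deliberately not consulted here.
    recorders.setdefault(name, []).append(span["duration_ms"])
    return name

config = {("http", "server"): True}
recorders = {}
span = {"category": "http", "kind": "server", "is_recorded": True,
        "sampled": False, "duration_ms": 12.5}
maybe_record_duration(span, config, recorders)  # records despite sampled=False
```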

**Contributor:**

There's a related question, which maybe falls into the scope of this document. In #381 there is the idea that the Tracing SDK should have a configurable way to automatically synthesize metric events corresponding to the span's duration. In that case, I would suggest that the metric labels should be exactly the same as the span attributes, the resource value should be the same (i.e., the corresponding Trace SDK's resource). The instrument kind should be ValueRecorder.

Now, for the controversial part: I think the metric name should equal the span name, which is to say that I think the naming conventions described for metric instruments (open-telemetry/oteps#108) should be adopted for spans as well. This convention addresses the concern about high-cardinality span names that is often discussed in relation to tracing.

(There, I've said the controversial thing I've been meaning to say!)

**Contributor (@jmacd, Jun 24, 2020):**

Also, questions about "Why generate duplicate events" are missing the point, I think. We can synthesize these metric events in the collector on seeing a span (because we have conventions), and by the time the data is exported into a traditional metric system it is no longer duplicate. We need these conventions because metrics data formats are different from span data formats, it shouldn't imply any form of duplication.

**Contributor:**

@justinfoote I think we can close this after adding a sentence somewhere to state "These conventions are meant both to enable automatic metric events from spans as well as for application-level timing that may be considered too brief to report Spans."


#### Errors

The `duration` metric should include an `error` label indicating whether the operation failed. The value of this label should be `true` if the operation ended in a failed state, and `false` otherwise.
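One way to derive the `error` label is from whether the operation raised. The sketch below is a minimal illustration, assuming a plain list as a stand-in for the duration instrument:

```python
import time

records = []  # stand-in for the duration instrument

def timed_call(operation):
    """Time an operation and attach the error label based on its outcome."""
    start = time.monotonic()
    failed = False
    try:
        operation()
    except Exception:
        failed = True
    elapsed_ms = (time.monotonic() - start) * 1000.0
    # error=true only if the operation ended in a failed state
    records.append((elapsed_ms, {"error": "true" if failed else "false"}))

def failing():
    raise ValueError("boom")

timed_call(lambda: None)  # succeeds -> error=false
timed_call(failing)       # raises   -> error=true
```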

### Examples

A simple HTTP service would create the metric:

* `http.server.duration`

If this service also calls out to other HTTP services, it would create the additional metric:

* `http.client.duration`

If this service also queries one or more databases, it would create the additional metric:

* `db.client.duration`

All of these metrics would include the `error` label, with the values:

* `error=true`
* `error=false`
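Putting the examples together, a backend can break each instrument out by the `error` label. The following is a toy aggregation with made-up durations, not how a real metrics SDK stores data:

```python
from collections import defaultdict

# (metric name, error label) -> recorded durations; a toy aggregation only.
measurements = defaultdict(list)

def record(name, duration_ms, error):
    measurements[(name, "true" if error else "false")].append(duration_ms)

# One sample measurement for each of the three example instruments.
record("http.server.duration", 12.0, error=False)
record("http.client.duration", 40.0, error=False)
record("db.client.duration", 5.0, error=True)

# Backends can then break each metric out by the error label:
failed_metrics = {name for (name, err) in measurements if err == "true"}
```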