-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optional vs required attributes in HTTP conventions and beyond #2114
Comments
I definitely believe instrumentation should be able to fill all conventions as proposed here - instrumentation needs to provide that baseline which can be then trimmed down to needed bits through configuration or similar. I don't know if there's much point in having classifiers like required and optional at that point - I can imagine backends providing a recommended semantic configuration, and users may customize this further. I don't think either of these would really be interested in whether it's labeled required here, they have much more knowledge of what conventions are useful with that backend, or in that user app, then a spec author. Some reasons I can think of for excluding attributes
These seem quite diverse and there are probably more. If OTel instrumentation is able to fill all conventions in order to provide a baseline, I'm not sure how to go further to navigate these waters and make any recommendations on what is required or not. |
These are very interesting points I hadn't really thought of, and I agree with @anuraaga's comments around applicability of required/optional delineation for backends. However, I can see the benefit for some level of "required" information to enforce that all backends are able to understand those attributes and their meaning. Granted, it doesn't guarantee a backend actually utilizes the attribute in visualizations, or other ways. For me, it comes down to whether it is a goal of OpenTelemetry to define a "required" set of attributes that all vendor backends need to understand/utilize? Or is it sufficient to define the possible set of attributes and their meaning? If defining required attributes is not a goal, differentiating between attributes in required and optional terms likely does not add much value, and it can be left up to vendors and organizations as to what their needs are. |
OpenTelemetry doesn't define any requirements for vendor backends, but it does for instrumentation. By doing that, it should give vendor backends a baseline they can rely on, but it's completely up to vendor backends which attributes they utilize.
But "truly optional" attributes would still be defined in the semantic conventions? |
I guess we're talking about similar things, is this a fair summary?
And we need to pick a default level for instrumentation: 1 or 1+2. And then instrumentation allows to enable 2 or disable it depending on default we choose. 3 is always opt-in if defined. On the backend side, backend is always free to use or drop attribute, I don't see any need to spec backend behavior here. |
Given vendors can pick and choose which attributes they utilize, we should look to have the smallest set of required attributes that aligns with all vendor backends. |
Why can't the backend functionality gracefully degrade when certain attributes are not present? Can someone give an example of an attribute whose absence would be catastrophic/non-recoverable for the backend? |
It can, the problem is that there is no path to a great experience when certain instrumentation chooses not to expose optional attributes. E.g. |
assuming we pick the bare minimum, what do we do with other useful attributes? Do we require instrumentation to add them behind configuration? add depending on the author's choice? not add at all? I'm trying to create some clarity on this particular part and find defaults. |
As you suggested previously, it feels like there are really 3 levels we need. The first is the bare minimum required attributes that most/all vendor backends will support, second is a group of attributes the community believes are beneficial but backends may not recognize, and then all other attributes. I can see 2 and 3 being behind instrumentation flags, but 2 is enabled by default as the community sees benefit in those attributes |
I think currently the definition of a backend only involves OTLP It would be a big addition to add attribute processing too as something a backend must do to support OTel. I'm not sure there is much benefit to it though - for example, even saying it's ok to not be any visualization as long as the attribute is supported. Does that mean it's actually supported? I would expect then recommendations getting even more technical - all required attributes must be indexed and usable as filters in queries, for example. If going that far then I think the classifications do then provide action items for backends. But I guess OTel is not generally meant to spec out backends in such detail. Otherwise I'm not sure what actually gets gained by having classifications vs not. It seems to only make certain attributes opt-in and others opt-out, but that seems to make the experience less intuitive, especially if some backend semantic config will just get copied in anyways. |
What is the path to great experience if the application chooses to disable tracing altogether? Disabling tracing is just an extreme case of not sending all useful attributes, i.e. verbosity level 0. The more data they send, the better/richer experience they get. I think any backend would still have to deal with apps that simply cannot use common instrumentation (maybe because they run on proprietary frameworks), and may not send the data that is 100% according to the spec. And again, the closer they are to the spec, the better service they get. |
If user disables tracing, it's user's choice. using uncommon instrumentation or writing a custom one is also a user choice. |
Now I am confused :) Is this issue about providing configuration mechanism or about marking some attributes as required? I think that
I understand it that you want to mark And this is my general feedback: for me, |
I feel the second point is more important though, and applies to tracing as well. I suspect most people that have used tracing know how important route is to be there when searching. I would call instrumentation that doesn't populate it to be "low quality instrumentation". It is fine for this to exist, instrumentation can only go as far as the framework being instrumented. It's why we generally use higher level frameworks, and why they naturally are easier to instrument. I don't feel the words required and optional capture this well. It's why I actually hope the distinction can just disappear, it doesn't seem to be helping anyone in practice. |
I agree that So I'd change it to the following categories:
since provide always is not really different from provide if available, we can merge those two for simplicity into provide if available |
What would be the expected result for a provide always attribute if it's not there? Is it an error, or it falls back to a provide if available? My one concern with merging provide always and provide if available is making attributes which should be there, less likely to be there as there is no longer an always requirement. |
I think @lmolkova's intention of merging those is to come with something like always provide if available, or must provide if available. |
Thanks @pyohannes, that helped my mental model. I like the addition of always to indicate the required nature of the attribute. |
@kenfinnigan I think users need to be in control of their data (always 😂 ) - I can't imagine having any attribute that is not allowed to be disabled somehow, that seems to take too much control away. I am a fan of merging into provide if available personally. |
I see your point. I've been coming at it from the perspective of a minimum set of information provided to backends, but I appreciate OTeL doesn't control that aspect |
it sounds like we have some preliminary consensus here, I'll send a PR to HTTP spec as a start. Once we have an agreement there, we can generalize it beyond HTTP (and it would require some tooling change around yml). |
I'm making a revision of what OTel instrumentations populate (as a step to understand how to clarify what's required and possible) around sets of client and server attributes. .NET
Java
Go
Python
JS
Several observations:
Proposal:UPDATE: taking into account @trask's awesome suggestion on metrics needs (below) Client: the only set of attributes clients should support is:
Server: the only set of attributes servers should support is:
|
I really like this proposal 👍 👍 Only two comments:
These are important http client metric attributes. Our current thinking in Java instrumentation is that metric attributes will always be a subset of span attributes (on operations where we capture both spans and metrics).
Similarly, this is an important http server metric attribute. |
thank you @trask! Agree, and added your suggestion. Will check existing instrumentation for |
100% on this! Very useful for folks wanting to generate metrics from spans. |
Based on instrumentation SIG meeting discussion
Problem
Optional attributes in semantic conventions leave the decision to populate them on the instrumentation library.
However, this decision should be in the user/backend/distro hands: this is where you know what information you'd like to see and pay for.
So one library that doesn't populate e.g.
http.request_content_length
(let's put aside the need to have it on spans) makes the decision for all backends and users of this library.Proposal
TL;DR: Instrumentation should collect all useful attributes. When some attributes are truly optional (i.e. rarely needed), there should be a configuration option for them. This issue does not try to address configuration.
http.scheme
if it has no use for it. Nothing new here.What does it mean for HTTP (example)
This is a general semantic convention discussion, but if we go through this exercise for HTTP, here's what I suggest.
All of the below is just an example of this approach, we can discuss each of those suggestions separately from this issue
http.url
for clientshttp.scheme
,net.host.name
,net.host.port
,http.target
for servershttp.host
net.host.name
andhttp.server_name
http.method
http.status_code
http.route
- I assume if your web framework supports it, you'd want ithttp.server_name
- it seems it's supported by a few instrumentations and if it is supported, it's important and generally-usefulFollowing attributes should be collected from headers through header collection configuration - we should not have two different attributes for each of them. they are optional in a way that user/distro enables them
http.host
http.user_agent
http.request_content_length
http.response_content_length
http.client_ip
- is there a way to populate it which is notX-Forwarded-For
?it leaves us with probably not-so-commonly-used attributes which I cautiously suggest removing unless there is some important scenario behind them
http.flavor
does anyone really need it?http.request_content_length_uncompressed
- it's not always HTTP client responsibility to decompress or to await for decompression to finish - the instrumentation is probably feasible in a subset of cases - should we keep it?http.response_content_length_uncompressed
As a result, we'd have common attributes every instrumentation should populate, headers behind configuration and group 4 (which I hope we can remove)
The text was updated successfully, but these errors were encountered: