Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dns): Add option to include name as attribute in dns lookup #1178

Closed
wants to merge 9 commits into from

Conversation

evantorrie
Copy link
Contributor

@evantorrie evantorrie commented Sep 15, 2022

Which problem is this PR solving?

fixes #1165

Short description of the changes

Adds an option in the plugin configuration to include "hostname" parameter used in dns.lookup() calls. Defaults to false.

Checklist

  • test-all-versions not applicable to instrumentation/dns.

@evantorrie evantorrie changed the title feat(instrumentation.dns): Add option to include name as attribute in dns lookup feat(dns): Add option to include name as attribute in dns lookup Sep 15, 2022
@codecov
Copy link

codecov bot commented Sep 15, 2022

Codecov Report

Merging #1178 (bfa6cae) into main (180b336) will increase coverage by 0.21%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1178      +/-   ##
==========================================
+ Coverage   96.08%   96.29%   +0.21%     
==========================================
  Files          14       18       +4     
  Lines         893     1079     +186     
  Branches      191      218      +27     
==========================================
+ Hits          858     1039     +181     
- Misses         35       40       +5     
Impacted Files Coverage Δ
...ry-instrumentation-dns/src/enums/AttributeNames.ts 100.00% <100.00%> (ø)
...lemetry-instrumentation-dns/src/instrumentation.ts 96.10% <100.00%> (ø)
...ode/opentelemetry-instrumentation-dns/src/utils.ts 97.95% <100.00%> (ø)
...tapackages/auto-instrumentations-node/src/utils.ts 98.18% <0.00%> (ø)

@evantorrie evantorrie marked this pull request as ready for review September 16, 2022 03:14
@evantorrie evantorrie requested a review from a team September 16, 2022 03:14
Copy link
Member

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dyladan - I saw in #488 that you were against capturing this attribute as default.
@evantorrie wrote some arguments in #1165

I think it can be useful to always capture this attribute

assert.strictEqual(spans.length, 1);
assertSpan(span, {
addresses: [],
hostname: undefined,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check will not verify it's omitted, as the code in assertSpan will skip the check:

  if (validations.hostname !== undefined) {
    assert.strictEqual(
      span.attributes[AttributeNames.DNS_HOSTNAME],
      validations.hostname
    );
  }

Maybe when the value sent to assertSpan is undefined we should check that it's not in attributes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@vmarchaud
Copy link
Member

I think it can be useful to always capture this attribute

I agree that we could add it by default, on the same basis as we add it to the http spans however having a option to disable it might be useful too

@Flarna
Copy link
Member

Flarna commented Sep 24, 2022

I agree also that we should add it always.
Not sure if an option to disable it is really needed as I see no privacy issue here nor a data size issue. Also there is no extraction/detection overhead.
It might be duplicated info from parent span but this depends on the parent instrumenation (and sampling decissions).

@rauno56
Copy link
Member

rauno56 commented Sep 30, 2022

I understand that #488 removed it first and foremost because of the invalid attribute and since the spec didn't have a replacement we decided to remove it altogether at the time. That makes sense to me.

This PR adds this back for reasons listed above and in #1165. That also makes sense to me. Like others, I would rather see it added without the configuration option for now.

I'd first like to hear form @dyladan and/or @svrnm to avoid just swinging back and forth between the two implementations.

@svrnm
Copy link
Member

svrnm commented Sep 30, 2022

The argument against having it was that the looked up dns host name is like the request body of an http request and might be considered sensitive data (probably an extreme corner-case but worth noting).

Giving people the choice to have it on or off is probably the middle-ground we are looking for. If it is an opt-in or opt-out is hard to decide.

Here's how I think about it:

@Flarna
Copy link
Member

Flarna commented Sep 30, 2022

if hostnames are seen as sensitive data we should also disable capture of url in e.g. HTTP. But at least HTTP.URL is mandatory in semantic conventions.

@svrnm
Copy link
Member

svrnm commented Oct 4, 2022

if hostnames are seen as sensitive data we should also disable capture of url in e.g. HTTP. But at least HTTP.URL is mandatory in semantic conventions.

fair point

@rauno56
Copy link
Member

rauno56 commented Oct 7, 2022

might be considered sensitive data (probably an extreme corner-case but worth noting)

Can anyone share an example where it would be considered sensitive?

@dyladan
Copy link
Member

dyladan commented Oct 10, 2022

I'd first like to hear form @dyladan and/or @svrnm to avoid just swinging back and forth between the two implementations.

I don't have a strong opinion except that I am in general always hesitant to add attributes for which there is no semantic convention. The whole reason we had to do #488 to begin with was that the value was stored in the wrong attribute (net.peer.name which was not correct) and there was no semconv to give us a clear place to put it (and there still is not). My argument is not against this specific value, but rather all values which do not have specification and may be specified differently in the future resulting in yet more thrash.

@dyladan
Copy link
Member

dyladan commented Oct 10, 2022

The argument against having it was that the looked up dns host name is like the request body of an http request and might be considered sensitive data (probably an extreme corner-case but worth noting).

I didn't think the hostname was going to contain sensitive data but other than that yes this was my original argument. To be honest, it is not really a strongly held belief though.

Giving people the choice to have it on or off is probably the middle-ground we are looking for. If it is an opt-in or opt-out is hard to decide.

Here's how I think about it:

I generally agree with this except that I would argue for settling this in specification first.


Aside: I think this is a symptom of at least 2 much larger problems:

  1. There is no place to put attributes which we know may not be permanent/stable locations, and no place to put custom attributes where we know semconv will never collide with them. I am thinking of something like the X- prefix on experimental HTTP headers, but for attributes (yes, I know the X- convention is now discouraged for good reasons). This would allow custom behavior and experimentation without fear of semconv forcing us to thrash in the future.
  2. There is no standard method of instrumentation configuration. Each instrumentation has its own config structure, flags, and settings, with little to no cohesiveness. By extension, this means there is no conventional way to accept experimental features behind a config flag.

Neither of these problems will be solved today and likely this will require a larger effort than just the OTel JS ecosystem, but as a maintainer I consider these two problems to be two of the biggest we have and likely to hold us back until they are resolved.

@svrnm
Copy link
Member

svrnm commented Oct 11, 2022

I generally agree with this except that I would argue for settling this in specification first.

ACK.

There is no place to put attributes which we know may not be permanent/stable locations, and no place to put custom attributes where we know semconv will never collide with them. I am thinking of something like the X- prefix on experimental HTTP headers, but for attributes (yes, I know the X- convention is now discouraged for good reasons). This would allow custom behavior and experimentation without fear of semconv forcing us to thrash in the future.

I think the main concern of introducing that is, that people implement something put in their x-my-attribute and call it a day. The semantic conventions are for me one of the unique selling points of OpenTelemetry

@evantorrie would you be open to raise an issue/PR with the spec to add DNS attribute(s)

There is no standard method of instrumentation configuration. Each instrumentation has its own config structure, flags, and settings, with little to no cohesiveness. By extension, this means there is no conventional way to accept experimental features behind a config flag.

Yet another reason why open-telemetry/opentelemetry-specification#1773 is so important :-)

@dyladan
Copy link
Member

dyladan commented Oct 11, 2022

There is no place to put attributes which we know may not be permanent/stable locations, and no place to put custom attributes where we know semconv will never collide with them. I am thinking of something like the X- prefix on experimental HTTP headers, but for attributes (yes, I know the X- convention is now discouraged for good reasons). This would allow custom behavior and experimentation without fear of semconv forcing us to thrash in the future.

I think the main concern of introducing that is, that people implement something put in their x-my-attribute and call it a day. The semantic conventions are for me one of the unique selling points of OpenTelemetry

I definitely agree about SemConv, but a place to test potential future behavior is IMO necessary. It is possible that people would just put a custom extension attribute and "call it a day," but it is possible to do that now. At least it would be clear to receivers of the data that it is possibly unstable or unspecified. Right now, because there is no guidance around where to put unspecified data, we end up in exactly the situation this PR is in where a user needs an attribute, that attribute is unspecified, and I can't give them a solution other than "please update the spec."

@blumamir
Copy link
Member

I agree with all the points above.
It looks like we have a consensus that the attribute is safe to record, and should be on the DNS span. There are already other custom attributes used by this instrumentation for which we don't supply a configuration option to turn off recording:

export enum AttributeNames {
  // NOT ON OFFICIAL SPEC
  DNS_ERROR_CODE = 'dns.error_code',
  DNS_ERROR_NAME = 'dns.error_name',
  DNS_ERROR_MESSAGE = 'dns.error_message',
}

I support doing the specification work to add the missing dns attributes, as well as defining experimental attribute namespace/pattern and instrumentation configurations. Since it might take a long time, I see no harm in adding the dns.hostname to the already existing custom attributes and revisiting this entire attributes enum in the future. Isn't this the main reason that instrumentations are marked as experimental and not stable?

The practice of adding useful custom attributes with non-speced names is common in many instrumentations, and once we/otel agree on a strategy we should probably fix it in many places. I don't think we should block the PR

@ethanresnick
Copy link
Contributor

Is this PR now ready to merge? I totally agree with @blumamir's last comment, that this attribute is safe/useful, and that getting a standard attribute name is important but not a blocker.

@github-actions
Copy link
Contributor

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Feb 13, 2023
@blumamir
Copy link
Member

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@evantorrie this PR became stale, but I think it is important and will give value to users. Are you still interested in working on it?

@github-actions github-actions bot removed the stale label Feb 20, 2023
@github-actions
Copy link
Contributor

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Apr 24, 2023
@github-actions
Copy link
Contributor

This PR was closed because it has been stale for 14 days with no activity.

@github-actions github-actions bot closed this May 15, 2023
@clintonb
Copy link

Any interest in picking this up? I may be able to help if folks have reached a consensus on implementation. I was quite surprised to discover the DNS spans don't include the hostname given the purpose of a DNS lookup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider adding option to dns.instrumentation to include hostName as span attribute on lookup
9 participants