Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[datadogexporter] Send host metadata #1351

Merged
merged 21 commits into from
Oct 28, 2020

Conversation

mx-psi
Copy link
Member

@mx-psi mx-psi commented Oct 22, 2020

Description:

  • Get hostname and instance ID from EC2 instance metadata if available.

  • When either the metrics or traces exporter in the Datadog exporter are started, a single goroutine is created that sends host metadata (currently comprised of: EC2 instance metadata, tags and OS metadata) to our backend every 30 minutes. This goroutine is stopped at shutdown. This is controlled by a (opt-out) flag.

  • Remove logic adding tags to metrics and traces in favour of the metadata goroutine.

Link to tracking Issue: n/a
Testing: Added some unit tests, amended existing ones, tested on an EC2 instance against Datadog's backend.
Documentation: Documented public functions

@mx-psi mx-psi marked this pull request as ready for review October 22, 2020 16:08
@mx-psi mx-psi requested a review from a team October 22, 2020 16:08
The backend will add these tags since they are sent with metadata
Copy link
Contributor

@ericmustin ericmustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 👍 , we may need to update the example and test yaml though?

@ericmustin
Copy link
Contributor

(disregard the bit about needing to update the yaml, lgtm)

Reusing the `GetTags` function for this since this PR leaves it without use
Copy link
Contributor

@KSerrania KSerrania left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left a few nits

exporter/datadogexporter/go.mod Outdated Show resolved Hide resolved
exporter/datadogexporter/metadata/ec2/ec2.go Outdated Show resolved Hide resolved
exporter/datadogexporter/metadata/metadata.go Outdated Show resolved Hide resolved
exporter/datadogexporter/metadata/metadata.go Outdated Show resolved Hide resolved
exporter/datadogexporter/metadata/metadata.go Outdated Show resolved Hide resolved
Fix documentation comments

Co-authored-by: Kylian Serrania <kylian.serrania@datadoghq.com>
@mx-psi
Copy link
Member Author

mx-psi commented Oct 23, 2020

This is now reviewable by maintainers (the previous two reviewers are from Datadog).

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not a duplicate of the https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/master/processor/resourcedetectionprocessor? why do we have duplicate functionality? Can you use that?

@mx-psi
Copy link
Member Author

mx-psi commented Oct 26, 2020

Is this not a duplicate of the https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/master/processor/resourcedetectionprocessor? why do we have duplicate functionality? Can you use that?

@bogdandrutu No: there is some overlap on the information this PR and the resource detector gather, but at least some goals are different. This PR aims to

  1. send metadata to Datadog that includes: a) all hostnames of the host (including, if available, EC2 hostname and instance id), b) some user-set metadata about the host and c) some info about the version and kind of component and
  2. set the hostname in metrics and traces to an EC2 one if available, if they don't have a hostname already.

We use metadata from 1.a to make sure that host names that refer to the same host are identified as such, since this impacts usability and billing. This is not covered by the resource detection processor: it sends data to Datadog and it is not attached to metrics/traces. 1.b and 1.c are not related to EC2 in any way.

Aim (2) could be covered by the resource detection processor (and I want to open a PR to make use of the metadata set by that component in the future), but, since we had the EC2 information available from (1), I thought it made sense to add it here too: for usability we must send telemetry data with some hostname so this would be a last resort mechanism to get a coherent hostname.

@bogdandrutu
Copy link
Member

@mx-psi please at least consider to create an issue, to understand how can we remove duplicate code, and how can we share the functionality at least between the processor and this exporter.

Once the issue is created and assign to you :) I will merge this :)

@ericmustin
Copy link
Contributor

@bogdandrutu one thing to note here is that, even if a processor could do all of these things, the issue is that exporters can not create/control processors (or other components) on their own. It requires user configuration. And in general asking users to setup multiple pieces of a pipeline increases the room for error/mistakes.

Something to keep in mind as I think i've seen a few use cases now where it would be nice to have an option that would allow exporters to have more control over the whole pipeline.

@bogdandrutu
Copy link
Member

@ericmustin good observation, that is one of the reason some companies decided to ship their own distribution of the collector, which is in a sense just the collector with a builtin configuration. We can discuss about that as well I think.

@jrcamp
Copy link
Contributor

jrcamp commented Oct 26, 2020

@bogdandrutu one thing to note here is that, even if a processor could do all of these things, the issue is that exporters can not create/control processors (or other components) on their own. It requires user configuration. And in general asking users to setup multiple pieces of a pipeline increases the room for error/mistakes.

Something to keep in mind as I think i've seen a few use cases now where it would be nice to have an option that would allow exporters to have more control over the whole pipeline.

Is this about detecting metadata for ec2, gcp, etc.? There is an OTEP that has been accepted for this (https://github.com/open-telemetry/oteps/blob/master/text/0111-auto-resource-detection.md). I think the long term plan would be to implement this which the resourcedetectionprocessor could use as well as other components that needed it. Perhaps there could be an extension that you configure once that exporters and processors could centrally source the metadata from.

@james-bebbington any progress on the autodetection API?

@james-bebbington
Copy link
Member

@james-bebbington any progress on the autodetection API?

The spec for that was merged here: open-telemetry/opentelemetry-specification#811

But in the end I don't think there's too much value using that in the collector. We may as well continue with the resource detection processor as is imo.

@mx-psi
Copy link
Member Author

mx-psi commented Oct 27, 2020

@mx-psi please at least consider to create an issue, to understand how can we remove duplicate code, and how can we share the functionality at least between the processor and this exporter

@bogdandrutu I created issue #1379 to track this overlapping functionality; let me know if I should expand/modify this in any way. I can't assign it to me but feel free to do so when you read this.

@mx-psi
Copy link
Member Author

mx-psi commented Oct 27, 2020

CI keeps failing because of a flaky load test and a flaky unit test (the latter I believe is being fixed on #1384). I am going to try a couple times more to see if I can get them to pass, sorry for the noise. If I can't, rerunning the individual tests from the CircleCI UI would be an option (but I can't do that since I don't have write access to this repo)

@codecov
Copy link

codecov bot commented Oct 27, 2020

Codecov Report

Merging #1351 into master will decrease coverage by 0.45%.
The diff coverage is 31.45%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1351      +/-   ##
==========================================
- Coverage   89.86%   89.41%   -0.46%     
==========================================
  Files         326      328       +2     
  Lines       16061    16154      +93     
==========================================
+ Hits        14434    14444      +10     
- Misses       1182     1263      +81     
- Partials      445      447       +2     
Flag Coverage Δ
#integration 68.89% <ø> (+<0.01%) ⬆️
#unit 88.66% <31.45%> (-0.46%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
exporter/datadogexporter/metadata/metadata.go 0.00% <0.00%> (ø)
exporter/datadogexporter/metrics/utils.go 100.00% <ø> (ø)
exporter/datadogexporter/metadata/ec2/ec2.go 25.00% <25.00%> (ø)
exporter/datadogexporter/factory.go 69.01% <39.13%> (-15.61%) ⬇️
exporter/datadogexporter/config/config.go 65.00% <75.00%> (-11.93%) ⬇️
exporter/datadogexporter/metadata/host.go 64.70% <100.00%> (+7.56%) ⬆️
exporter/datadogexporter/metrics_exporter.go 60.00% <100.00%> (ø)
exporter/datadogexporter/trace_connection.go 59.64% <100.00%> (+0.72%) ⬆️
exporter/datadogexporter/traces_exporter.go 78.12% <100.00%> (-1.29%) ⬇️
exporter/datadogexporter/translate_traces.go 82.81% <100.00%> (+0.22%) ⬆️
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7268b25...948b493. Read the comment docs.

@mx-psi
Copy link
Member Author

mx-psi commented Oct 27, 2020

@bogdandrutu tests have passed now, please let me know if there is anything else that should be addressed

(edit: I improved test coverage a bit given the Codecov report)

@bogdandrutu bogdandrutu merged commit 9cec49b into open-telemetry:master Oct 28, 2020
@mx-psi mx-psi deleted the ec2-metadata branch November 9, 2020 09:35
dyladan referenced this pull request in dynatrace-oss-contrib/opentelemetry-collector-contrib Jan 29, 2021
Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
codeboten pushed a commit that referenced this pull request Nov 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants