Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set host or address as resource for Net::HTTP #278

Closed
wants to merge 2 commits into from

Conversation

buddhistpirate
Copy link

@buddhistpirate buddhistpirate commented Dec 14, 2017

Tracking external requests by host is more useful for understanding performance issues than tracking by HTTP method. By using the host we should have a smaller potential list than if we were tracking by URL/endpoint.

We don't always have a hostname, so lets use logic that exists for falling back to address.

Testing Notes: I didn't add scenarios for testing req.uri.host vs @address

@palazzem palazzem added integrations Involves tracing integrations community Was opened by a community member labels Jan 10, 2018
@delner delner added the feature Involves a product feature label Feb 5, 2018
Copy link

@jeremyolliver jeremyolliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 This change looks good, and this will be a big help to us. Currently resources under the net/http are grouped under GET/POST/PUT etc, which makes tracking performance of API's and other systems we don't directly own really problematic

I've left a comment on a potential improvement, but even without this would be really useful to us.

span.resource = req.method
# Using the method as a resource, as URL/path can trigger
# a possibly infinite number of resources.
span.resource = host_address

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be great for the majority of use-cases. It can be common to have multiple endpoints on localhost (e.g. using haproxy to load balance traffic, or multiple services such as elastic search).

Possibly the resource could be set to span.resource = "#{host_address}:#{host_port}" ?

@buddhistpirate
Copy link
Author

@jeremyolliver Great idea! I just added it.

@buddhistpirate
Copy link
Author

Can someone rekick circleCI? It looks like it had some mongo issues. Should I rebase on a particular branch?

@delner
Copy link
Contributor

delner commented Mar 20, 2018

You probably want to double-check you're fully rebased on master, and that appraisal contrib rake test:mongodb tests are passing locally. Mongo should be < 2.5. I think this may have happened months before when Mongo pushed past version 2.5 and broke our CI build. See #318

@buddhistpirate
Copy link
Author

Did the rebase. Looks like I might have caught CircleCI during the AWS S3 issues yesterday. Tests are green locally. I don't have another commit to add to kick a build. Could someone kick it off?

@delner
Copy link
Contributor

delner commented Mar 21, 2018

@buddhistpirate Triggered a rebuild for you.

@delner
Copy link
Contributor

delner commented Mar 22, 2018

@buddhistpirate Sorry to have to prompt you for this again but, I think you're going to have to rebase again on master. The mysql2 failure in CI is because they released mysql2 0.5.0 which broke rails, which broke our build. Fixed this in #382. Should hopefully pass after that.

Tracking external requests by host is more useful for understanding performance issues than tracking by HTTP method. By using the host we should have a smaller potential list than if we were tracking by URL/endpoint.

We don't always have a hostname, so lets use logic that exists for falling back to address.
@buddhistpirate
Copy link
Author

@delner Rebased and looks like the tests are passing! 🚀

@jeffjo
Copy link
Contributor

jeffjo commented May 11, 2018

@delner We also find this to be more useful than using the HTTP verb as the resource. What are your thoughts on updating this for all the HTTP client library integrations like Faraday and Excon?

@jeremy-lq
Copy link
Member

@delner @palazzem What else do we need to get this approved/merged?

@delner
Copy link
Contributor

delner commented May 22, 2018

I think the change we have in this PR is good.

We'd like to improve consistency of this resource naming across all of our clients (in other contribs like Faraday, as well as other libraries like Go/Python), so we're currently discussing what this should look like; this change here is being taken in consideration with that.

I apologize for the delay on getting this merged, but I'll check back in here when I have more to share.

@NobodysNightmare
Copy link

@delner As far as I can see the last version of the gem added a generic URL quantizier (#384).

Is the the consistent way of resource naming you were talking about?

@delner
Copy link
Contributor

delner commented Jun 13, 2018

@NobodysNightmare Partly yes, partly no. Yes in the sense that its meant to make query strings and parts of the URL consistent. No in the sense that its not used by every integration, nor does it yet adhere to any URL standard for the Datadog tracers. That URL standard is still under discussion last I checked, but when that's decided upon, we'll likely update that URL quantizer module to reflect that.

The URL quantization module was created for exactly this purpose; such that every integration has a common means of applying URL quantization in a consistent manner with each other. So we would encourage its use once it fully reflects the standard we'd like to see.

@jeremyolliver
Copy link

@delner What is this blocked on at present? I'm still really hanging out for this feature, and quantization for http resources appears to have landed (or at least there's an implementation in lib/ddtrace/quantization/http.rb on master).

@delner
Copy link
Contributor

delner commented Nov 26, 2018

@jeremyolliver sorry this one is taking a while. The code itself isn't so much of an issue as it is the content of resource.

Last I checked we were blocked on determining some sort of standard for the resource names of HTTP requests like this. We wanted this to be consistent with each integration, and ideally wanted to avoid making a change here now that wouldn't fit with that standard under discussion (because then we'd have to change it again and we didn't want to "pull the rug" out from under users.)

We recently started a discussion internally to determine what this standard should be, so this is under active consideration. One of the bigger challenges here is how to meaningfully group similar HTTP requests so users can get the benefit of statistics in a way that makes sense for their application. (By path? Host/port? Query string? Status code? There are a lot of variables here.)

I think it's somewhat likely that the resource name will end up being more than just this verb (which I'm not happy with either), but I'm awaiting on that consensus for now.

@samandmoore
Copy link

@delner thanks for asking the tough questions and trying to get this into a good spot for now and in the future.

i'm curious, would you take a PR that allows this functionality to be more easily overridden by users of the library in the short-term or maybe permanently if that doesn't cause issues for the product roadmap? e.g. if we could pull the lines that determine the resource name into a method that can be redefined or a configurable lambda.

@delner
Copy link
Contributor

delner commented Nov 29, 2018

@samandmoore For sure, I think that's how I want to deal with this in the near term; make it configurable so users can present the resource as they see fit. There's a more general feature we've implemented in other tracers, called "span hooks", which might help to this end. The idea would be to add callbacks which bind to a specific context to which users can subscribe and modify spans with. This prototype still needs to be fleshed out, and could be a while till its live in a release, but it is on my to-do list.

In the long term, our solution is to simply have better defaults (as we were discussing above), and hopefully reduce (or ideally eliminate) the need for users to implement custom configuration for such things at all. The better defaults will take more time because those require consistency with our other tracers.

@buddhistpirate
Copy link
Author

I'm also a proponent of making it user configurable in the near term. Should I open a new PR for that, or is that something that your team @delner is going to take on? It seems like with the existing configuration system it might be easy enough to do, or at least make it possible via monkey patch/prepend.

@delner
Copy link
Contributor

delner commented Nov 29, 2018

I think that's something I'll probably take on, given I wanted it to become a facet of the new Integration model we've been developing: presumably such a "span hook" feature would simply be a module that can be composed and re-used by any integration, e.g. HTTP but also Rails, Rack, etc... In doing so, hopefully the interface for implementing or configuring a span hook would be very consistent across all integrations.

I would however welcome any suggestions, sample configuration or use-cases I can consider while developing a prototype. If we can develop a vision for how this feature should behave, or what it should allow for, that will help us build a better, more useful prototype. Sample code or other prototypical implementations would be most desirable.

I'll make a case for promoting this issue on our backlog.

@ziru
Copy link

ziru commented Jan 25, 2019

@delner My team recently starts to evaluate datadog apm. One of the pain points during the piloting phase is the resource name set by the http integration, which makes pretty hard to track external services calls.

The above discussion definitely makes sense to me. Is there any update on this issue?

@delner
Copy link
Contributor

delner commented Jan 25, 2019

I feel the pain too; something I definitely want to improve upon. I think this is under broader discussion internally right now, as we want to provide a consistent experience with HTTP libraries across each of the different languages we support. I don't know when we'll have a feature to implement from that, but in the meantime, I'd be happy to consider stop-gap measures that at least make this configurable on the user-side.

@ziru
Copy link

ziru commented Jan 25, 2019

@delner Thanks for the quick response! Totally understand it needs more time to figure out a solid approach for user customization. As to the short-term stop-gap measure, what do you think about using <http method> <host>:<port> as the default out-of-box behavior?

@delner
Copy link
Contributor

delner commented Jan 25, 2019

Oh sorry, just to clarify, I think we want to wait on changing out-of-the-box behavior, as we work through the problem domain a bit more. In the interim, we could open a PR that adds user configurability, so users can have their applications produce the resource names they want while we work on a better default.

@ziru
Copy link

ziru commented Jan 25, 2019

@delner Got it! Will leave it up to the team to decide upon the final out-of-box behavior.
Hopefully, we won't include url_path, query params, etc as the default behavior, since those details might have high cardinality or even some sensitive details.

@jeremyolliver
Copy link

@delner I think introducing user configurable handling of resource names would be hugely helpful, even in the absence of changing the default. Being able to set something like:

Datadog.configure do |c|
  c.use :rails, service_name: 'myapp', distributed_tracing: true

  # Net::HTTP
  c.use :http, distributed_tracing: true, quantize: {|request|
    # Lambda/proc that returns the string to use as the resource name
    URI(request.uri).domain
  }
end

Would be fantastic.

@ziru
Copy link

ziru commented Mar 1, 2019

Any update on this issue? We setup APM at my company to improve system observability, and keep receiving questions from other eng teams about how to make sense of tracing data for outbound http calls.

@irabinovitch
Copy link

@delner Any thoughts on next steps here?

@jeremyolliver
Copy link

jeremyolliver commented Sep 2, 2019

Edit: moved my comment to #277 (comment) where the feature/issue is being discussed, rather than this community provided implementation

@delner
Copy link
Contributor

delner commented Sep 3, 2019

@jeremyolliver I replied on #277

@buddhistpirate
Copy link
Author

Followed the approach mentioned in #277

Implemented Contrib::HTTP::Instrumentation.after_request and changed the service name to something else. Works for me.

Closing this PR.

@tesssie
Copy link

tesssie commented Oct 19, 2023

Why is this feature discarded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Was opened by a community member feature Involves a product feature integrations Involves tracing integrations
Projects
None yet
Development

Successfully merging this pull request may close these issues.