-
Notifications
You must be signed in to change notification settings - Fork 377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set host or address as resource for Net::HTTP #278
Conversation
c3d3492
to
cfbeb7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 This change looks good, and this will be a big help to us. Currently resources under the net/http
are grouped under GET/POST/PUT etc, which makes tracking performance of API's and other systems we don't directly own really problematic
I've left a comment on a potential improvement, but even without this would be really useful to us.
lib/ddtrace/contrib/http/patcher.rb
Outdated
span.resource = req.method | ||
# Using the method as a resource, as URL/path can trigger | ||
# a possibly infinite number of resources. | ||
span.resource = host_address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be great for the majority of use-cases. It can be common to have multiple endpoints on localhost (e.g. using haproxy to load balance traffic, or multiple services such as elastic search).
Possibly the resource could be set to span.resource = "#{host_address}:#{host_port}"
?
@jeremyolliver Great idea! I just added it. |
Can someone rekick circleCI? It looks like it had some mongo issues. Should I rebase on a particular branch? |
You probably want to double-check you're fully rebased on master, and that |
Did the rebase. Looks like I might have caught CircleCI during the AWS S3 issues yesterday. Tests are green locally. I don't have another commit to add to kick a build. Could someone kick it off? |
@buddhistpirate Triggered a rebuild for you. |
@buddhistpirate Sorry to have to prompt you for this again but, I think you're going to have to rebase again on master. The |
Tracking external requests by host is more useful for understanding performance issues than tracking by HTTP method. By using the host we should have a smaller potential list than if we were tracking by URL/endpoint. We don't always have a hostname, so lets use logic that exists for falling back to address.
@delner Rebased and looks like the tests are passing! 🚀 |
@delner We also find this to be more useful than using the HTTP verb as the resource. What are your thoughts on updating this for all the HTTP client library integrations like Faraday and Excon? |
I think the change we have in this PR is good. We'd like to improve consistency of this resource naming across all of our clients (in other contribs like Faraday, as well as other libraries like Go/Python), so we're currently discussing what this should look like; this change here is being taken in consideration with that. I apologize for the delay on getting this merged, but I'll check back in here when I have more to share. |
@NobodysNightmare Partly yes, partly no. Yes in the sense that its meant to make query strings and parts of the URL consistent. No in the sense that its not used by every integration, nor does it yet adhere to any URL standard for the Datadog tracers. That URL standard is still under discussion last I checked, but when that's decided upon, we'll likely update that URL quantizer module to reflect that. The URL quantization module was created for exactly this purpose; such that every integration has a common means of applying URL quantization in a consistent manner with each other. So we would encourage its use once it fully reflects the standard we'd like to see. |
@delner What is this blocked on at present? I'm still really hanging out for this feature, and quantization for http resources appears to have landed (or at least there's an implementation in |
@jeremyolliver sorry this one is taking a while. The code itself isn't so much of an issue as it is the content of Last I checked we were blocked on determining some sort of standard for the resource names of HTTP requests like this. We wanted this to be consistent with each integration, and ideally wanted to avoid making a change here now that wouldn't fit with that standard under discussion (because then we'd have to change it again and we didn't want to "pull the rug" out from under users.) We recently started a discussion internally to determine what this standard should be, so this is under active consideration. One of the bigger challenges here is how to meaningfully group similar HTTP requests so users can get the benefit of statistics in a way that makes sense for their application. (By path? Host/port? Query string? Status code? There are a lot of variables here.) I think it's somewhat likely that the resource name will end up being more than just this verb (which I'm not happy with either), but I'm awaiting on that consensus for now. |
@delner thanks for asking the tough questions and trying to get this into a good spot for now and in the future. i'm curious, would you take a PR that allows this functionality to be more easily overridden by users of the library in the short-term or maybe permanently if that doesn't cause issues for the product roadmap? e.g. if we could pull the lines that determine the resource name into a method that can be redefined or a configurable lambda. |
@samandmoore For sure, I think that's how I want to deal with this in the near term; make it configurable so users can present the resource as they see fit. There's a more general feature we've implemented in other tracers, called "span hooks", which might help to this end. The idea would be to add callbacks which bind to a specific context to which users can subscribe and modify spans with. This prototype still needs to be fleshed out, and could be a while till its live in a release, but it is on my to-do list. In the long term, our solution is to simply have better defaults (as we were discussing above), and hopefully reduce (or ideally eliminate) the need for users to implement custom configuration for such things at all. The better defaults will take more time because those require consistency with our other tracers. |
I'm also a proponent of making it user configurable in the near term. Should I open a new PR for that, or is that something that your team @delner is going to take on? It seems like with the existing configuration system it might be easy enough to do, or at least make it possible via monkey patch/prepend. |
I think that's something I'll probably take on, given I wanted it to become a facet of the new Integration model we've been developing: presumably such a "span hook" feature would simply be a module that can be composed and re-used by any integration, e.g. HTTP but also Rails, Rack, etc... In doing so, hopefully the interface for implementing or configuring a span hook would be very consistent across all integrations. I would however welcome any suggestions, sample configuration or use-cases I can consider while developing a prototype. If we can develop a vision for how this feature should behave, or what it should allow for, that will help us build a better, more useful prototype. Sample code or other prototypical implementations would be most desirable. I'll make a case for promoting this issue on our backlog. |
@delner My team recently starts to evaluate datadog apm. One of the pain points during the piloting phase is the resource name set by the http integration, which makes pretty hard to track external services calls. The above discussion definitely makes sense to me. Is there any update on this issue? |
I feel the pain too; something I definitely want to improve upon. I think this is under broader discussion internally right now, as we want to provide a consistent experience with HTTP libraries across each of the different languages we support. I don't know when we'll have a feature to implement from that, but in the meantime, I'd be happy to consider stop-gap measures that at least make this configurable on the user-side. |
@delner Thanks for the quick response! Totally understand it needs more time to figure out a solid approach for user customization. As to the short-term stop-gap measure, what do you think about using |
Oh sorry, just to clarify, I think we want to wait on changing out-of-the-box behavior, as we work through the problem domain a bit more. In the interim, we could open a PR that adds user configurability, so users can have their applications produce the resource names they want while we work on a better default. |
@delner Got it! Will leave it up to the team to decide upon the final out-of-box behavior. |
@delner I think introducing user configurable handling of resource names would be hugely helpful, even in the absence of changing the default. Being able to set something like: Datadog.configure do |c|
c.use :rails, service_name: 'myapp', distributed_tracing: true
# Net::HTTP
c.use :http, distributed_tracing: true, quantize: {|request|
# Lambda/proc that returns the string to use as the resource name
URI(request.uri).domain
}
end Would be fantastic. |
Any update on this issue? We setup APM at my company to improve system observability, and keep receiving questions from other eng teams about how to make sense of tracing data for outbound http calls. |
@delner Any thoughts on next steps here? |
Edit: moved my comment to #277 (comment) where the feature/issue is being discussed, rather than this community provided implementation |
@jeremyolliver I replied on #277 |
Followed the approach mentioned in #277 Implemented Closing this PR. |
Why is this feature discarded |
Tracking external requests by host is more useful for understanding performance issues than tracking by HTTP method. By using the host we should have a smaller potential list than if we were tracking by URL/endpoint.
We don't always have a hostname, so lets use logic that exists for falling back to address.
Testing Notes: I didn't add scenarios for testing
req.uri.host
vs@address