-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enables duration queries by promoting Span.duration, timestamp #807
Conversation
* example, SERVER_SEND.timestamp - SERVER_RECV.timestamp. | ||
* | ||
* Note that this should be treated unsigned. i32 implies trace durations are | ||
* not longer than 35.79 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's a reasonable limitation. Just the other day I had a discussion with the data team where they sometimes have spans measured in hours. I would go for i64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
The last sentence does not follow from the requirements. Yes, we do need an index, but we don't need to capture start/duration in the collected spans, we just need a way to calculate it in the collector. We could use cs/cr for that, provided they reach the collector in a single submission from instrumentation. |
These are my concerns with adding the fields to thrift:
|
btw, what is your concern with having these two fields be implementation-specific? The span API that query service deals with can have them as def's, and storage implementations fill it. In many cases the actual calculation should be the same, e.g. if it's from annotations, so that logic can be refactored to a base class or composition. I thought right now it it's already in the shared class. |
ok, I'll shut up, go for it. I believe in my system we may be able to calculate those fields in the collector. |
Glad it might work. Just added some more detail around how I see this fitting in with local spans. PTAL, as it may make more sense within this context #808 |
55fbd8a
to
41a3e74
Compare
Re-summarized in the thrift definition and WIP description. I removed my related redundant comments. |
41a3e74
to
ae4d077
Compare
This moves existing code around the notion of Span.timestamp, Span.duration discussed in #807. The impact from a user POV is minimal. `endTs`, used by the query and UI formerly looked for the last timestamp in a trace. What this meant is that if someone clicked search, waited, then clicked search again with the same `endTs`, an in-flight trace may "disappear" if it has new activity. Since `endTs` is now based on a stable point (the start), a trace wouldn't disappear anymore. This impact is so subtle that it is barely worth discussing. The primary motivation for this change is to simplify the commodity task of timestamping and duration stamping spans. This is discussed #807, and directly supports a new minimal design of local spans (#808).
Note I plan to start on this today, as soon as the supporting changes are released as 1.21.1 |
ETA tomorrow! |
expect an update tomorrow pacific AM. Just polishing tests |
Before this change, Span.timestamp and duration were always derived lazily from annotations. This decouples that logic by converting the scala methods to vals and populating them with `ApplyTimestampAndDuration`. This serves us in at least two ways. First, the implicit association between annotations and timestamp or duration was tested in various components. Making these explicit centralizes the responsibility, and lowers the test burden on other components. Also, we want to formalize these fields in persisted models in support of duration queries and local spans (#807). Organizing logic ahead of this work makes the change to persistence simpler.
Before this change, Span.timestamp and duration were always derived lazily from annotations. This decouples that logic by converting the scala methods to vals and populating them with `ApplyTimestampAndDuration`. This serves us in at least two ways. First, the implicit association between annotations and timestamp or duration was tested in various components. Making these explicit centralizes the responsibility, and lowers the test burden on other components. Also, we want to formalize these fields in persisted models in support of duration queries and local spans (#807). Organizing logic ahead of this work makes the change to persistence simpler.
With this in place, instrumentation can send timestamp and duration explicitly, which facilitates local spans or other spans who don't log RPC annotations.
ae4d077
to
80fba00
Compare
ok. this is finally complete. Once merged, I'll raise a pull request for duration query. |
Enables duration queries by promoting Span.duration, timestamp
A commonly requested feature is querying by duration, for example showing traces longer than 1 minute.
While the UI supports ordering by duration, it is a game of chance whether the traces returned happen to be long or not. Even if you put a limit of 1000, you still may not find the longest trace (as your storage system may have a million traces in it).
The path proposed is top-level Span.timestamp, and Span.duration, which formerly existed in the scala and mustache models, but not in the thrift.
We could implement duration queries without changing thrifts. For example, this could be an implementation detail of each storage system. However, this limits testing to side-effects and keeps essential details in scala code, making portability harder.
Also, we have other reasons to top-level these fields:
Here's a convenience paste of the definitions of Span.timestamp and Span.duration