-
Notifications
You must be signed in to change notification settings - Fork 27
Add option for limiting the number of logs in a Span #39
Conversation
f95bbea
to
e4ed431
Compare
e4ed431
to
ec9d41e
Compare
if maxLogs := s.tracer.options.MaxLogsPerSpan; maxLogs > 0 && len(s.raw.Logs) == maxLogs { | ||
// We have too many logs. Overwrite the earliest log (use Logs as a circular | ||
// buffer). | ||
s.raw.Logs[s.numDroppedLogs%maxLogs] = ld |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like us to drop from the middle of the span log, not from the beginning. A log would read something like:
log 0
log 1
log 2
** dropped 5 events **
log 8
log 9
log 10
log 11
Since this is a significant change, we should plan to do it consistently across basictracer libraries. I'd like to get @bhs and @yurishkuro to comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Would that be about half of max before the dropped ones, and half after?
It's easy enough to implement whichever policy on my end; I will wait until there is agreement. It might be easier to access the discussion post-mortem if it's in #38 instead of here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to make this efficient, how about treating the 2nd half of the array as a circular buffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's what we would do (if folks agree on that policy).
// dropped event. | ||
} | ||
|
||
// The MaxLogsPerSpan limit does not apply to BulkLogData. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of limiting is to cap memory usage. If the caller is supplying a huge BulkLogData
, they are already storing that in memory. That said, I don't feel strongly - we can make those go through the normal limiting logic just as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough - can you add this to the comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do.
@@ -87,7 +94,8 @@ type Options struct { | |||
// returned object with a Tracer. | |||
func DefaultOptions() Options { | |||
return Options{ | |||
ShouldSample: func(traceID uint64) bool { return traceID%64 == 0 }, | |||
ShouldSample: func(traceID uint64) bool { return traceID%64 == 0 }, | |||
MaxLogsPerSpan: 100, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RaduBerinde you changed your mind about default being infinite?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as per the discussion in #38
} | ||
// Replace the oldest event. This means we are effectively dropping one more | ||
// event. | ||
s.raw.Logs[0].Event = fmt.Sprintf("** dropped %d events **", s.numDroppedLogs+1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps this makes more sense as a tag than a log
@yurishkuro If it's a tag, then we don't know where the dropped logs span On Mon, Sep 26, 2016 at 2:27 PM, Yuri Shkuro notifications@github.com
|
Also, in some cases tags might not be visible everywhere where logs are. A prime example is |
hm, if your limit is 10 and you dropped 50 logs, how exactly do you know where the dropped ones begin? My reasoning is that the number of dropped logs is an attribute of the whole span, attaching a timestamp to that information (via log) does not have a lot of meaning. |
I think what was meant was that the message makes it obvious which logs were dropped in relation to the ones which were not dropped. In the current change, the |
I was thinking that a limit serves two purposes: As for the timestamp, my proposal was that by inserting a log record at the point were logs are dropped, at least the user can see two complete subsections of the log, the beginning up until the drop point, and the drop point to the end. I.e., there are two pieces of information, how many were dropped and which time range was dropped. |
Sure, I will apply the limit to the bulk data as well. Also, is the default limit of 100 ok? (I don't feel strongly about any particular value, I would just avoid setting it too small given that it will change the behavior of current users) |
} | ||
// Replace the oldest event. This means we are effectively dropping one more | ||
// event. | ||
s.raw.Logs[0].Event = fmt.Sprintf("** dropped %d events **", s.numDroppedLogs+1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, you have unlucky timing... this is not going to compile at master
since the OpenTracing KV logging thing just landed today.
Thankfully, this is a good case for KV logging :)
Are there any objections to @jmacd's proposal to drop the events in the "middle"? If not, I will work on that. |
@RaduBerinde I think @jmacd's "middle-out" (har har har) proposal is the optimal thing if you can spare the time. We did something similar with Dapper's version of this change and it worked well. |
ec9d41e
to
d0b305a
Compare
Updated to "middle-out" :) One thing I'm not sure is how to "format" the "dropped logs" log record (see span.go:214). |
d0b305a
to
a0803ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: I didn't even try to review the possible off-by-one cases here. Aside from that caveat, the intent (and the default of 100) LGTM.
numNew := len(s.raw.Logs) - numOld | ||
rotateLogBuffer(s.raw.Logs[numOld:], s.numDroppedLogs%numNew) | ||
|
||
// Replace the log in the middle (the earliest "new" log) with information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
terminology nit: can you do s/earliest/oldest/
? It's an awkward term, but the only one that's unambiguous IMO. My brain things "early" means "new", though of course in this case they're opposite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
// Keep the timestamp of the last dropped event. | ||
Timestamp: s.raw.Logs[numOld].Timestamp, | ||
Fields: []log.Field{ | ||
otlog.String("event", "** dropped events **"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my two cents:
[]log.Field{
otlog.String("event", "dropped Span logs"),
otlog.Int("dropped_log_count", s.numDroppedLogs+1),
otlog.String("component", "basictracer"),
},
The "component" thing is just an idea... I think it'll make the error easier to track down for an unsuspecting tracing system user, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS: what's the +1
for here? (I just copied it from you)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding component. +1 is explained by the comment above - we are overwriting an event, hence dropping one more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving the +1 right next to the comment.
fcd1b20
to
a67d8c1
Compare
I see that Lightstep only shows the value of the |
@RaduBerinde no, that is a lightstep limitation and it should be fixed on that side. |
This LGTM... I'd like to wait for @yurishkuro and @jmacd to give a 👍 before merging, though. |
@@ -7,6 +7,7 @@ import ( | |||
opentracing "github.com/opentracing/opentracing-go" | |||
"github.com/opentracing/opentracing-go/ext" | |||
"github.com/opentracing/opentracing-go/log" | |||
otlog "github.com/opentracing/opentracing-go/log" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, fixed.
a67d8c1
to
492a88e
Compare
@@ -5,6 +5,7 @@ import ( | |||
"reflect" | |||
"testing" | |||
|
|||
"github.com/cockroachdb/cockroach/util/caller" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather we didn't pull a new dependency purely for a nicer error message that never happens in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, definitely, didn't mean to do that. goimports
is enabling my habits :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched to runtime.Caller()
492a88e
to
78912f6
Compare
@RaduBerinde you'll need to rebase this one... |
Updated. Thanks for the reviews! |
LGTM. I'll give others until tomorrow to raise any other concerns before merging (esp @jmacd since @yurishkuro already approved the change). Thanks again! |
The MaxLogsPerSpan option limits the number of logs in a Span. Events are dropped as necessary; a log in the finished span indicates how many were dropped. The default limit is 100. Fixes opentracing#38.
79ee6db
to
723bb40
Compare
We added log limits to basictracer in opentracing/basictracer-go#39 This change sets better options for our tracers: - drop all logs for spans which we are not recording (those that go net/trace) - increase the default value of the new maxLogsPerSpan setting to 1000 Fixes cockroachdb#9529.
We added log limits to basictracer in opentracing/basictracer-go#39 This change sets better options for our tracers: - drop all logs for spans which we are not recording (those that go net/trace) - increase the default value of the new maxLogsPerSpan setting to 1000 Fixes cockroachdb#9529.
The MaxLogsPerSpan option limits the number of logs in a Span. Oldest
events are dropped as necessary. The first log in the finished span indicates how
many were dropped.
The default limit is 100.
Fixes #38.
CC @tschottdorf @bensigelman