To mitigate the potential flood of spans to a backend, agents SHOULD implement the strategies laid out in this section to avoid sending almost identical and very similar spans.
While compressing multiple similar spans into a single composite span can't fully eliminate the collection overhead, it can significantly reduce the impact on the following areas, with very little loss of information:
- Agent reporter queue utilization
- Capturing stack traces, serialization, compression, and sending events to APM Server
- Potential to re-use span objects, significantly reducing allocations
- Downstream effects like reducing impact on APM Server, ES storage, and UI performance
Setting this option to true will enable span compression feature. Span compression reduces the collection, processing, and storage overhead, and removes clutter from the UI. The tradeoff is that some information such as DB statements of all the compressed spans will not be collected.
Type | boolean |
Default | true |
Dynamic | true |
One of the biggest sources of excessive data collection are n+1 type queries and repetitive requests to a cache server. This strategy detects consecutive spans that hold the same information (except for the duration) and creates a single composite span.
[ ]
GET /users
[] [] [] [] [] [] [] [] [] []
10x SELECT FROM users
Two spans are considered to be an exact match if they are of the same kind and if their span names are equal:
type
subtype
destination.service.resource
(DEPRECATED, replaced byservice.target.{type,name}
)service.target.type
service.target.name
name
Consecutive spans that are exact match and that are under this threshold will be compressed into a single composite span. This option does not apply to composite spans. This reduces the collection, processing, and storage overhead, and removes clutter from the UI. The tradeoff is that the DB statements of all the compressed spans will not be collected.
Type | duration |
Default | 50ms |
Dynamic | true |
Another pattern that often occurs is a high amount of alternating queries to the same backend. Especially if the individual spans are quite fast, recording every single query is likely to not be worth the overhead.
[ ]
GET /users
[] [] [] [] [] [] [] [] [] []
10x Calls to mysql
Two spans are considered to be of the same type if the following properties are equal:
type
subtype
destination.service.resource
(DEPRECATED, replaced byservice.target.{type,name}
)service.target.type
service.target.name
boolean isSameKind(Span other) {
return type == other.type
&& subtype == other.subtype
&& service.target.type == other.service.target.type
&& service.target.name == other.service.target.name
}
When applying this compression strategy, the span.name
is set to Calls to <...>
determined from service.target.*
as follows:
function getCompositeSpanName(serviceTarget) {
const prefix = 'Calls to '
if (!serviceTarget.type) {
if (!serviceTarget.name) {
return prefix + 'unknown'
} else {
return prefix + serviceTarget.name
}
} else if (!serviceTarget.name) {
return prefix + serviceTarget.type
} else {
return prefix + serviceTarget.type + '/' + serviceTarget.name
}
}
The rest of the context, such as the db.statement
will be determined by the first compressed span, which is turned into a composite span.
Consecutive spans to the same destination that are under this threshold will be compressed into a single composite span. This option does not apply to composite spans. This reduces the collection, processing, and storage overhead, and removes clutter from the UI. The tradeoff is that the DB statements of all the compressed spans will not be collected.
Type | duration |
Default | 0ms |
Dynamic | true |
Compressed spans don't have a physical span document. Instead, multiple compressed spans are represented by a composite span.
The timestamp
and duration
have slightly similar semantics,
and they define properties under the composite
context.
timestamp
: The start timestamp of the first span.duration
: gross duration (i.e., <last compressed span's end timestamp> - <first compressed span's start timestamp>).composite
count
: The number of compressed spans this composite span represents. The minimum count is 2 as a composite span represents at least two spans.sum
: sum of durations of all compressed spans this composite span represents in milliseconds. Thussum
is the net duration of all the compressed spans whileduration
is the gross duration (including "whitespace" between the spans).compression_strategy
: A string value indicating which compression strategy was used. The valid values are:exact_match
- Consecutive-Exact-Match compression strategysame_kind
- Consecutive-Same-Kind compression strategy
As laid out in the span destination spec,
APM Server tracks span destination metrics.
To avoid compressed spans to skew latency metrics and cause throughput metrics to be under-counted,
APM Server will take composite.count
into account when tracking span destination metrics.
Effects on span count
When a span is compressed into a composite,
span_count.reported
should ONLY count the compressed composite as a single
span.
Spans that have been compressed into the composite should not be counted.
A span is eligible for compression if all the following conditions are met
- It's an exit span
- The trace context of this span has not been propagated to a downstream service
- If the span has
outcome
(i.e.,outcome
is present and it's notnull
) then it should besuccess
. It means spans with outcome indicating an issue of potential interest should not be compressed.
The second condition is important so that we don't remove (compress) a span that may be the parent of a downstream service. This would orphan the sub-graph started by the downstream service and cause it to not appear in the waterfall view.
boolean isCompressionEligible() {
return exit && !context.hasPropagated && (outcome == null || outcome == "success")
}
When a span ends, if it is not compression-eligible or if its parent has already ended, it may be reported immediately. Otherwise, it does not immediately get reported. Instead, the span is buffered within its parent. A span/transaction can buffer at most one child span.
Span buffering allows to "look back" one span when determining whether a given span should be compressed.
A buffered span gets reported when
- its parent ends
- a non-compressible sibling ends
void onEnd() {
if (buffered != null) {
report(buffered)
}
}
void onChildEnd(Span child) {
if (ended || !child.isCompressionEligible()) {
if (buffered != null) {
report(buffered)
buffered = null
}
report(child)
return
}
if (buffered == null) {
buffered = child
return
}
if (!buffered.tryToCompress(child)) {
report(buffered)
buffered = child
}
}
Spans have tryToCompress
method that is called on a span buffered by its parent.
On the first call the span checks if it can be compressed with the given sibling and it selects the best compression strategy.
Note that the compression strategy selected only once based on the first two spans of the sequence.
The compression strategy cannot be changed by the rest the spans in the sequence.
So when the current sibling span cannot be added to the ongoing sequence under the selected compression strategy
then the ongoing is terminated, it is sent out as a composite span and the current sibling span is buffered.
If the spans are of the same kind, and have the same name and both spans duration
<= span_compression_exact_match_max_duration
,
we apply the Consecutive-Exact-Match compression strategy.
Note that if the spans are exact match
but duration threshold requirement is not satisfied we just stop compression sequence.
In particular it means that the implementation should not proceed to try same kind strategy.
Otherwise user would have to lower both span_compression_exact_match_max_duration
and span_compression_same_kind_max_duration
to prevent longer exact match spans from being compressed.
If the spans are of the same kind but have different span names and both spans duration
<= span_compression_same_kind_max_duration
,
we compress them using the Consecutive-Same-Kind compression strategy.
bool tryToCompress(Span sibling) {
isAlreadyComposite = composite != null
canBeCompressed = isAlreadyComposite ? tryToCompressComposite(sibling) : tryToCompressRegular(sibling)
if (!canBeCompressed) {
return false
}
if (!isAlreadyComposite) {
composite.count = 1
composite.sum = duration
}
++composite.count
composite.sum += other.duration
return true
}
bool tryToCompressRegular(Span sibling) {
if (!isSameKind(sibling)) {
return false
}
if (name == sibling.name) {
if (duration <= span_compression_exact_match_max_duration && sibling.duration <= span_compression_exact_match_max_duration) {
composite.compressionStrategy = "exact_match"
return true
}
return false
}
if (duration <= span_compression_same_kind_max_duration && sibling.duration <= span_compression_same_kind_max_duration) {
composite.compressionStrategy = "same_kind"
name = "Calls to " + destination.service.resource
return true
}
return false
}
bool tryToCompressComposite(Span sibling) {
switch (composite.compressionStrategy) {
case "exact_match":
return isSameKind(sibling) && name == sibling.name && sibling.duration <= span_compression_exact_match_max_duration
case "same_kind":
return isSameKind(sibling) && sibling.duration <= span_compression_same_kind_max_duration
}
}
The pseudo-code in this spec is intentionally not written in a thread-safe manner to make it more concise. Also, thread safety is highly platform/runtime dependent, and some don't support parallelism or concurrency.
However, if there can be a situation where multiple spans may end concurrently, agents MUST guard against race conditions. To do that, agents should prefer lock-free algorithms paired with retry loops over blocking algorithms that use mutexes or locks.
In particular, operations that work with the buffer require special attention:
- Setting a span into the buffer must be handled atomically.
- Retrieving a span from the buffer must be handled atomically.
Retrieving includes atomically getting and clearing the buffer.
This makes sure that only one thread can compare span properties and call mutating methods, such as
compress
at a time.