From a2f224f12f65f9ec5831cda72ac7ffb12972e91e Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com>
Date: Mon, 13 Jan 2025 21:01:11 +0000
Subject: [PATCH 1/2] [release-v2.7] [DOC] Add zone-aware ingesters doc (#4548)

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>
---
 .../tempo/operations/zone-aware-ingesters.md | 26 +++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100644 docs/sources/tempo/operations/zone-aware-ingesters.md

diff --git a/docs/sources/tempo/operations/zone-aware-ingesters.md b/docs/sources/tempo/operations/zone-aware-ingesters.md
new file mode 100644
index 00000000000..14155948b59
--- /dev/null
+++ b/docs/sources/tempo/operations/zone-aware-ingesters.md
@@ -0,0 +1,26 @@
+---
+title: Zone-aware replication for ingesters
+menuTitle: Zone-aware ingesters
+description: Configure zone-aware ingesters for Tempo
+weight: 900
+---
+
+# Zone-aware replication for ingesters
+
+Zone awareness is a feature that ensures data is replicated across failure domains (which we refer to as “zones”) to provide greater reliability.
+A failure domain is whatever you define it to be, but is commonly an availability zone, data center, or server rack.
+
+When zone awareness is enabled for ingesters, incoming trace data is guaranteed to be replicated to ingesters in different zones.
+This allows the system to withstand the loss of one or more zones (depending on the replication factor).
+
+Example:
+
+```yaml
+# Use the following fields in the _config field of the Jsonnet config to enable zone-aware ingesters.
+  multi_zone_ingester_enabled: false,
+  multi_zone_ingester_migration_enabled: false,
+  multi_zone_ingester_replicas: 0,
+  multi_zone_ingester_max_unavailable: 25,
+```
+
+For an example configuration, refer to the [Jsonnet microservices operations example](https://github.com/grafana/tempo/blob/main/operations/jsonnet/microservices/README.md).
\ No newline at end of file

From f49681091c451cd0fdcb0b9155bab088cbf53436 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com>
Date: Mon, 13 Jan 2025 21:16:12 +0000
Subject: [PATCH 2/2] [release-v2.7] [DOC] Update examples to TraceQL doc (#4547)

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>
Co-authored-by: Mario
---
 docs/sources/tempo/traceql/_index.md | 247 +++++++++++++++------------
 1 file changed, 140 insertions(+), 107 deletions(-)

diff --git a/docs/sources/tempo/traceql/_index.md b/docs/sources/tempo/traceql/_index.md
index 52a07a7c66d..2d3071b05b1 100644
--- a/docs/sources/tempo/traceql/_index.md
+++ b/docs/sources/tempo/traceql/_index.md
@@ -63,12 +63,127 @@ In this example, the search reduces traces to those spans where:
 * `http.status_code` is in the range of `200` to `299` and
 * the number of matching spans within a trace is greater than two.
 
-Queries select sets of spans and filter them through a pipeline of aggregators and conditions. If, for a given trace, this pipeline produces a spanset then it is included in the results of the query.
+Queries select sets of spans and filter them through a pipeline of aggregators and conditions.
+If, for a given trace, this pipeline produces a spanset then it's included in the results of the query.
+
+Refer to [TraceQL metrics queries](https://grafana.com/docs/tempo//traceql/metrics-queries/) for examples of TraceQL metrics queries.
+
+### Find traces of a specific operation
+
+To find traces of a specific operation, specify both the operation name (the span attribute `name`) and the name of the service that holds this operation (the resource attribute `service.name`) for proper filtering.
+In the example below, traces are filtered on the `resource.service.name` value `frontend` and the span `name` value `POST /api/orders`:
+
+```
+{resource.service.name = "frontend" && name = "POST /api/orders"}
+```
+
+When using the same Grafana stack for multiple environments (for example, `production` and `staging`) or having services that share the same name but are differentiated through their namespace, the query looks like:
+
+```
+{
+  resource.service.namespace = "ecommerce" &&
+  resource.service.name = "frontend" &&
+  resource.deployment.environment = "production" &&
+  name = "POST /api/orders"
+}
+```
+
+### Find traces having a particular outcome
+
+This example finds all traces on the operation `POST /api/orders` that have a span that has errored:
+
+```
+{
+  resource.service.name="frontend" &&
+  name = "POST /api/orders" &&
+  status = error
+}
+```
+
+This example finds all traces on the operation `POST /api/orders` that return with an HTTP 5xx error:
+
+```
+{
+  resource.service.name="frontend" &&
+  name = "POST /api/orders" &&
+  span.http.status_code >= 500
+}
+```
+
+### Find traces that have a particular behavior
+
+You can use query filtering on multiple spans of the traces.
+This example locates all the traces of the `GET /api/products/{id}` operation that access a database. It's a convenient request to identify abnormal access ratios to the database caused by caching problems.
+
+```
+{resource.service.name="frontend" && name = "GET /api/products/{id}"} && {.db.system="postgresql"}
+```
+
+### Find traces going through `production` and `staging` instances
+
+This example finds traces that go through `production` and `staging` instances.
+It's a convenient request to identify misconfigurations and leaks across production and non-production environments.
+
+```
+{ resource.deployment.environment = "production" } && { resource.deployment.environment = "staging" }
+```
+
+### Use structural operators
+
+Find traces that include the `frontend` service, where either that service or a downstream service includes a span where an error is set.
+
+```
+{ resource.service.name="frontend" } >> { status = error }
+```
+
+Find all leaf spans that end in the `productcatalogservice`.
+
+```
+{ } !< { resource.service.name = "productcatalogservice" }
+```
+
+Find if `productcatalogservice` and `frontend` are siblings.
+
+```
+{ resource.service.name = "productcatalogservice" } ~ { resource.service.name="frontend" }
+```
+
+### Other examples
+
+Find the services where the http status is 200, and list the service name the span belongs to along with returned traces.
+
+```
+{ span.http.status_code = 200 } | select(resource.service.name)
+```
+
+Find any trace with an unscoped `deployment.environment` attribute set to `production` and `http.status_code` attribute set to `200`:
+
+```
+{ .deployment.environment = "production" && span.http.status_code = 200 }
+```
+
+Find any trace where spans within it have a `deployment.environment` resource attribute set to `production` and a span `http.status_code` attribute set to `200`. In previous examples, all conditions had to be true on one span. These conditions can be true on either different spans or the same spans.
+
+```
+{ resource.deployment.environment = "production" } && { span.http.status_code = 200 }
+```
+
+Find any trace where any span has an `http.method` attribute set to `GET` as well as a `status` attribute set to `ok`, and where any other span has an `http.method` attribute set to `DELETE`, but doesn't have a `status` attribute set to `ok`:
+
+```
+{ span.http.method = "GET" && status = ok } && { span.http.method = "DELETE" && status != ok }
+```
+
+Find any trace with a `deployment.environment` attribute that matches the regex `prod-.*` and `http.status_code` attribute set to `200`:
+
+```
+{ resource.deployment.environment =~ "prod-.*" && span.http.status_code = 200 }
+```
 
 ## Selecting spans
 
-In TraceQL, curly brackets `{}` always select a set of spans from the current trace.
-They are commonly paired with a condition to reduce the spans being passed in.
+In TraceQL, curly brackets `{}` always select a set of spans from available traces.
+Curly brackets are commonly paired with a condition to reduce the spans fetched.
 
 TraceQL differentiates between two types of span data: intrinsics, which are fundamental to spans, and attributes, which are customizable key-value pairs.
 You can use intrinsics and attributes to build filters and select spans.
@@ -92,6 +207,8 @@ Intrinsics example:
 Custom attributes are prefixed with `.`, such as `span.`, `resource.` , `link.`, or `event`.
 Resource has no intrinsic values.
 It only has custom attributes.
+
+Attributes are separated by a period (`.`), and intrinsic fields use a colon (`:`).
 The `trace` scope is only an intrinsic and doesn't have any custom attributes at the trace level.
 
 Attributes example:
@@ -117,11 +234,11 @@ The following table shows the current available scoped intrinsic fields:
 | `trace:duration` | duration | max(end) - min(start) time of the spans in the trace | `{ trace:duration > 100ms }` |
 | `trace:rootName` | string | if it exists, the name of the root span in the trace | `{ trace:rootName = "HTTP GET" }` |
 | `trace:rootService` | string | if it exists, the service name of the root span in the trace | `{ trace:rootService = "gateway" }` |
-| `trace:id` | string | trace id using hex string | `{ trace:id = "1234567890abcde" }` |
+| `trace:id` | string | trace ID using hex string | `{ trace:id = "1234567890abcde" }` |
 | `event:name` | string | name of event | `{ event:name = "exception" }` |
 | `event:timeSinceStart` | duration | time of event in relation to the span start time | `{ event:timeSinceStart > 2ms}` |
-| `link:spanID` | string | link span id using hex string | `{ link:spanID = "0000000000000001" }` |
-| `link:traceID` | string | link trace id using hex string | `{ link:traceID = "1234567890abcde" }` |
+| `link:spanID` | string | link span ID using hex string | `{ link:spanID = "0000000000000001" }` |
+| `link:traceID` | string | link trace ID using hex string | `{ link:traceID = "1234567890abcde" }` |