docs: telemetry reorg #6639

Merged
merged 13 commits into from
Feb 11, 2025
@@ -6,7 +6,7 @@ context:
- telemetry
---

import TelemetryPerformanceNote from '../../../shared/telemetry-performance.mdx';
import TelemetryPerformanceNote from '../../../../shared/telemetry-performance.mdx';

Since the router is the single access point for all traffic to and from your graph, router telemetry is the most comprehensive way to observe your supergraph. By implementing telemetry, you can:

@@ -202,9 +202,25 @@ telemetry:
response_header: "x-my-header"
```

## Event configuration example
## Event configuration reference

| Option | Values | Default | Description |
|--------------------|------------------------------------------------------------------------------|---------|-------------------------------------------------------------|
| `<attribute-name>` | | | The name of the custom attribute. |
| `attributes` | [standard attributes](/router/configuration/telemetry/instrumentation/standard-attributes) or [selectors](/router/configuration/telemetry/instrumentation/selectors) | | The attributes of the custom log event. |
| `condition` | [conditions](/router/configuration/telemetry/instrumentation/conditions) | | The condition that must be met for the event to be emitted. |
| `error` | `trace`\|`info`\|`warn`\|`error`\| `off` | `off` | The level of the error log event. |
| `level` | `trace`\|`info`\|`warn`\|`error`\| `off` | `off` | The level of the custom log event. |
| `message` | | | The message of the custom log event. |
| `on` | `request`\|`response`\|`error` | | When to trigger the event. |
| `request` | `trace`\|`info`\|`warn`\|`error`\| `off` | `off` | The level of the request log event. |
| `response` | `trace`\|`info`\|`warn`\|`error`\| `off` | `off` | The level of the response log event. |
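
As a minimal sketch of how the options in this table fit together, the fragment below sets the three standard events and defines one custom event. The event name `acme.flagged.request` and the header `x-my-header` are hypothetical placeholders, and the `exists` condition with a `request_header` selector is an assumption about what fits your setup; check the linked conditions and selectors references before relying on it.

```yaml
telemetry:
  instrumentation:
    events:
      router:
        # Standard events: each takes a log level (default is off)
        request: info
        response: info
        error: error
        # Custom event (hypothetical name)
        acme.flagged.request:
          message: "flagged request received"
          level: info
          on: request
          # Emit the event only when the (hypothetical) header is present
          condition:
            exists:
              request_header: "x-my-header"
          attributes:
            # <attribute-name> (hypothetical) mapped to a selector
            acme.flag.value:
              request_header: "x-my-header"
```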

## Event configuration examples

For example, the router service can be configured with standard events (`request`, `response`, `error`), and a custom event (`my.event`) with a condition:
### Standard and custom events

You can use both standard events and custom events in the same configuration. The example below has all the standard events (`request`, `response`, `error`) and one custom event (`my.event`) with a condition:

```yaml title="future.router.yaml"
telemetry:
@@ -237,17 +253,34 @@ telemetry:
# Custom event configuration for HTTP connectors ...
```

## Event configuration reference
### Debugging subscriptions

When developing and debugging the router, you might want to log all subscription events. The example configuration below logs all subscription events for both errors and data.

<Note>

Logs of subscription errors and data may contain personally identifiable information (PII). Enable this configuration only in development environments, never in production.

</Note>

```yaml title="router.yaml"
telemetry:
  instrumentation:
    events:
      supergraph:
        subscription.event:
          message: subscription event
          on: event_response # on every subscription event
          level: info
          # Only display event if it's a subscription event
          condition:
            eq:
              - operation_kind: string
              - subscription
          attributes:
            response.data:
              response_data: $ # Display all the response data payload
            response.errors:
              response_errors: $ # Display all the response errors payload
```
```
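
If the full data payload is more than you need, one possible variation on the configuration above is to keep only the errors attribute. This is a sketch that reuses the same selectors as the example; the event name `subscription.error.event` is a hypothetical placeholder, and error payloads can still contain PII, so the same development-only caution applies.

```yaml
telemetry:
  instrumentation:
    events:
      supergraph:
        # Hypothetical event name for an errors-only variant
        subscription.error.event:
          message: subscription event errors
          on: event_response
          level: info
          condition:
            eq:
              - operation_kind: string
              - subscription
          attributes:
            # The response.data attribute is omitted on purpose
            response.errors:
              response_errors: $
```
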
@@ -514,3 +514,43 @@ telemetry:
| `description` | | | The description of the custom instrument. |
| `value` | `unit` \| `duration` \| `<custom>` \| `event_unit` \| `event_duration` \| `event_custom` | | The value of the instrument. |

### Production instrumentation example

At minimum, observability of a router running in production requires knowing about errors that arise from operations and subgraphs.

The example configuration below adds instruments with both standard OpenTelemetry attributes and custom attributes to extract information about erring operations:

```yaml title="router.yaml"
telemetry:
  instrumentation:
    instruments:
      router:
        http.server.request.duration:
          # Adding response status code from the router and the operation name
          attributes:
            http.response.status_code: true
            graphql.operation.name:
              operation_name: string
            # This attribute will be set to true if the response contains graphql errors
            graphql.errors:
              on_graphql_error: true
        http.server.response.body.size:
          attributes:
            graphql.operation.name:
              operation_name: string
      subgraph:
        # Adding subgraph name, response status code from the subgraph and original operation name from the supergraph
        http.client.request.duration:
          attributes:
            subgraph.name: true
            http.response.status_code:
              subgraph_response_status: code
            graphql.operation.name:
              supergraph_operation_name: string
            # This attribute will be set to true if the response contains graphql errors
            graphql.errors:
              subgraph_on_graphql_error: true
        http.client.request.body.size:
          attributes:
            subgraph.name: true
```
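
Besides attaching attributes to standard instruments, a custom instrument can count erring operations directly. The sketch below assumes the custom instrument options `type`, `value`, `unit`, `description`, and `condition`; the instrument name `acme.graphql.error.count` is a hypothetical placeholder. It reuses the `on_graphql_error` and `operation_name` selectors from the example above.

```yaml
telemetry:
  instrumentation:
    instruments:
      router:
        # Hypothetical custom counter of responses that contain GraphQL errors
        acme.graphql.error.count:
          type: counter
          value: unit
          unit: count
          description: "Number of router responses containing GraphQL errors"
          # Increment only when the response contains GraphQL errors
          condition:
            eq:
              - true
              - on_graphql_error: true
          attributes:
            graphql.operation.name:
              operation_name: string
```
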
@@ -8,9 +8,15 @@ context:

## Standard metric instruments

GraphOS Router and Apollo Router Core provide a set of non-configurable metric instruments that expose detailed information about the router's request lifecycle.
GraphOS Router and Apollo Router Core provide a set of standard router instruments that expose detailed information about the router's request lifecycle. You can consume the metrics they capture by configuring a [metrics exporter](/router/configuration/telemetry/exporters/metrics/overview).

These instruments can be consumed by configuring a [metrics exporter](/router/configuration/telemetry/exporters/metrics/overview).
Standard router instruments are different from OpenTelemetry (OTel) instruments or custom instruments:

- Router instruments provide standard metrics about the router request lifecycle and have names starting with `apollo.router` or `apollo_router`.
- OTel instruments provide metrics about the HTTP lifecycle and have names starting with `http`.
- Custom instruments provide customized metrics about the router request lifecycle.
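
To consume any of these instruments you need a metrics exporter. As a minimal sketch, assuming you scrape the router with Prometheus, a configuration along these lines exposes a metrics endpoint; the `listen` address and `path` shown are assumptions to adapt to your deployment, and the exporter documentation linked above is the authoritative reference.

```yaml
telemetry:
  exporters:
    metrics:
      prometheus:
        enabled: true
        # Assumed address and path; adjust for your environment
        listen: 127.0.0.1:9090
        path: /metrics
```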

The rest of this reference lists the available standard router instruments.

### GraphQL

@@ -6,9 +6,6 @@ context:
- telemetry
---

import BatchProcessorPreamble from "../../../../../shared/batch-processor-preamble.mdx";
import BatchProcessorRef from "../../../../../shared/batch-processor-ref.mdx";

Enable and configure the [Jaeger exporter](https://www.jaegertracing.io/) for tracing in the GraphOS Router or Apollo Router Core.

For general tracing configuration, refer to [Router Tracing Configuration](/router/configuration/telemetry/exporters/tracing/overview).