
[processor/filter] Add telemetry for dropped metrics, logs, and spans #29081

Merged · 22 commits · Dec 19, 2023

Conversation

@MacroPower (Contributor) commented Nov 9, 2023

Description:

Adds telemetry for metrics, logs, and spans that were intentionally dropped via a filterprocessor. Specifically, the following metrics are added:

otelcol_processor_filter_datapoints_filtered
otelcol_processor_filter_logs_filtered
otelcol_processor_filter_spans_filtered

Please let me know any feedback/thoughts on the naming or anything else!
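
For context, here is a minimal, self-contained OpenCensus sketch of how a processor can register and record this kind of counter. It is an illustration only, not this PR's actual code; the measure/view name, tag key, and recorded values below are placeholders.

```go
package main

import (
	"context"
	"log"

	"go.opencensus.io/stats"
	"go.opencensus.io/stats/view"
	"go.opencensus.io/tag"
)

// Illustrative names only; the PR's actual measures, views, and tag keys may differ.
var (
	filterKey     = tag.MustNewKey("filter")
	spansFiltered = stats.Int64(
		"processor/filter/spans.filtered",
		"Number of spans dropped by the filter processor",
		stats.UnitDimensionless,
	)
)

func main() {
	// Registering a view turns the measure into the cumulative counter that the
	// collector's own telemetry endpoint (:8888/metrics) can expose.
	if err := view.Register(&view.View{
		Name:        spansFiltered.Name(),
		Description: spansFiltered.Description(),
		Measure:     spansFiltered,
		TagKeys:     []tag.Key{filterKey},
		Aggregation: view.Sum(),
	}); err != nil {
		log.Fatal(err)
	}

	// In the traces-processing path, after the filter conditions dropped some spans,
	// record how many were removed, tagged with the processor instance name.
	ctx, err := tag.New(context.Background(), tag.Insert(filterKey, "filter/test"))
	if err != nil {
		log.Fatal(err)
	}
	stats.Record(ctx, spansFiltered.M(1))
}
```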

Link to tracking Issue: #13169

Testing: I used batchprocessor as an example for a couple of tests, Filter*ProcessorTelemetryWithOC. I kept the wrapping code so that OTel-based versions can easily be added once that support is ready in contrib. The tests are not especially comprehensive and I can improve them if needed, but as-is they were helpful for debugging.
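
To show the shape of the OC-style assertion such a test performs, here is a hedged, standalone sketch: it records to a throwaway measure and reads the registered view back, rather than exercising the filter processor's real telemetry wrapper. The package, test, and measure names are all illustrative.

```go
package filtertelemetry_test

import (
	"context"
	"testing"

	"go.opencensus.io/stats"
	"go.opencensus.io/stats/view"
)

// TestFilteredCounter demonstrates the pattern: record to a measure, then read
// the registered view back and check the aggregated value. The measure and view
// here are local to the test; the real tests exercise the processor's telemetry.
func TestFilteredCounter(t *testing.T) {
	m := stats.Int64("test/spans.filtered", "spans dropped in test", stats.UnitDimensionless)
	v := &view.View{Name: m.Name(), Measure: m, Aggregation: view.Sum()}
	if err := view.Register(v); err != nil {
		t.Fatal(err)
	}
	defer view.Unregister(v)

	// Pretend the processor dropped two spans.
	stats.Record(context.Background(), m.M(2))

	rows, err := view.RetrieveData(m.Name())
	if err != nil {
		t.Fatal(err)
	}
	if len(rows) != 1 {
		t.Fatalf("expected 1 row, got %d", len(rows))
	}
	if got := rows[0].Data.(*view.SumData).Value; got != 2 {
		t.Fatalf("expected 2 filtered spans, got %v", got)
	}
}
```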

Additionally, here's some stuff you can use for manual testing.

There might be a better way to do this, but I just used hostmetrics, filelog, and this article from Honeycomb (https://www.honeycomb.io/blog/test-span-opentelemetry-collector) with otlp/http.

Note: this should be run from the root of the contrib repo.

Add/overwrite `local/config.yaml` and `local/span.json`, then run:

```bash
mkdir -p local

cat >local/config.yaml <<EOL
receivers:
  hostmetrics:
    collection_interval: 30s
    initial_delay: 1s
    scrapers:
      load:
  filelog:
    include:
      ## echo '{"timestamp":"2023-12-18 12:00:00","msg":"foo"}' >> /tmp/otel-test.log
      ## echo '{"timestamp":"2023-12-18 12:00:00","msg":"bar"}' >> /tmp/otel-test.log
      ## echo '{"timestamp":"2023-12-18 12:00:00","msg":"baz"}' >> /tmp/otel-test.log
      - /tmp/otel-test.log
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.timestamp
          layout: "%Y-%m-%d %H:%M:%S"
  otlp:
    protocols:
      ## curl -i http://localhost:4318/v1/traces -X POST -H "Content-Type: application/json" -d @local/span.json
      http:

processors:
  filter/test:
    metrics:
      metric:
        # Should drop 2 of the 3 metrics, 5m average remains
        - 'name=="system.cpu.load_average.1m"'
        - 'name=="system.cpu.load_average.15m"'
    logs:
      log_record:
        # Should filter out "bar" and "baz"
        - 'IsMatch(body, ".*ba.*")'
    traces:
      span:
        # Should drop 1 of the 2 spans
        - 'name == "foobar"'

exporters:
  debug:
    verbosity: detailed
    sampling_initial: 5
    sampling_thereafter: 200

service:
  extensions: []
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [filter/test]
      exporters: [debug]
    logs:
      receivers: [filelog]
      processors: [filter/test]
      exporters: [debug]
    traces:
      receivers: [otlp]
      processors: [filter/test]
      exporters: [debug]

  telemetry:
    logs:
      level: debug
    metrics:
      level: detailed
      address: 0.0.0.0:8888
EOL

cat >local/span.json <<EOL
{
  "resourceSpans": [
    {
      "resource": {
        "attributes": [
          {
            "key": "service.name",
            "value": {
              "stringValue": "test-with-curl"
            }
          }
        ]
      },
      "scopeSpans": [
        {
          "scope": {
            "name": "manual-test"
          },
          "spans": [
            {
              "traceId": "71699b6fe85982c7c8995ea3d9c95df2",
              "spanId": "3c191d03fa8be065",
              "name": "spanitron",
              "kind": 2,
              "droppedAttributesCount": 0,
              "events": [],
              "droppedEventsCount": 0,
              "status": {
                "code": 1
              }
            },
            {
              "traceId": "71699b6fe85982c7c8995ea3d9c95df2",
              "spanId": "2f357b34d32f77b4",
              "name": "foobar",
              "kind": 2,
              "droppedAttributesCount": 0,
              "events": [],
              "droppedEventsCount": 0,
              "status": {
                "code": 1
              }
            }
          ]
        }
      ]
    }
  ]
}
EOL

make run
```

Send some data to the receivers:

```bash
# Write some logs
echo '{"timestamp":"2023-12-18 12:00:00","msg":"foo"}' >> /tmp/otel-test.log
echo '{"timestamp":"2023-12-18 12:00:00","msg":"bar"}' >> /tmp/otel-test.log
echo '{"timestamp":"2023-12-18 12:00:00","msg":"baz"}' >> /tmp/otel-test.log

# Write some spans
curl -i http://localhost:4318/v1/traces -X POST -H "Content-Type: application/json" -d @local/span.json
```

Check the results:

```console
$ curl http://localhost:8888/metrics | grep filtered
# HELP otelcol_processor_filter_datapoints_filtered Number of metric data points dropped by the filter processor
# TYPE otelcol_processor_filter_datapoints_filtered counter
otelcol_processor_filter_datapoints_filtered{filter="filter/test",service_instance_id="a99d9078-548b-425f-8466-3e9e2e9bf3b1",service_name="otelcontribcol",service_version="0.91.0-dev"} 2
# HELP otelcol_processor_filter_logs_filtered Number of logs dropped by the filter processor
# TYPE otelcol_processor_filter_logs_filtered counter
otelcol_processor_filter_logs_filtered{filter="filter/test",service_instance_id="a99d9078-548b-425f-8466-3e9e2e9bf3b1",service_name="otelcontribcol",service_version="0.91.0-dev"} 2
# HELP otelcol_processor_filter_spans_filtered Number of spans dropped by the filter processor
# TYPE otelcol_processor_filter_spans_filtered counter
otelcol_processor_filter_spans_filtered{filter="filter/test",service_instance_id="a99d9078-548b-425f-8466-3e9e2e9bf3b1",service_name="otelcontribcol",service_version="0.91.0-dev"} 1
```

Documentation: I do not believe we document telemetry exposed by components, but I could add this if needed.

linux-foundation-easycla bot commented Nov 9, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@github-actions github-actions bot added the processor/filter Filter processor label Nov 9, 2023
@MacroPower MacroPower marked this pull request as ready for review November 19, 2023 05:46
@MacroPower MacroPower requested review from a team and dashpole November 19, 2023 05:46
@MacroPower (Contributor, Author) commented Nov 19, 2023

It took a while, but I think this is ready for review for real. Thanks for the feedback on my draft. Also, I wanted to say that Contribfest was really awesome! I had a ton of fun working with everyone, and I'm glad to finally work on the collector a bit, considering how frequently I use it day-to-day. 🙂

@MacroPower MacroPower changed the title Add telemetry for filterprocessor [processor/filterprocessor] Add telemetry for dropped metrics, logs, and spans Nov 19, 2023
@MacroPower MacroPower changed the title [processor/filterprocessor] Add telemetry for dropped metrics, logs, and spans [processor/filter] Add telemetry for dropped metrics, logs, and spans Nov 19, 2023
@TylerHelmuth (Member) commented:

@MacroPower can you share some details on the new metrics in action from your manual tests?


github-actions bot commented Dec 5, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Dec 5, 2023
@TylerHelmuth (Member) left a comment:


Thanks for sticking with this!

@MacroPower (Contributor, Author) commented:

> @MacroPower can you share some details on the new metrics in action from your manual tests?

@TylerHelmuth I edited my PR description to include some material that can be used for manual testing. It's probably not the best approach, and I imagine there's a better way, but hopefully it's useful regardless. 🙂

@github-actions github-actions bot removed the Stale label Dec 19, 2023
@TylerHelmuth (Member) left a comment:


@MacroPower thanks for the testing details; I was able to confirm locally as well. Take care of the conflicts and we'll be good to merge.

@TylerHelmuth TylerHelmuth added the ready to merge Code review completed; ready to merge by maintainers label Dec 19, 2023
@TylerHelmuth TylerHelmuth merged commit 763d426 into open-telemetry:main Dec 19, 2023
85 checks passed
@github-actions github-actions bot added this to the next release milestone Dec 19, 2023
cparkins pushed a commit to AmadeusITGroup/opentelemetry-collector-contrib that referenced this pull request Jan 10, 2024
@vainiusd commented Jan 11, 2024

Hi, I only noticed this in the v0.92.0 release notes, so sorry for the late feedback.
Also, thanks, @MacroPower, for adding this!

The new metric names stand out a bit compared to the others (signal here is a placeholder for the telemetry type):

otelcol_receiver_refused_signal
otelcol_processor_refused_signal
otelcol_processor_filter_signal_filtered
otelcol_processor_dropped_signal
otelcol_exporter_enqueue_failed_signal
otelcol_exporter_send_failed_signal

Maybe it would be worth renaming to otelcol_processor_filter_filtered_signal?

Labels: Contribfest · processor/filter (Filter processor) · ready to merge (Code review completed; ready to merge by maintainers)