Issue with the processor/transform - leaking resources / attributes #34715

Closed
evilr00t opened this issue Aug 16, 2024 · 3 comments
Labels
discussion needed, processor/transform, waiting for author

evilr00t commented Aug 16, 2024

Component(s)

processor/transform

What happened?

Description

During the transform, some resource attributes are set improperly, using values from a completely different log message.

Steps to Reproduce

I use journald logging for Docker with image tagging:

--log-driver journald --log-opt tag={{.Name}}|{{.ImageName}}

OTEL config:

  transform/logs:
    error_mode: ignore
    log_statements:
    - context: log
      statements:
      - set(severity_number, SEVERITY_NUMBER_DEBUG) where Int(body["PRIORITY"]) ==
        7
      - set(severity_number, SEVERITY_NUMBER_INFO) where Int(body["PRIORITY"]) ==
        6
      - set(severity_number, SEVERITY_NUMBER_INFO2) where Int(body["PRIORITY"]) ==
        5
      - set(severity_number, SEVERITY_NUMBER_WARN) where Int(body["PRIORITY"]) ==
        4
      - set(severity_number, SEVERITY_NUMBER_ERROR) where Int(body["PRIORITY"]) ==
        3
      - set(severity_number, SEVERITY_NUMBER_FATAL) where Int(body["PRIORITY"]) <=
        2
      - set(attributes["priority"], body["PRIORITY"])
      - set(attributes["process.comm"], body["_COMM"])
      - set(attributes["process.exec"], body["_EXE"])
      - set(attributes["process.uid"], body["_UID"])
      - set(attributes["process.gid"], body["_GID"])
      - set(attributes["owner_uid"], body["_SYSTEMD_OWNER_UID"])
      - set(attributes["unit"], body["_SYSTEMD_UNIT"])
      - set(attributes["syslog_identifier"], body["SYSLOG_IDENTIFIER"])
      - set(attributes["syslog_identifier_prefix"], ConvertCase(body["SYSLOG_IDENTIFIER"],
        "lower")) where body["SYSLOG_IDENTIFIER"] != nil
      - replace_pattern(attributes["syslog_identifier_prefix"], "^[^a-zA-Z]*([a-zA-Z]{3,25}).*",
        "$$1") where body["SYSLOG_IDENTIFIER"] != nil
      - set(attributes["unit_prefix"], ConvertCase(body["_SYSTEMD_UNIT"], "lower"))
        where body["_SYSTEMD_UNIT"] != nil
      - replace_pattern(attributes["unit_prefix"], "^[^a-zA-Z]*([a-zA-Z]{3,25}).*",
        "$$1") where body["_SYSTEMD_UNIT"] != nil
      - set(attributes["job"], attributes["syslog_identifier_prefix"])
      - set(attributes["job"], attributes["unit_prefix"]) where attributes["job"]
        == nil and attributes["unit_prefix"] != nil
      - set(resource.attributes["aws_account"],"social360")
      - set(resource.attributes["service.name"], ConvertCase(body["SYSLOG_IDENTIFIER"],
        "lower")) where body["SYSLOG_IDENTIFIER"] != nil
      - replace_pattern(resource.attributes["service.name"], "^([^-]*-[^-]*).*", "$$1")
        where body["SYSLOG_IDENTIFIER"] != nil
      - set(resource.attributes["docker.image"], ConvertCase(body["SYSLOG_IDENTIFIER"],
        "lower")) where body["SYSLOG_IDENTIFIER"] != nil
      - replace_pattern(resource.attributes["docker.image"], ".*\\|(.*)$", "$$1")
        where body["SYSLOG_IDENTIFIER"] != nil
      - set(resource.attributes["container.name"], ConvertCase(body["SYSLOG_IDENTIFIER"],
        "lower")) where body["SYSLOG_IDENTIFIER"] != nil
      - replace_pattern(resource.attributes["container.name"], "^(.*)\\|.*", "$$1")
        where body["SYSLOG_IDENTIFIER"] != nil
      - set(body, body["MESSAGE"])

Expected Result

docker.image and container.name should be derived from SYSLOG_IDENTIFIER, but they have completely different values: replication is gone and api is used instead, which is a separate container and shouldn't be here.

Actual Result

[Screenshot 2024-08-16 at 10:46:09 AM]

It looks like some resources are leaked? This shouldn't happen...

Collector version

0.106.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

root@75:/etc/otelcol-contrib# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04 LTS
Release:        24.04
Codename:       noble

OpenTelemetry Collector configuration

---
receivers:
  docker_stats:
    container_labels_to_metric_labels:
      org.opencontainers.image.source: org.opencontainers.image.source
      com.docker.compose.project: service.name
    metrics:
      container.uptime:
        enabled: true
      container.restarts:
        enabled: true
  hostmetrics:
    scrapers:
      cpu:
        metrics:
          system.cpu.logical.count:
            enabled: true
          system.cpu.utilization:
            enabled: true
      memory:
        metrics:
          system.memory.utilization:
            enabled: true
          system.memory.limit:
            enabled: true
      filesystem:
        exclude_fs_types:
          fs_types:
          - squashfs
          - vfat
          match_type: strict
        metrics:
          system.filesystem.utilization:
            enabled: true
      network: {}
      load: {}
      disk: {}
      paging: {}
  journald:
    units:
    - ssh
    - systemd
    - docker
    - containerd
    priority: info
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  prometheus:
    config:
      scrape_configs:
      - job_name: otel-collector
        scrape_interval: 10s
        static_configs:
        - targets:
          - localhost:8888
service:
  pipelines:
    metrics:
      receivers:
      - docker_stats
      - hostmetrics
      - otlp
      - prometheus
      processors:
      - batch
      - resourcedetection
      exporters:
      - prometheusremotewrite/thanos
    logs:
      receivers:
      - journald
      processors:
      - batch
      - resourcedetection
      - transform/logs
      exporters:
      - otlphttp/loki
    traces:
      receivers:
      - otlp
      processors:
      - batch
      - resourcedetection
      exporters:
      - otlp
  telemetry:
    logs: {}
    metrics:
      level: basic
      address: ":8888"
  extensions:
  - basicauth/jaeger
  - basicauth/loki
  - basicauth/thanos
processors:
  batch: {}
  resourcedetection:
    detectors:
    - system
    - env
    system:
      resource_attributes:
        host.name:
          enabled: true
  transform/logs:
    error_mode: ignore
    log_statements:
    - context: log
      statements:
      - set(severity_number, SEVERITY_NUMBER_DEBUG) where Int(body["PRIORITY"]) ==
        7
      - set(severity_number, SEVERITY_NUMBER_INFO) where Int(body["PRIORITY"]) ==
        6
      - set(severity_number, SEVERITY_NUMBER_INFO2) where Int(body["PRIORITY"]) ==
        5
      - set(severity_number, SEVERITY_NUMBER_WARN) where Int(body["PRIORITY"]) ==
        4
      - set(severity_number, SEVERITY_NUMBER_ERROR) where Int(body["PRIORITY"]) ==
        3
      - set(severity_number, SEVERITY_NUMBER_FATAL) where Int(body["PRIORITY"]) <=
        2
      - set(attributes["priority"], body["PRIORITY"])
      - set(attributes["process.comm"], body["_COMM"])
      - set(attributes["process.exec"], body["_EXE"])
      - set(attributes["process.uid"], body["_UID"])
      - set(attributes["process.gid"], body["_GID"])
      - set(attributes["owner_uid"], body["_SYSTEMD_OWNER_UID"])
      - set(attributes["unit"], body["_SYSTEMD_UNIT"])
      - set(attributes["syslog_identifier"], body["SYSLOG_IDENTIFIER"])
      - set(attributes["syslog_identifier_prefix"], ConvertCase(body["SYSLOG_IDENTIFIER"],
        "lower")) where body["SYSLOG_IDENTIFIER"] != nil
      - replace_pattern(attributes["syslog_identifier_prefix"], "^[^a-zA-Z]*([a-zA-Z]{3,25}).*",
        "$$1") where body["SYSLOG_IDENTIFIER"] != nil
      - set(attributes["unit_prefix"], ConvertCase(body["_SYSTEMD_UNIT"], "lower"))
        where body["_SYSTEMD_UNIT"] != nil
      - replace_pattern(attributes["unit_prefix"], "^[^a-zA-Z]*([a-zA-Z]{3,25}).*",
        "$$1") where body["_SYSTEMD_UNIT"] != nil
      - set(attributes["job"], attributes["syslog_identifier_prefix"])
      - set(attributes["job"], attributes["unit_prefix"]) where attributes["job"]
        == nil and attributes["unit_prefix"] != nil
      - set(resource.attributes["aws_account"],"social360")
      - set(resource.attributes["service.name"], ConvertCase(body["SYSLOG_IDENTIFIER"],
        "lower")) where body["SYSLOG_IDENTIFIER"] != nil
      - replace_pattern(resource.attributes["service.name"], "^([^-]*-[^-]*).*", "$$1")
        where body["SYSLOG_IDENTIFIER"] != nil
      - set(resource.attributes["docker.image"], ConvertCase(body["SYSLOG_IDENTIFIER"],
        "lower")) where body["SYSLOG_IDENTIFIER"] != nil
      - replace_pattern(resource.attributes["docker.image"], ".*\\|(.*)$", "$$1")
        where body["SYSLOG_IDENTIFIER"] != nil
      - set(resource.attributes["container.name"], ConvertCase(body["SYSLOG_IDENTIFIER"],
        "lower")) where body["SYSLOG_IDENTIFIER"] != nil
      - replace_pattern(resource.attributes["container.name"], "^(.*)\\|.*", "$$1")
        where body["SYSLOG_IDENTIFIER"] != nil
      - set(body, body["MESSAGE"])
exporters:
  otlp:
    endpoint: FOOBAR
    headers:
      Content-Type: application/grpc
    auth:
      authenticator: basicauth/jaeger
  otlphttp/loki:
    endpoint: FOOBAR
    auth:
      authenticator: basicauth/loki
  prometheusremotewrite/thanos:
    endpoint: FOOBAR
    auth:
      authenticator: basicauth/thanos
    target_info:
      enabled: false
    add_metric_suffixes: false
    resource_to_telemetry_conversion:
      enabled: true
    external_labels:
      social360: 'true'
extensions:
  basicauth/jaeger:
    client_auth:
  basicauth/loki:
    client_auth:
  basicauth/thanos:
    client_auth:

Log output

No response

Additional context

No response

@evilr00t added the bug and needs triage labels Aug 16, 2024
@github-actions bot added the processor/transform label Aug 16, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@TylerHelmuth
Member

@evilr00t this is likely a duplicate of #32080. Can you try enabling the transform.flatten.logs feature gate and then setting flatten_data: true in the transformprocessor config?

Details here: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor#transformflattenlogs
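
For anyone landing here with the same symptom, a minimal sketch of the suggested change could look like the fragment below. It assumes that `flatten_data` sits at the top level of the transform processor config, next to `log_statements`, as the linked README section describes, and that the gate is enabled with the collector's `--feature-gates` flag. The binary name and config path in the comment are placeholders, and only one of the original statements is shown.

```yaml
# Start the collector with the gate enabled, for example:
#   otelcol-contrib --config <your-config> --feature-gates=transform.flatten.logs
processors:
  transform/logs:
    error_mode: ignore
    # Requires the transform.flatten.logs feature gate; defaults to false.
    flatten_data: true
    log_statements:
      - context: log
        statements:
          # Same statement as in the report; the remaining statements stay unchanged.
          - set(resource.attributes["container.name"], ConvertCase(body["SYSLOG_IDENTIFIER"], "lower")) where body["SYSLOG_IDENTIFIER"] != nil
```

Per the README, with `flatten_data: true` each log record gets its own copy of the resource and scope before the statements run, and records are regrouped by resource and scope afterwards, so a value written for one record should no longer show up on another.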

@TylerHelmuth added the waiting for author and discussion needed labels and removed the needs triage and bug labels Aug 16, 2024
@evilr00t
Author

evilr00t commented Aug 19, 2024

@TylerHelmuth Thank you for the info. I've enabled the feature gate and I'm checking the logs now; I'll let you know soon whether that helped.

EDIT: logs are consistent now, thank you once again @TylerHelmuth !

P.S. I was looking for similar issues but didn't know the problem was with set(), so I reported it against processor/transform 👍
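
To illustrate why `set()` on `resource.attributes` in the `log` context behaves this way, here is a rough sketch of how one group of records might look after processing without flattening. The field names are simplified, `<image>` is a placeholder, and treating `replication` and `api` as container names is an assumption based on the report above; this is not actual collector output.

```yaml
# Illustrative shape only: simplified field names, placeholder values.
resource_logs:
  - resource:
      attributes:
        # The log-context statements run once per record against this one
        # shared resource, so the last record with a non-nil SYSLOG_IDENTIFIER
        # determines the final value for every record in the group.
        container.name: api
    log_records:
      - body:
          SYSLOG_IDENTIFIER: "replication|<image>"   # processed first
      - body:
          SYSLOG_IDENTIFIER: "api|<image>"           # processed last, overwrites
```

Both records end up reported under `container.name: api`, which matches the mismatch described in the report.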

@evilr00t closed this as not planned Aug 19, 2024