Metrics Dedup Processor

Status
Stability development: metrics
Distributions []
Issues Open issues Closed issues
Code Owners @nicolastakashi

Description

This processor deduplicates metrics by retaining only the metrics that come from the in-use replica. It identifies the replica that sent a metric by looking up the replica label in the attributes of the incoming metric. For details on how the in-use replica is chosen, see the Defining in use replica section.

```yaml
processors:
  metricsdedup:
    replica_label: replica
    swap_timeout: 5m
```

| Name | Description | Default |
| --- | --- | --- |
| `replica_label` | The label key that identifies the replica. | `replica` |
| `swap_timeout` | How long the in-use replica may go without sending metrics before the processor switches to a different replica. | `1m` |

Defining in use replica

The logic used to determine the in-use replica is inspired by a Cortex component, but it takes a simpler approach that avoids the need for distributed consensus on day zero.

Below is the sequence diagram that describes the logic used to determine the in-use replica:

```mermaid
sequenceDiagram
    participant Processor
    participant Receiver
    participant ReplicaInfo

    Receiver->>Processor: Send metric with replica
    Processor->>ReplicaInfo: Load current replica info
    alt No current replica
        Processor->>ReplicaInfo: Set new replica and update timestamp
    else Current replica exists
        alt Replica matches current replica
            Processor->>ReplicaInfo: Update timestamp
        else Replica different from current replica
            alt Timestamp older than swap timeout
                Processor->>ReplicaInfo: Set new replica and update timestamp
            else Timestamp within swap timeout
                Processor->>ReplicaInfo: Retain current replica
            end
        end
    end
    Processor->>Receiver: Return in-use replica
```

This approach ensures that only metrics from the in-use replica pass through the processor, which reduces the volume of metrics that downstream components have to process and store.