New component: Metrics DeDuplicator processor #20921

Closed
2 tasks
nicolastakashi opened this issue Apr 17, 2023 · 10 comments

@nicolastakashi
Contributor

The purpose and use-cases of the new component

Engineers may want to set up a highly available environment for collecting metrics: pairs of data sources scrape the same set of targets and send the samples to the OpenTelemetry Collector, which enriches the data and exports it to the metrics backend (e.g., Thanos).

Although many metrics backends support deduplication at query time, that approach requires customers to store the duplicate samples, which consumes more resources such as storage and may increase their bill when using a vendor backend (depending on the vendor's pricing strategy). With that in mind, it would be great if the OpenTelemetry Collector provided a deduplication processor.

This processor could be modeled on an existing implementation such as Thanos, Grafana Mimir, or Cortex, or on any other suitable implementation.

Example configuration for the component

metricsdeduplicator:
  replica_label: 
    - prometheus_replica
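
To make the intent of `replica_label` concrete, here is a minimal sketch of one way such a processor could behave: it keeps samples from a single elected replica, drops samples from the others, and fails over when the elected replica goes quiet. This is an illustration under assumptions, not the proposed implementation. The class name `ReplicaDeduplicator`, the `failover_timeout` setting, and its 30-second value are all hypothetical and do not appear in the proposal.

```python
from typing import Dict, Optional


class ReplicaDeduplicator:
    """Hypothetical sketch: accept samples from one elected replica only."""

    def __init__(self, replica_label: str = "prometheus_replica",
                 failover_timeout: float = 30.0) -> None:
        self.replica_label = replica_label
        # Assumed setting: seconds of silence before another replica takes over.
        self.failover_timeout = failover_timeout
        self._elected: Optional[str] = None
        self._last_seen = 0.0

    def accept(self, labels: Dict[str, str], now: float) -> Optional[Dict[str, str]]:
        """Return the labels without the replica label, or None to drop the sample."""
        replica = labels.get(self.replica_label)
        if replica is None:
            # Not part of a replicated pair: pass through unchanged.
            return dict(labels)
        silent_for = now - self._last_seen
        if self._elected is None or (
            replica != self._elected and silent_for > self.failover_timeout
        ):
            # First replica seen, or the elected one has been silent too long.
            self._elected = replica
        if replica != self._elected:
            return None
        self._last_seen = now
        # Strip the replica label so both replicas map to the same series.
        return {k: v for k, v in labels.items() if k != self.replica_label}
```

With two replicas `a` and `b` reporting the same series, only the samples from `a` are kept until `a` has been silent for longer than the timeout, at which point `b` takes over. State is kept in memory, so this sketch only works when a single collector instance sees both replicas.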

Telemetry data types supported

Metrics

Is this a vendor-specific component?

  • This is a vendor-specific component
  • If this is a vendor-specific component, I am proposing to contribute this as a representative of the vendor.

Sponsor (optional)

No response

Additional context

No response

@nicolastakashi added the "needs triage" label on Apr 17, 2023
@nicolastakashi
Contributor Author

Issue #17874 asks for a processor to deduplicate traces; maybe we can join efforts, I'm not sure.

@atoulme added the "Sponsor Needed" label and removed the "needs triage" label on Apr 17, 2023
@atoulme
Contributor

atoulme commented Apr 17, 2023

Would you please expand on the use case? I am not sure you can safely scrape a Prometheus metrics endpoint in parallel, since it is stateful. The approach I have heard folks tend to take is to use a leader election mechanism to pick a specific collector to perform collection, with a failover.

@nicolastakashi
Copy link
Contributor Author

Indeed, most current implementations rely on a single instance to deduplicate metrics. That could be the initial state: ensuring this component does not run in multiple replicas.

Looking at future implementations, we could use a strategy similar to the existing one that load-balances traces based on the trace ID, and discuss how to load-balance metrics between collectors.

This processor is best suited to central collectors that receive metrics from other agents, where it is easier to ensure that only one replica is running or to add a future load-balancing implementation.
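
As a rough illustration of the load-balancing idea above, the sketch below routes each series to a collector by hashing its labels with the replica label left out, so that both replicas' copies of a series reach the same deduplicating collector. It is only a sketch under assumptions: the function names `series_key` and `pick_collector` and the collector names are hypothetical, and plain modulo hashing stands in for whatever routing a real implementation would use.

```python
import hashlib
from typing import Dict, List


def series_key(labels: Dict[str, str],
               replica_label: str = "prometheus_replica") -> str:
    """Build a stable series identity that ignores the replica label."""
    items = sorted((k, v) for k, v in labels.items() if k != replica_label)
    return "\x00".join(f"{k}={v}" for k, v in items)


def pick_collector(labels: Dict[str, str], collectors: List[str],
                   replica_label: str = "prometheus_replica") -> str:
    """Choose a collector for a series using a hash of its identity."""
    key = series_key(labels, replica_label).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer bucket selector.
    index = int.from_bytes(digest[:8], "big") % len(collectors)
    return collectors[index]


collectors = ["collector-0", "collector-1", "collector-2"]  # hypothetical names
from_a = {"__name__": "up", "job": "api", "prometheus_replica": "a"}
from_b = {"__name__": "up", "job": "api", "prometheus_replica": "b"}
# Both replicas' copies of the series are routed to the same collector.
same_target = pick_collector(from_a, collectors) == pick_collector(from_b, collectors)
```

Modulo hashing reassigns most series whenever the set of collectors changes; a consistent-hash ring would limit that movement, at the cost of more code.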

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions (bot) added the "Stale" label on Jun 19, 2023
@nicolastakashi
Contributor Author

I still think it's relevant

@atoulme removed the "Stale" label on Jun 27, 2023
@github-actions
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions (bot) closed this as not planned on Oct 27, 2023
@nicolastakashi
Contributor Author

nicolastakashi commented Jul 15, 2024

Bringing this subject back!
I've been testing an implementation available in this repo: https://github.com/nicolastakashi/metricsdedupprocessor

I'd like to get feedback on it. Can we reopen the issue, @atoulme?

@mx-psi removed the "Stale" label on Jul 17, 2024
@github-actions (bot) closed this as not planned on Nov 15, 2024
3 participants