[Bug]: Lineage metrics in Dataflow streaming are being reported as cumulative rather than delta #34052

Closed
17 tasks
rohitsinha54 opened this issue Feb 21, 2025 · 1 comment

@rohitsinha54
Contributor

What happened?

Dataflow streaming metrics are delta metrics, unlike batch metrics, which are cumulative. This means that in every periodic update, Dataflow workers send only the change in each metric since the last report.
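To make the distinction concrete, here is a minimal sketch of the two reporting modes for a string-set style metric. The class names and the `report()` method are hypothetical and are not the Beam or Dataflow worker API; they only model the behavior described above.

```python
class CumulativeStringSet:
    """Hypothetical cumulative metric: every report carries the full set."""

    def __init__(self):
        self._values = set()

    def add(self, value):
        self._values.add(value)

    def report(self):
        # The whole set is sent each time, whether or not it changed.
        return set(self._values)


class DeltaStringSet:
    """Hypothetical delta metric: a report carries only values added since the last report."""

    def __init__(self):
        self._pending = set()

    def add(self, value):
        self._pending.add(value)

    def report(self):
        # Hand over the pending values and reset, so nothing is resent
        # and nothing stays in memory after it has been reported.
        delta, self._pending = self._pending, set()
        return delta


cumulative, delta = CumulativeStringSet(), DeltaStringSet()
for metric in (cumulative, delta):
    metric.add("bigquery:project.dataset.table")  # made-up lineage entry

first = (cumulative.report(), delta.report())
second = (cumulative.report(), delta.report())
```

With no new values between the two reports, the cumulative metric sends the same entry twice, while the delta metric sends it once and then sends an empty set.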

StringSet metrics (used for lineage tracking) are being reported as cumulative metrics in streaming, which causes the following issues:

  • Every periodic report (sent every 10 seconds) resends the full cumulative value, so every report includes the metric even when nothing has changed. Batch job reporting, by contrast, filters reports to include only the metrics that have changed (tracked by a dirty bit).
  • Because the metrics are never reset, they remain in worker memory forever, which increases memory usage.
  • In the backend, this leads to large memory consumption when tracking active work item counters.
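For comparison, the dirty-bit filtering mentioned for batch reporting can be sketched roughly as follows. This is an illustration under assumptions: the issue does not describe how the dirty bit is implemented, and `DirtyTrackedStringSet` and `report_if_changed` are made-up names.

```python
class DirtyTrackedStringSet:
    """Hypothetical cumulative metric that is reported only when it has changed."""

    def __init__(self):
        self._values = set()
        self._dirty = False

    def add(self, value):
        # Only a genuinely new value marks the metric as changed.
        if value not in self._values:
            self._values.add(value)
            self._dirty = True

    def report_if_changed(self):
        # Skip the report entirely when nothing changed since the last one.
        if not self._dirty:
            return None
        self._dirty = False
        return set(self._values)


metric = DirtyTrackedStringSet()
metric.add("topic-a")
r1 = metric.report_if_changed()  # full set, because the metric changed
r2 = metric.report_if_changed()  # None, nothing new since r1
metric.add("topic-a")            # duplicate value, does not set the dirty bit
r3 = metric.report_if_changed()  # None
metric.add("topic-b")
r4 = metric.report_if_changed()  # full cumulative set again
```

Without this kind of filter (or a reset after each report), a cumulative metric is resent on every reporting interval.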

Reporting them as cumulative also resets the counter's timestamp in the backend, because the counter is overwritten on every report. This is a problem because, when counters are polled in the backend to be dumped to the monitoring state store, this timestamp is used to determine whether a counter has changed. As a result, the counters are dumped more often than they should be.
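The timestamp effect can be illustrated with a toy model of the backend. Everything here is an assumption made for illustration (the `BackendCounterStore` class, its methods, and the integer clock); the real backend is not described in this issue beyond the behavior above.

```python
class BackendCounterStore:
    """Toy backend: a counter counts as changed if it was written after the last dump."""

    def __init__(self):
        self._counters = {}       # name -> (value, last_write_time)
        self._last_dump_time = 0

    def apply_report(self, name, value, now):
        # Every incoming report overwrites the counter and refreshes its timestamp,
        # even if the value is identical to the stored one.
        self._counters[name] = (value, now)

    def dump_changed(self, now):
        # Only the timestamp is consulted, not the value.
        changed = sorted(
            name
            for name, (_, written) in self._counters.items()
            if written > self._last_dump_time
        )
        self._last_dump_time = now
        return changed


store = BackendCounterStore()
store.apply_report("lineage", {"topic-a"}, now=10)
dump1 = store.dump_changed(now=15)

# Cumulative reporting: the same value arrives again on the next interval.
store.apply_report("lineage", {"topic-a"}, now=20)
dump2 = store.dump_changed(now=25)

# No report arrives in this interval, as would happen with delta reporting
# when nothing changed.
dump3 = store.dump_changed(now=35)
```

In this model the second dump still includes the counter although its value never changed, and the third dump is empty once the redundant report stops.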

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@rohitsinha54
Contributor Author

.take-issue
