Emitting Kafka row lag #230

Merged
merged 3 commits into from
Dec 1, 2023

Conversation

Tang8330
Contributor

@Tang8330 Tang8330 commented Dec 1, 2023

Motivation

Today, we emit ingestion lag by diffing the timestamp in the Kafka or Pub/Sub message against the current time.

However, that measures ingestion age, not how long it will take to reach queue 0 (an empty backlog). The time to queue 0 is a function of:

  1. Rate of change of the row lag:
    • How many rows are we processing
    • How many rows are we adding
  2. How large is the backlog (row lag)
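
As a rough illustration of how those two inputs combine, here is a minimal sketch that divides the backlog by the net drain rate (rows processed minus rows added per second). The function name `TimeToQueueZero`, its parameters, and the error value are hypothetical and not taken from this PR; the Analytics Portal may compute this differently.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrNotDraining signals that rows are added at least as fast as they are
// processed, so the backlog never reaches zero. (Hypothetical name.)
var ErrNotDraining = errors.New("backlog is not draining")

// TimeToQueueZero estimates how long it takes for the backlog to empty.
// rowLag is the current backlog in rows; processedPerSec and addedPerSec
// are the observed rates in rows per second.
func TimeToQueueZero(rowLag int64, processedPerSec, addedPerSec float64) (time.Duration, error) {
	if rowLag <= 0 {
		// Nothing queued, so we are already at queue 0.
		return 0, nil
	}
	drainPerSec := processedPerSec - addedPerSec
	if drainPerSec <= 0 {
		return 0, ErrNotDraining
	}
	seconds := float64(rowLag) / drainPerSec
	return time.Duration(seconds * float64(time.Second)), nil
}

func main() {
	// 12,000 rows behind, processing 500 rows/s while 300 rows/s arrive.
	eta, err := TimeToQueueZero(12000, 500, 300)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(eta) // prints 1m0s
}
```

Returning an error when the drain rate is zero or negative keeps a growing backlog from being reported as a misleading finite estimate.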

Visually, it'll look something like this:
image

Today, the ingestion age only tells you how old the message you are processing is. With this PR, we will soon be able to display time to queue 0 in our Analytics Portal.

Changes

  1. We calculate row lag by diffing the current message offset against the Kafka partition's high watermark
  2. We send only 50% of this data to Datadog so that emitting the metric adds minimal latency to data processing time
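
The two changes above can be sketched roughly as follows. This is an illustration under assumptions, not the code merged in this PR: `RowLag` and `ShouldEmit` are hypothetical helpers, and the 50% sampling is modelled with a plain random draw rather than any particular Datadog client option. Note also that a Kafka partition's high watermark is the offset of the next message to be written, so depending on whether the current message counts as processed, an implementation may subtract one more.

```go
package main

import (
	"fmt"
	"math/rand"
)

// RowLag returns how many rows lie between the message currently being
// processed and the partition's high watermark. Negative results (for
// example, from a stale watermark reading) are clamped to zero.
func RowLag(highWatermark, offset int64) int64 {
	lag := highWatermark - offset
	if lag < 0 {
		return 0
	}
	return lag
}

// ShouldEmit decides whether to send a data point for the given sample rate.
// draw must return a value in [0, 1); injecting it keeps the function testable.
func ShouldEmit(sampleRate float64, draw func() float64) bool {
	return draw() < sampleRate
}

func main() {
	lag := RowLag(1500, 1200) // 300 rows behind the high watermark

	// Emit roughly half of the data points to keep the overhead low.
	if ShouldEmit(0.5, rand.Float64) {
		fmt.Println("row lag:", lag)
	}
}
```

Sampling trades metric resolution for lower per-message overhead; since row lag changes gradually relative to message throughput, dropping half of the points should still leave a usable trend.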

@Tang8330 Tang8330 changed the title Adding Row Lag Emitting Kafka row lag Dec 1, 2023
@Tang8330 Tang8330 requested a review from nathan-artie December 1, 2023 19:45
@Tang8330 Tang8330 merged commit 9397889 into master Dec 1, 2023
1 check passed
@Tang8330 Tang8330 deleted the row-lag branch December 1, 2023 20:07