Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for fetching consumer offsets stored in Kafka to monitor_unlisted_consumer_groups #3957

Merged
merged 5 commits into from
Oct 1, 2019
Merged

Add support for fetching consumer offsets stored in Kafka to monitor_unlisted_consumer_groups #3957

merged 5 commits into from
Oct 1, 2019

Conversation

jeffwidman
Copy link
Contributor

@jeffwidman jeffwidman commented Jun 20, 2019

Add support for fetching consumer offsets stored in Kafka to monitor_unlisted_consumer_groups

Major refactor of kafka_consumer check with a new code path that will exist alongside the legacy code path. The new code path:

  1. Eliminates legacy cruft:
    • drops support for fetching consumer offsets from zookeeper
    • drops support for brokers < 0.10.2 which do not support fetching all known offsets for a consumer group
    • migrates from the hand-rolled group-coordinator lookup code, retries, etc to relying on kafka-python's KafkaAdminClient to handle all that under the covers
  2. Adds support for monitor_unlisted_consumer_groups to fetch consumer offsets from Kafka
  3. Refactors from sync to async using callbacks for improved performance, especially with larger clusters and more consumer groups.

To clarify: Fetching consumer offsets from older brokers and from zookeeper is still supported, it just uses the legacy code path instead of this new code path.

Motivation

This has been under discussion in various forms for over two years, see #423 and #2730. The main blocker was coming up with a clean way to support legacy features like fetching from Zookeeper / older brokers while simultaneously migrating to kafka-python's new KafkaAdminClient which only supports brokers >= 0.10.0. After several rounds of discussion, the decision was to move the legacy code into a separate file and start from scratch with a new implementation of the check which would be the primary one for new features moving forward.

Status: Working

The remaining TODO's mentioned in the code are small refinements that I am unlikely to have time to do, but wanted to still note for later developers.

This depends on unreleased code within kafka-python, so can't be merged until kafka-python releases 1.4.7 (I'm on the maintainer team, so will help make that happen).

My personal TODO list:

@stale
Copy link

stale bot commented Jul 29, 2019

This issue has been automatically marked as stale because it has not had activity in the last 30 days. Note that the issue will not be automatically closed, but this notification will remind us to investigate why there's been inactivity. Thank you for participating in the Datadog open source community.

@stale stale bot added stale and removed stale labels Jul 29, 2019
@jeffwidman jeffwidman changed the title WIP Add support for fetching consumer offsets stored in Kafka to monitor_unlisted_consumer_groups Aug 5, 2019
ofek pushed a commit that referenced this pull request Aug 5, 2019
Agent v6 only has one instance per instances, so we don't need to have a
separate cache... just cache the client on the instance.

This also lets us be sure we are closing the client at the end of the
check to avoid stale connections on older kafka brokers.

This cleanup will make life simpler when the check is refactored as part
of #3957.
@jeffwidman jeffwidman requested a review from a team as a code owner August 15, 2019 18:58
Copy link
Contributor

@ruthnaebeck ruthnaebeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs review

This is a major refactor of `kafka_consumer` check with a new code path
that will exist alongside the legacy code path. The new code path:
1. Eliminates legacy cruft:
   * drops support for fetching consumer offsets from zookeeper
   * drops support for brokers < 0.10.2 which do not support fetching
     all known offsets for a consumer group
   * migrates from the hand-rolled group-coordinator lookup code,
     retries, etc to relying on  `kafka-python`'s `KafkaAdminClient` to
     handle all that under the covers
2. Adds support for `monitor_unlisted_consumer_groups` to fetch consumer
   offsets from Kafka
3. Refactors from sync to async using callbacks for improved
  performance, especially with larger clusters and more consumer groups.

To clarify: Fetching consumer offsets from older brokers and from
zookeeper is still supported, it just uses the legacy code path instead
of this new code path.

This has been under discussion in various forms for over two years, see
#423 and
#2730. The main blocker
was coming up with a clean way to support legacy features like fetching
from Zookeeper / older brokers while simultaneously migrating to
`kafka-python`'s new `KafkaAdminClient` which only supports
brokers >= `0.10.0`. After several rounds of discussion, the decision
was to move the legacy code into a separate file and start from scratch
with a new implementation of the check which would be the primary one
for new features moving forward.
6 similar comments
@codecov
Copy link

codecov bot commented Aug 23, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@e795593). Click here to learn what that means.
The diff coverage is 65.49%.

Impacted Files Coverage Δ
...er/datadog_checks/kafka_consumer/kafka_consumer.py 66.47% <65.49%> (ø)

ruthnaebeck
ruthnaebeck previously approved these changes Aug 23, 2019
Copy link
Contributor

@ruthnaebeck ruthnaebeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for docs

@ofek
Copy link
Contributor

ofek commented Sep 19, 2019

Blocked on dpkp/kafka-python#1903

@jeffwidman jeffwidman requested review from a team as code owners October 1, 2019 04:49
@ofek ofek mentioned this pull request Oct 1, 2019
6 tasks
Copy link
Contributor

@ofek ofek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇

Copy link
Contributor

@ruthnaebeck ruthnaebeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for docs

@ofek ofek merged commit dcf574d into DataDog:master Oct 1, 2019
@jeffwidman jeffwidman deleted the cleanup-kafka-consumer-offset-check-and-add-kafka-support-for-monitor_unlisted_consumer_groups branch October 3, 2019 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants