Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap consumer.poll() for KafkaConsumer iteration #1902

Merged
merged 3 commits into from
Sep 29, 2019

Conversation

dpkp
Copy link
Owner

@dpkp dpkp commented Sep 16, 2019

Now that heartbeats have been moved to a background thread, the pythonic iterator can just wrap consumer.poll() and no longer needs to worry about breaking from iteration to support sending background requests. With this approach, some consumers may need to tweak max_poll_records and/or max_poll_interval_ms, such that consumer's average per-record processing time * max_poll_records < max_poll_interval_ms. Defaults remain at 500 max_poll_records and 5mins for max_poll_interval. This should be sufficient for the vast majority of consumers.

My local testing suggests that this change fixes consumer performance issues reported in #1888 .

Given the reduction in overall complexity, this change may also improve #1672 (though won't completely fix).

Because this is a relatively significant change to internals, I have added a configuration option to allow reverting back to the <= 1.4.6 iterator logic:

consumer = KafkaConsumer(bootstrap_servers=[...], legacy_iterator=True, ...)

This change is Reviewable

@dpkp dpkp mentioned this pull request Sep 18, 2019
Copy link
Collaborator

@jeffwidman jeffwidman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Source code change looks relatively straightforward to me.

I question whether adding a feature flag is worth the trouble unless you truly think there's a significant chance it will need to be reverted. Folks can always revert to 1.4.6 if needed, although I suppose 1.4.7 will include a number of other changes so this makes it simpler to isolate the cause of problems.

Looks like Travis is failing, may be related to #1891 and https://github.com/dpkp/kafka-python/pull/1886/files#r316860737.

@dpkp
Copy link
Owner Author

dpkp commented Sep 28, 2019

I question whether adding a feature flag is worth the trouble unless you truly think there's a significant chance it will need to be reverted. Folks can always revert to 1.4.6 if needed, although I suppose 1.4.7 will include a number of other changes so this makes it simpler to isolate the cause of problems.

Ok, I'm convinced -- I updated to make the new poll()-based iterator the default. If there is a problem, users can revert to prior implementation by setting legacy_iterator=True in KafkaConsumer.

Looks like Travis is failing, may be related to #1891 and https://github.com/dpkp/kafka-python/pull/1886/files#r316860737.

I think this was related to deciding whether to update offsets in the fetcher (we need to do this for standard poll(), but not when wrapping with the iterator). I've pushed a fix and will see whether the next test run is clean.

Thanks for the review!

@dpkp dpkp force-pushed the consumer_iterator_with_poll branch from 2997a57 to 615c57b Compare September 28, 2019 23:33
@dpkp dpkp force-pushed the consumer_iterator_with_poll branch 2 times, most recently from 854202c to c444812 Compare September 29, 2019 00:45
@dpkp dpkp force-pushed the consumer_iterator_with_poll branch from c444812 to 7c5cdff Compare September 29, 2019 01:22
@dpkp dpkp merged commit 5d1d424 into master Sep 29, 2019
@dpkp dpkp deleted the consumer_iterator_with_poll branch September 29, 2019 02:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants