Clarify specification of ConsumerID for Kafka Pub/Sub Component #3697

Closed
KrylixZA opened this issue Aug 27, 2023 · 4 comments · Fixed by #3935

@KrylixZA
Contributor

Describe the issue
Please could the Dapr documentation on the ConsumerID setting of the Kafka Pub/Sub component be clarified. According to the documentation, I can set ConsumerID to {podName}, {appId}, {uuid}, or {namespace}. However, setting this property makes no visible difference to the consumer identifier shown in Kafka. Judging by the Components Contrib Kafka metadata.yaml file, setting ConsumerID does not appear to be supported.

I am using Confluent Cloud, and all my pods simply show up as some form of {clientId}-{uuid}.

URL of the docs
https://docs.dapr.io/reference/components-reference/supported-pubsub/setup-apache-kafka/
Also clearly referenced in: https://docs.dapr.io/developing-applications/building-blocks/pubsub/howto-subscribe-statefulset/

Expected content
In my case, I am using Confluent Cloud Kafka, deploying StatefulSets into AKS, using the following configuration in the Pub/Sub component:

- name: consumerGroup
  value: "{appID}"
- name: consumerID
  value: "{podName}"
- name: clientID 
  value: "{appID}"

With the above configuration, I expect to see consumers reflect as:

my-dapr-app-0
my-dapr-app-1
my-dapr-app-2
...

But I simply see something as follows:

my-dapr-app-{uuid}
my-dapr-app-{uuid}
my-dapr-app-{uuid}
...

Given that the Kafka metadata.yaml appears not to reference ConsumerID, I think the docs should be clarified that this is not/no longer supported.

Screenshots
Unfortunately don't have any I can share publicly. But the above expectations should suffice.

Additional context
https://github.com/dapr/components-contrib/blob/master/pubsub/kafka/metadata.yaml

If this is supposed to be supported and actually a bug in the Kafka consumer, please LMK and I'll report the issue against the Components Contrib repo instead.

KrylixZA added the content/incorrect-information (Content in the docs is incorrect) label on Aug 27, 2023

@olitomlinson
Contributor

olitomlinson commented Aug 29, 2023

@KrylixZA

Given this code, the consumerId property is only used when the consumerGroup property is NOT set.
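
A minimal sketch of the fallback described above, assuming the effective Kafka group ID comes from consumerGroup when it is set and from consumerID otherwise. The function name `effective_consumer_group` and the plain-dict metadata are hypothetical and only illustrate the rule; this is not the Dapr source.

```python
def effective_consumer_group(metadata: dict) -> str:
    """Hypothetical helper: choose the Kafka group ID from component metadata.

    Assumption: consumerGroup wins whenever it is non-empty, and consumerID
    is only consulted as a fallback.
    """
    group = metadata.get("consumerGroup", "").strip()
    if group:
        return group
    return metadata.get("consumerID", "").strip()


# Both properties set: consumerID is ignored.
both = effective_consumer_group(
    {"consumerGroup": "my-dapr-app", "consumerID": "my-dapr-app-0"}
)

# Only consumerID set: it becomes the group ID.
only_id = effective_consumer_group({"consumerID": "my-dapr-app-0"})

print(both, only_id)
```

Under this assumption, a component that sets both consumerGroup and consumerID ends up in the group named by consumerGroup, and the consumerID value has no effect.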


My app is called workflow, and in my Kafka component I set the following:

image

which creates a consumer group called .workflow

image

and creates consumers in the group, as follows:

image

I do agree that the docs on the Kafka spec should state that consumerId is a legacy property and is only used when consumerGroup is not specified.

@KrylixZA
Contributor Author

Hey @olitomlinson

Thanks for this. Will give it a go.

Setting only the consumerID creates a fan out behaviour which also isn't well documented. The behaviour is only mentioned in the merge request that brought in the consumerID as a configurable property.

Thinking about it a little, being able to set consumerID regardless of any other configuration seems like something somebody may want to do. If for nothing else, it makes it easier to line up consumers in Kafka with the running app when it's set to '{podName}'. Although that's probably only an issue while getting an application running in production initially. Other than that, it's seldom going to be a problem.

@msfussell
Member

@KrylixZA - Interested in your conclusion here. Are you using ConsumerGroup now?

@KrylixZA
Contributor Author

KrylixZA commented Sep 13, 2023

Hey @msfussell, we stuck with what is configured above. As mentioned, the consumerId is still not being set but that is less relevant than I initially thought - unless you're using StatefulSets maybe 🤔

What drove me to open this issue in the first place was that I couldn't always get the number of consumers on a given Kafka topic to match the number of pods running in the deployment. Ultimately it turned out that, for services that consume from multiple topics, the Kafka consumer group is shared across all those topics, and there is no guarantee about how the consumers will be distributed across them.

The specifics of my scenario were:

  • 2 topics; one with 30 partitions, one with 10 partitions.
  • running 40 pods as replica sets (disabled HPA to test).
  • wasn't seeing 30 pods on the one topic and 10 on the other, but rather some random allocation to both, and often pods not consuming at all.
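
To make the scenario above concrete, here is a rough simulation of one way a shared consumer group can leave pods idle. It assumes a range-style assignor, where each topic's partitions are split independently over the sorted member list. The thread does not say which assignor was actually in use, so the strategy, the function `range_assign`, and the pod and topic names are all illustrative assumptions.

```python
def range_assign(consumers, topics):
    """Range-style assignment: split each topic independently over sorted consumers."""
    assignment = {consumer: [] for consumer in consumers}
    ordered = sorted(consumers)
    for topic, partition_count in topics.items():
        per_consumer, extra = divmod(partition_count, len(ordered))
        next_partition = 0
        for index, consumer in enumerate(ordered):
            # The first `extra` consumers receive one additional partition.
            count = per_consumer + (1 if index < extra else 0)
            for partition in range(next_partition, next_partition + count):
                assignment[consumer].append((topic, partition))
            next_partition += count
    return assignment


# 40 pods in one group, subscribed to a 30-partition and a 10-partition topic.
pods = [f"pod-{i:02d}" for i in range(40)]
assignment = range_assign(pods, {"topic-a": 30, "topic-b": 10})

idle = [pod for pod, parts in assignment.items() if not parts]
on_both = [pod for pod, parts in assignment.items() if len({t for t, _ in parts}) == 2]
print(len(idle), len(on_both))  # prints "10 10"
```

With this strategy the same low-numbered pods are picked first for every topic, so ten pods read from both topics while ten pods receive nothing, even though there are exactly as many pods as partitions.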

The solution to my problem was to configure a separate Dapr Pub/Sub component for each topic I was consuming from. I listed the problem in issue #3707.

Ultimately what I would like to see come out of this issue is the ability to still set the consumerID to {podName} regardless of whether clientID or consumerGroup is defined. That would probably involve changes to the runtime and also doc updates, so maybe this could become two issues?
It makes sense from an operations perspective to be able to tell which pods are which clients when there is high consumer lag, etc., and human intervention is needed.
