[
https://issues.apache.org/jira/browse/KAFKA-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744726#comment-17744726
]
Matthias J. Sax commented on KAFKA-15190:
-----------------------------------------
{quote}but although {{StreamsPartitionAssignor}} sometimes calls it a client ID
and sometimes a process ID it's a {{UUID}} so I assume it really is the process
ID.
{quote}
Thanks for calling this out. You are right; I missed this point.
Since you mentioned "max recovery lag", I assume you have a stateful app that
uses in-memory stores only?
Another thing that comes to mind: the `client.id` actually has a different
purpose and should not be unique per `KafkaStreams` instance, but should be the
_same_ for all instances (the name is a little misleading). For example, if you
configure quotas, they are based on `client.id`, and you usually want quotas to
be set per application, not per instance.
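A minimal sketch of that split, using plain `java.util.Properties` and the
config keys by name. The application name `orders-app`, the helper
`buildConfig`, and the choice to reuse the application name as `client.id` are
assumptions for illustration only; `bootstrap.servers` and the other required
settings are left out.

```java
import java.util.Properties;

public class StreamsClientIdSketch {
    // Hypothetical helper: builds the identity-related part of a Streams config.
    static Properties buildConfig(String applicationId, String instanceId) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        // Same value on every instance, so per-client.id quotas cover the whole app.
        // Reusing the application name here is an assumption, not a requirement.
        props.setProperty("client.id", applicationId);
        // Unique and stable per instance, e.g. a StatefulSet pod name.
        props.setProperty("group.instance.id", instanceId);
        return props;
    }

    public static void main(String[] args) {
        Properties a = buildConfig("orders-app", "orders-app-0");
        Properties b = buildConfig("orders-app", "orders-app-1");
        // client.id is shared across instances; group.instance.id is not.
        System.out.println(a.getProperty("client.id").equals(b.getProperty("client.id")));
        System.out.println(a.getProperty("group.instance.id").equals(b.getProperty("group.instance.id")));
    }
}
```

With this layout, only `group.instance.id` differs between instances, while
`client.id` stays the same across the application.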
> Allow configuring a streams process ID
> --------------------------------------
>
> Key: KAFKA-15190
> URL: https://issues.apache.org/jira/browse/KAFKA-15190
> Project: Kafka
> Issue Type: Wish
> Components: streams
> Reporter: Joe Wreschnig
> Priority: Major
> Labels: needs-kip
>
> We run our Kafka Streams applications in containers with no persistent
> storage, and therefore the mitigation of persisting the process ID in the
> state directory in KAFKA-10716 does not help us avoid shuffling lots of tasks
> during restarts.
> However, we do have a persistent container ID (from a Kubernetes
> StatefulSet). Would it be possible to expose a configuration option to let us
> set the streams process ID ourselves?
> We are already using this ID as our group.instance.id - would it make sense
> to have the process ID be automatically derived from this (plus
> application/client IDs) if it's set? The two IDs seem to have overlapping
> goals of identifying "this consumer" across restarts.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)