[jira] [Updated] (KAFKA-18191) StreamJoined name is not used for processor names

Matthias J. Sax (Jira) Mon, 08 Sep 2025 17:46:42 -0700


     [ 
https://issues.apache.org/jira/browse/KAFKA-18191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Matthias J. Sax updated KAFKA-18191:
------------------------------------
    Description: 
The StreamJoined#as API allows you to set a name for a stream-stream join 
operator. The intention is to allow one to name the stores, and therefore 
changelogs, resulting in an upgradeable topology, but it is a bit strange that 
the name isn't also used for the processors themselves. (Based on KIP-

Of course, the stream-stream join is a bit of an edge case. For an operator
such as count or filter, the operator maps 1:1 to a processor node, so the
user can name the processor exactly. The stream-stream join operator, by
contrast, results in multiple processors, each of which needs a unique name.
However, we could at least use the specified StreamJoined name as the basis
for the resulting processor names, to avoid getting stuck with names like
"KSTREAM-JOINTHIS-0000000004" and "KSTREAM-WINDOWED-0000000003", which are
difficult to interpret and make it hard to read a topology.
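
To make the idea concrete, here is a minimal, library-free sketch of how
processor names could be derived from a base name. Everything in it is an
assumption for illustration: the class JoinProcessorNames, the Role enum, the
set of five processors, and the suffixes ("this-windowed", "merge", and so on)
are hypothetical and are not the naming scheme Kafka Streams uses or would
necessarily adopt. Only the generated-name style ("KSTREAM-<ROLE>-<10-digit
index>") is taken from the examples above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Kafka Streams.
final class JoinProcessorNames {

    // Processors assumed to make up one stream-stream join (an assumption).
    enum Role {
        THIS_WINDOWED("WINDOWED", "this-windowed"),
        OTHER_WINDOWED("WINDOWED", "other-windowed"),
        JOIN_THIS("JOINTHIS", "this-join"),
        JOIN_OTHER("JOINOTHER", "other-join"),
        MERGE("MERGE", "merge");

        final String generatedPrefix; // used when no base name is given
        final String namedSuffix;     // hypothetical suffix appended to the base name

        Role(String generatedPrefix, String namedSuffix) {
            this.generatedPrefix = generatedPrefix;
            this.namedSuffix = namedSuffix;
        }
    }

    private JoinProcessorNames() {}

    // Returns one name per role. With a base name, each processor gets
    // "<baseName>-<suffix>"; without one, it falls back to an index-based
    // name in the "KSTREAM-<ROLE>-0000000004" style.
    public static List<String> namesFor(String baseName, int firstIndex) {
        List<String> names = new ArrayList<>();
        int index = firstIndex;
        for (Role role : Role.values()) {
            if (baseName == null || baseName.isEmpty()) {
                names.add(String.format(Locale.ROOT, "KSTREAM-%s-%010d",
                        role.generatedPrefix, index));
            } else {
                names.add(baseName + "-" + role.namedSuffix);
            }
            index++;
        }
        return names;
    }

    public static void main(String[] args) {
        namesFor(null, 2).forEach(System.out::println);
        namesFor("orders-join", 2).forEach(System.out::println);
    }
}
```

Under these assumptions, the unnamed case yields index-based names such as
"KSTREAM-JOINTHIS-0000000004", while passing "orders-join" yields names such as
"orders-join-this-join" and "orders-join-merge" that stay the same when other
operators are added to the topology.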

Note that there is some existing precedent for this: with cogroups, the
individual processors inherit the base name of the cogroup's aggregate
operator. For example, this code

 
{code:java}
grouped1
    .cogroup((k, v, a) -> a + v) // wrapped 1
    .cogroup(grouped2, (k, v, a) -> a + v) // wrapped 2
    .aggregate(() -> "", Named.as("myName"), Materialized.as("store"))
{code}
 
produces processors with these names: "myName-cogroup-agg-0",
"myName-cogroup-agg-1", and "myName-cogroup-merge".

  was:
The StreamJoined#as API allows you to set a name for a stream-stream join 
operator. The intention is to allow one to name the stores, and therefore 
changelogs, resulting in an upgradeable topology, but it is a bit strange that 
the name isn't also used for the processors themselves.

Of course, the stream-stream join is a bit of an edge case compared to, say, a 
count or filter operator, where the operator is a 1:1 mapping to the processor 
node and the user can name the processor exactly, because the stream-stream 
join operator actually results in multiple processors which each need a unique 
name. However, we could at least use the specified StreamJoined name as the 
basis for the resulting processor names, to avoid getting stuck with names like 
"KSTREAM-JOINTHIS-0000000004" and "KSTREAM-WINDOWED-0000000003" which are 
difficult to interpret and make it hard to read a topology

Note that there is some existing precedent for this: for example with cogroups, 
the individual processors inherit the base name of the cogroup's aggregate 
operator name.  For example this code

 
{code:java}
grouped1
.cogroup((k, v, a) -> a + v) // wrapped 1
.cogroup(grouped2, (k, v, a) -> a + v) // wrapped 2
.aggregate(() -> "", Named.as("myName"), Materialized.as("store")) {code}
 
produces processors with these names: "myName-cogroup-agg-0", 
"myName-cogroup-agg-1", "myName-cogroup-merge"


> StreamJoined name is not used for processor names
> -------------------------------------------------
>
>                 Key: KAFKA-18191
>                 URL: https://issues.apache.org/jira/browse/KAFKA-18191
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Assignee: Nikita Shupletsov
>            Priority: Minor
>              Labels: needs-kip



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
