[jira] [Commented] (KAFKA-15841) Add Support for Topic-Level Partitioning in Kafka Connect

Greg Harris (Jira) Wed, 14 Feb 2024 12:35:04 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-15841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817522#comment-17817522
 ]


Greg Harris commented on KAFKA-15841:
-------------------------------------

Hi [~henriquemota]!

Can you share more details about your setup? Do you have a single JDBC 
connector, or multiple in the same cluster? Are connectors configured to read 
from one topic, or multiple? What is your tasks.max configuration, and what 
consumer assignor are you using?

A consumer group distributes the topic-partitions of all subscribed topics 
across its member consumers, and each sink task is a consumer in its 
connector's consumer group. Connect in turn distributes tasks across the 
workers in the cluster. Together, these let you map multiple topic-partitions 
of work onto multiple workers.

For example, say you have N topic-partitions and M Connect workers, with N > M. 
Any one of the following could be used to distribute the work:
1. You could run one connector per topic, configure each connector with 
tasks.max=1, and let the Connect cluster distribute the N tasks across the M 
workers.
2. You could add all N topics to a single connector with tasks.max=M, and have 
the consumer group distribute the N topic-partitions among those M tasks.
3. You could manually group the N topics into M groups, and create M connectors 
with tasks.max=1, giving each connector one group of topics.
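
As a rough sketch of option 3, the grouping could be scripted rather than done 
by hand. The Python below is only an illustration: the connector names 
(jdbc-sink-group-N), the tenant topic names, and the helper functions are 
hypothetical, and the connector class shown is a placeholder for whichever 
sink you actually run. Only the "topics" and "tasks.max" keys come from the 
discussion above.

```python
import json


def group_topics(topics, num_groups):
    """Split topics round-robin into at most num_groups non-empty groups."""
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    groups = [[] for _ in range(num_groups)]
    for i, topic in enumerate(topics):
        groups[i % num_groups].append(topic)
    # Drop empty groups, which occur when there are fewer topics than groups.
    return [g for g in groups if g]


def connector_configs(topics, num_workers, base_config):
    """Build one single-task connector config per group of topics.

    Returns a dict mapping a hypothetical connector name to its config.
    """
    configs = {}
    for idx, group in enumerate(group_topics(topics, num_workers)):
        cfg = dict(base_config)           # copy so the base stays untouched
        cfg["topics"] = ",".join(group)   # comma-separated topic list
        cfg["tasks.max"] = "1"            # one task per connector (option 3)
        configs[f"jdbc-sink-group-{idx}"] = cfg
    return configs


if __name__ == "__main__":
    # Assumed example values: five single-partition tenant topics, two workers.
    tenant_topics = ["tenant-a", "tenant-b", "tenant-c", "tenant-d", "tenant-e"]
    base = {
        # Placeholder: substitute the class of your own sink connector.
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    }
    print(json.dumps(connector_configs(tenant_topics, 2, base), indent=2))
```

Each resulting config would then be submitted as its own connector. Passing 
the number of topics as num_workers yields one connector per topic, which is 
option 1.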

> Add Support for Topic-Level Partitioning in Kafka Connect
> ---------------------------------------------------------
>
>                 Key: KAFKA-15841
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15841
>             Project: Kafka
>          Issue Type: Improvement
>          Components: connect
>            Reporter: Henrique Mota
>            Priority: Trivial
>
> In our organization, we utilize JDBC sink connectors to consume data from 
> various topics, where each topic is dedicated to a specific tenant with a 
> single partition. Recently, we developed a custom sink based on the standard 
> JDBC sink, enabling us to pause consumption of a topic when encountering 
> problematic records.
> However, we face limitations within Kafka Connect, as it doesn't allow for 
> appropriate partitioning of topics among workers. We attempted a workaround 
> by breaking down the topics list within the 'topics' parameter. 
> Unfortunately, Kafka Connect overrides this parameter after invoking the 
> {{taskConfigs(int maxTasks)}} method from the 
> {{org.apache.kafka.connect.connector.Connector}} class.
> We request the addition of support in Kafka Connect to enable the 
> partitioning of topics among workers without requiring a fork. This 
> enhancement would facilitate better load distribution and allow for more 
> flexible configurations, particularly in scenarios where topics are dedicated 
> to different tenants.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
