[
https://issues.apache.org/jira/browse/KAFKA-16622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842117#comment-17842117
]
Greg Harris commented on KAFKA-16622:
-------------------------------------
Yeah [~ecomar], from the final state of the OffsetSyncStore, this appears to be
working as intended:
{noformat}
[2024-04-26 10:58:44,557] TRACE [MirrorCheckpointConnector|task-0] New sync
OffsetSync{topicPartition=mytopic-0, upstreamOffset=19998,
downstreamOffset=19998} applied, new state is
[19998:19998,19987:19987,19965:19965,19921:19921,19822:19822,19635:19635,19415:19415,18964:18964,18095:18095,16500:16500,9999:9999]
(org.apache.kafka.connect.mirror.OffsetSyncStore:176){noformat}
The gaps are 11, 22, 44, 99, 187, 220, 451, 869, 1595, which follow the
approximately exponential progression I would expect. Instead of the ~5 syncs I
expected there are 9, which is better than I estimated because you have
offset.lag.max set low.
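For reference, the exponential spacing can be read straight off the logged store
state. This is just arithmetic on the TRACE line above (the store holds
identical upstream and downstream offsets in this run, so only one number per
sync is needed), not the actual OffsetSyncStore implementation:

```python
# Final OffsetSyncStore state from the TRACE log above (upstream offsets only,
# since upstream == downstream in this identity-mirroring run).
syncs = [19998, 19987, 19965, 19921, 19822, 19635, 19415, 18964, 18095, 16500, 9999]

# Gap between each adjacent pair of retained syncs, newest to oldest.
gaps = [newer - older for newer, older in zip(syncs, syncs[1:])]
print(gaps)  # [11, 22, 44, 99, 187, 220, 451, 869, 1595, 6501]

# Spacing never shrinks as syncs get older, i.e. the store keeps dense syncs
# near the tip of the topic and progressively sparser syncs further back.
assert all(g2 >= g1 for g1, g2 in zip(gaps, gaps[1:]))
```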
I would say the title of this issue isn't quite accurate now that we've
investigated it, as the translation can happen at these intermediate points in
addition to the end of the topic. If you had a consumer group with offset 19635
or 19636, that would be translated exactly, but a consumer group with offset
19700 would translate to 19636 and have some lag/reprocessing. This is
intentional: we made a trade-off between memory usage and precision in order to
prioritize accuracy in the offset translation algorithm. You can see the
discussion about this here:
[https://lists.apache.org/thread/7qzxm1727y8rtrw6ds7t6hltkm55j5po] and more
motivation for the current algorithm here:
[https://github.com/apache/kafka/pull/13178].
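To illustrate the round-down behavior described above, here is a simplified
sketch of the translation step: pick the nearest recorded sync at or below the
consumer's upstream offset. This is only a toy model of the idea behind
OffsetSyncStore's translation, not the real code, and it omits edge cases (no
usable sync, topic gaps, renegotiated stores, etc.):

```python
# (upstream, downstream) pairs from the final store state above; identity
# mirroring, so the two sides happen to be equal in this run.
SYNCS = [(9999, 9999), (16500, 16500), (18095, 18095), (18964, 18964),
         (19415, 19415), (19635, 19635), (19822, 19822), (19921, 19921),
         (19965, 19965), (19987, 19987), (19998, 19998)]

def translate(upstream_offset):
    # Largest sync whose upstream offset does not exceed the consumer offset.
    up, down = max((s for s in SYNCS if s[0] <= upstream_offset),
                   key=lambda s: s[0])
    # If the consumer is strictly ahead of the sync point, advance one past the
    # downstream sync offset; otherwise the translation is exact.
    return down + 1 if upstream_offset > up else down

print(translate(19635))  # 19635 -- exact, lands on a sync
print(translate(19636))  # 19636 -- exact, one past a sync
print(translate(19700))  # 19636 -- rounded down, some lag/reprocessing
```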
I understand your concern though, and you're correct that KAFKA-15905 will help
for the offsets between 0 and 9999, and KAFKA-16364 will help with offsets
close to the end of the topic.
I also just opened this ticket:
https://issues.apache.org/jira/browse/KAFKA-16641 for another improvement I
thought of. It has a risk of mis-translating offsets for topics with gaps, but
should be better than the old pre-KAFKA-12468 algorithm, so we can discuss
whether it requires a configuration, and maybe it can be included in a KIP with
KAFKA-16364.
> MirrorMaker2 first Checkpoint not emitted until consumer group fully catches
> up once
> -----------------------------------------------------------------------------------
>
> Key: KAFKA-16622
> URL: https://issues.apache.org/jira/browse/KAFKA-16622
> Project: Kafka
> Issue Type: Bug
> Components: mirrormaker
> Affects Versions: 3.7.0, 3.6.2, 3.8.0
> Reporter: Edoardo Comar
> Priority: Major
> Attachments: connect.log.2024-04-26-10.zip,
> edo-connect-mirror-maker-sourcetarget.properties
>
>
> We observed an excessively delayed emission of the MM2 Checkpoint record.
> It only gets created when the source consumer reaches the end of a topic.
> This does not seem reasonable.
> In a very simple setup:
> Tested with a standalone single-process MirrorMaker2 mirroring between two
> single-node Kafka clusters (MirrorMaker config attached) with quick refresh
> intervals (e.g. 5 sec) and a small offset.lag.max (e.g. 10):
> create a single topic in the source cluster
> produce data to it (e.g. 10000 records)
> start a slow consumer - e.g. fetching 50 records/poll, pausing 1 sec
> between polls, and committing after each poll
> watch the Checkpoint topic in the target cluster
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9192 \
> --topic source.checkpoints.internal \
> --formatter org.apache.kafka.connect.mirror.formatters.CheckpointFormatter \
> --from-beginning
> -> no record appears in the checkpoint topic until the consumer reaches the
> end of the topic (i.e. its consumer group lag gets down to 0).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)