GitHub user Denovo1998 added a comment to the discussion: Async Geo-Replication 
and Cluster Down Scenarios

`replicationBacklog` is not a separately configurable buffer size in Pulsar. It 
is effectively the backlog of the replicator cursor, i.e. the number of 
source-topic entries that the replicator has not yet acknowledged as 
successfully published to the remote cluster.

The `QueueSize` setting you found only controls the internal replication 
producer pending queue and how aggressively the replicator reads from the 
ledger. It does not cap the durable backlog stored on disk. So in your A-down / 
B-still-serving scenario, yes: the B->A replication backlog can keep growing 
even if the local consumer backlog is close to zero.

That happens because local consumer acknowledgements only advance subscription 
cursors. The replicator cursor advances only after the message is successfully 
published to the remote cluster. Until that happens, the entries remain 
retained on cluster B. This is by design, because otherwise Pulsar would be 
silently discarding data that has not yet been replicated.
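
To make the cursor behaviour concrete, here is a toy model in Python. It is not Pulsar code: `ToyTopic`, `simulate_outage`, and the cursor names `local-sub` and `replicator-to-A` are all hypothetical, and each cursor is reduced to a simple count of acknowledged entries. It only illustrates why the two backlogs diverge while the remote cluster is down.

```python
class ToyTopic:
    """Toy model (not Pulsar internals): a cursor is the count of entries it has acked."""

    def __init__(self, cursor_names):
        self.published = 0
        self.cursors = {name: 0 for name in cursor_names}

    def publish(self, n=1):
        self.published += n

    def ack_up_to_latest(self, cursor_name):
        # Advance one cursor to the newest entry; other cursors are untouched.
        self.cursors[cursor_name] = self.published

    def backlog(self, cursor_name):
        return self.published - self.cursors[cursor_name]

    def retained_entries(self):
        # Entries must be kept until the slowest cursor has moved past them.
        return self.published - min(self.cursors.values())


def simulate_outage(messages, remote_up=False):
    """Publish on cluster B; the local consumer always acks, the replicator only if A is up."""
    topic = ToyTopic(["local-sub", "replicator-to-A"])
    for _ in range(messages):
        topic.publish()
        topic.ack_up_to_latest("local-sub")
        if remote_up:
            topic.ack_up_to_latest("replicator-to-A")
    return topic
```

With `simulate_outage(1000)`, `backlog("local-sub")` is 0 while `backlog("replicator-to-A")` and `retained_entries()` are both 1000, which mirrors the A-down / B-still-serving case.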

---

For TTL, the answer is also yes, with one important nuance: message expiry 
applies to replicator cursors too, so expired messages can be removed from the 
replication backlog and therefore will not be replicated once cluster A comes 
back. However, this is not guaranteed to happen exactly at T+5 seconds. TTL 
means the message becomes eligible for expiry after 5 seconds, and the actual 
removal depends on the broker’s periodic expiry check.
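
As a rough illustration of the gap between "eligible" and "actually removed", the sketch below computes the first periodic check at which a message could be expired. The function name `first_expiry_check`, the parameter `check_interval_s`, and the assumption that checks fire at exact multiples of the interval starting from time 0 are mine, purely for illustration; the real broker schedule will not line up this neatly.

```python
import math


def first_expiry_check(publish_time_s, ttl_s, check_interval_s):
    """Return the time of the first periodic check that would see the message as expired.

    Assumption (illustrative only): checks run at 0, check_interval_s,
    2 * check_interval_s, ... and a message is expired once its age >= ttl_s.
    """
    eligible_at = publish_time_s + ttl_s
    # Round the eligibility time up to the next scheduled check.
    ticks = math.ceil(eligible_at / check_interval_s)
    return ticks * check_interval_s
```

For example, with a 5-second TTL and a hypothetical 300-second check interval, a message published at t=0 is eligible at t=5 but is not removed until the check at t=300, and a message published at t=298 waits until t=600.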

So there is no dedicated `maxReplicationBacklog` knob. If you want to bound 
this risk during a remote-cluster outage, the practical controls are backlog 
quota, TTL, or an explicit operational decision to clear the replicator backlog 
and accept data loss for the remote cluster.


GitHub link: 
https://github.com/apache/pulsar/discussions/25519#discussioncomment-16655252
