void-ptr974 commented on PR #25946: URL: https://github.com/apache/pulsar/pull/25946#issuecomment-4638702486
Thanks for digging into this flaky. The tombstone handling and the added diagnostics are definitely moving this in the right direction. I found two issues that I think should be addressed before merging. The main concern is that the new reconciliation logic can still fail to converge in a mixed migration scenario: a fresh destination-side write that lands during the startup gap is preserved, but never copied back to the source, so the two views can remain divergent until timeout. The second issue is test cleanup coverage: if activation fails before the test enters the try/finally block, the dynamic config can remain enabled and poison following tests. I think the safer direction is to make reconciliation key/value based instead of size/timestamp based: compare key sets, copy source-only items to the destination, copy fresh destination-only items back to the source unless they are confirmed unchanged pre-clean stale entries, and avoid using cross-broker wall-clock timestamps to decide whether an item is fresh. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
