Apache9 commented on code in PR #7528:
URL: https://github.com/apache/hbase/pull/7528#discussion_r2616198948
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##########
@@ -216,7 +216,7 @@ public int getTimeout() {
* the context are assumed to be persisted in the target cluster.
Review Comment:
I would prefer that we control it with a time/size limit.
Even if the endpoint can persist the data after every shipment, we do not
need to record the offset every time, right? We just need to make sure that once
the `ReplicationSourceShipper` wants to record the offset, all the data before
this offset has been persisted. So we can introduce a
`beforePersistingReplicationOffset` method for the replication endpoint. If you
persist the data after every shipment, the method does nothing. If it is an
S3-based endpoint, it closes the output file to persist the data.
In this way, the `ReplicationSourceShipper` does not need to know whether the
endpoint can persist the data after every shipment. And in the future,
for `HBaseInterClusterReplicationEndpoint`, we could also introduce some
asynchronous mechanism to improve performance.
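
A rough sketch of what I mean, using simplified stand-in types rather than the
real HBase classes. Everything here except the name
`beforePersistingReplicationOffset` is hypothetical: the `Endpoint` interface,
the two endpoint classes, the `Shipper` class, the entry-count limit (standing
in for a size limit), and the injected clock are assumptions made only for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

public class OffsetHookSketch {

  /** Hypothetical, simplified stand-in for a replication endpoint. */
  interface Endpoint {
    void replicate(List<String> entries);

    /** Called before the offset is recorded; must make all earlier data durable. */
    void beforePersistingReplicationOffset();
  }

  /** Persists on every shipment, so the hook has nothing to do. */
  static class SyncEndpoint implements Endpoint {
    final List<String> durable = new ArrayList<>();

    public void replicate(List<String> entries) { durable.addAll(entries); }

    public void beforePersistingReplicationOffset() { /* nothing to do */ }
  }

  /** Buffers into an "open file"; data only becomes durable when the file is closed. */
  static class BufferedFileEndpoint implements Endpoint {
    final List<String> openFile = new ArrayList<>();
    final List<String> durable = new ArrayList<>();
    int closes = 0;

    public void replicate(List<String> entries) { openFile.addAll(entries); }

    public void beforePersistingReplicationOffset() {
      if (openFile.isEmpty()) return;
      durable.addAll(openFile); // stands in for closing the output file
      openFile.clear();
      closes++;
    }
  }

  /** Records the offset only when a size or time limit is reached. */
  static class Shipper {
    private final Endpoint endpoint;
    private final int maxPendingEntries;
    private final long maxPendingMillis;
    private final LongSupplier clock;
    private int pendingEntries = 0;
    private long lastPersistMillis;
    private long shippedOffset = -1;
    private long persistedOffset = -1;

    Shipper(Endpoint endpoint, int maxPendingEntries, long maxPendingMillis,
        LongSupplier clock) {
      this.endpoint = endpoint;
      this.maxPendingEntries = maxPendingEntries;
      this.maxPendingMillis = maxPendingMillis;
      this.clock = clock;
      this.lastPersistMillis = clock.getAsLong();
    }

    void ship(List<String> entries, long endOffset) {
      endpoint.replicate(entries);
      pendingEntries += entries.size();
      shippedOffset = endOffset;
      long now = clock.getAsLong();
      if (pendingEntries >= maxPendingEntries
          || now - lastPersistMillis >= maxPendingMillis) {
        // The shipper does not care how the endpoint makes the data durable.
        endpoint.beforePersistingReplicationOffset();
        persistedOffset = shippedOffset;
        pendingEntries = 0;
        lastPersistMillis = now;
      }
    }

    long getPersistedOffset() { return persistedOffset; }
  }

  public static void main(String[] args) {
    long[] now = {0};
    BufferedFileEndpoint endpoint = new BufferedFileEndpoint();
    Shipper shipper = new Shipper(endpoint, 3, 1000, () -> now[0]);

    shipper.ship(Arrays.asList("a", "b"), 10); // below both limits
    System.out.println(shipper.getPersistedOffset() + " " + endpoint.durable.size());

    shipper.ship(Arrays.asList("c"), 20); // size limit reached
    System.out.println(shipper.getPersistedOffset() + " " + endpoint.durable.size());
  }
}
```

With the buffered endpoint, the first shipment leaves the recorded offset
untouched and the second one triggers the hook before the offset moves forward.
Swapping in `SyncEndpoint` needs no change to `Shipper`, which is the point of
the hook.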