Re: [PR] POC to avoid usage of ReplicationResult [hbase]

via GitHub Thu, 11 Dec 2025 05:24:37 -0800


Apache9 commented on code in PR #7528:
URL: https://github.com/apache/hbase/pull/7528#discussion_r2610485310



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/HBaseInterClusterReplicationEndpoint.java:
##########
@@ -459,7 +458,8 @@ public ReplicationResult replicate(ReplicateContext 
replicateContext) {
       try {
         // replicate the batches to sink side.
         parallelReplicate(replicateContext, batches);
-        return ReplicationResult.COMMITTED;
+        
getReplicationSource().cleanupHFileRefsAndPersistOffsets(replicateContext.getEntries());

Review Comment:
   OK I think I get why you call this method here, since here we can make sure 
that the wal entries have been persistent, it is OK for us to persist the 
offset. But for me, I prefer we follow the old way where call this in 
ReplicationSourceShipper.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/VerifyWALEntriesReplicationEndpoint.java:
##########
@@ -59,10 +60,11 @@ private void checkCell(Cell cell) {
   }
 
   @Override
-  public ReplicationResult replicate(ReplicateContext replicateContext) {
+  public boolean replicate(ReplicateContext replicateContext) throws 
IOException {
     replicateContext.entries.stream().map(WAL.Entry::getEdit).flatMap(e -> 
e.getCells().stream())
       .forEach(this::checkCell);
-    return ReplicationResult.COMMITTED;
+    
getReplicationSource().cleanupHFileRefsAndPersistOffsets(replicateContext.getEntries());

Review Comment:
   Why we need to call this here?



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##########
@@ -246,42 +230,11 @@ void shipEdits(WALEntryBatch entryBatch) {
     }
   }
 
-  private ReplicationResult getReplicationResult() {

Review Comment:
   As said above, we should keep these methods here, and before calling these 
methods, we call the method in ReplicationEndpoint out to flush data out.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##########
@@ -216,7 +216,7 @@ public int getTimeout() {
    * the context are assumed to be persisted in the target cluster.

Review Comment:
   What I mean is that, we should add a method may be called 
`beforePersistingReplicationOffset`, and call it before we call 
updateLogPosition in ReplicationSourceShipper method. For old implementation, 
we just do nothing as we can make sure that every thing is persistent, and for 
S3 based endpoint, we close the writer to persist the data on S3.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] POC to avoid usage of ReplicationResult [hbase]

Reply via email to