rdblue commented on code in PR #14510:
URL: https://github.com/apache/iceberg/pull/14510#discussion_r2496280751


##########
kafka-connect/kafka-connect/src/main/java/org/apache/iceberg/connect/channel/Coordinator.java:
##########
@@ -284,6 +290,25 @@ private void commitToTable(
     }
   }
 
+  private SnapshotUpdateValidator offsetValidator(
+      TableIdentifier tableIdentifier, Map<Integer, Long> expectedOffsets) {
+
+    return (baseSnapshots, updatedSnapshots) -> {

Review Comment:
   If `updatedSnapshots` is never used, why include it?
   
   While reviewing the PR for the core changes, I wanted to see how this was 
used in this PR. The reason I wanted to see was that I wanted to check what is 
coming from the updated snapshots here, rather than creating the validation as 
a closure that has the information about what is being committed. It turns out 
that this uses the closure.
   
   I think that I would approach this by sending just the new information (that 
is, the new snapshot history) into the validation. That accomplishes a few 
things:
   1. It is simpler and callers don't need to think about whether "updated" is 
the local changes or the committed changes. While writing this comment, I 
nearly referred to the base snapshots iterable as "updated" so it can be 
confusing.
   2. The validation can be called from within the existing `validate` method 
on `SnapshotProducer`:
   
   ```java
     /**
      * Validate the current metadata.
      *
      * <p>Child operations can override this to add custom validation.
      *
      * @param currentMetadata current table metadata to validate
      * @param snapshot ending snapshot on the lineage which is being validated
      */
     protected void validate(TableMetadata currentMetadata, Snapshot snapshot) 
{}
   ```
   
   All the existing validations currently go through that method, so it would 
be best if this did as well. That passes the current table metadata and the 
latest snapshot on the target branch.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to