blackmwk commented on PR #2590:
URL: https://github.com/apache/iceberg-rust/pull/2590#issuecomment-4659480454

   > > I think the key point is the design of MergingSnapshotProducer, which 
contains a lot of indices to speed up the conflict checks.
   > 
   > I think you meant 
[`DeleteFileIndex`](https://github.com/apache/iceberg/blob/1cea23eda51c9b9ddcfb88dd499b1fd14f3bf3b3/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java#L624-L625)
 in the Java implementation and the `conflictDetectionFilter`. In Rust's 
[`DeleteFileIndex`](https://github.com/apache/iceberg-rust/blob/main/crates/iceberg/src/delete_file_index.rs#L56),
 we don't allow a filter to be pushed down at this point, so I didn't include 
that change.
   > 
   > In the future, we could store the filter in snapshot operations like 
`RowDeltaAction` and pass the filter to 
`SnapshotValidator::validate_no_new_deletes` easily.
   > 
   > The current implementation won't block that change; we will only need to 
change the API in `SnapshotValidator` after adding `conflict_detecting_filter` 
support to `DeleteFileIndex`.
   
   No, I mean `ManifestMergeManager` and `ManifestFilterManager`, which are the 
critical data structures that enable efficient concurrency. Also, I don't 
understand what a crate-private `SnapshotValidator` would be used for.
   If you look at the Java API, each snapshot transaction action interface 
exposes methods that allow standalone checks. For example, 
[RowDelta](https://github.com/apache/iceberg/blob/2e94af5d1f7d71848518348c81d858b2209c3751/api/src/main/java/org/apache/iceberg/RowDelta.java#L32)
 has methods like `validateDeletedFiles`, `validateNoConflictingDataFiles`, and 
`validateNoConflictingDeleteFiles`, but these only mark what to check 
at commit time. This design is quite flexible, since it allows different 
checks at different isolation levels.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

