aokolnychyi commented on code in PR #9447:
URL: https://github.com/apache/iceberg/pull/9447#discussion_r1457935454


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/RewriteManifestsProcedure.java:
##########
@@ -118,4 +126,15 @@ private InternalRow[] toOutputRows(RewriteManifests.Result 
result) {
   public String description() {
     return "RewriteManifestsProcedure";
   }
+
+  private RewriteManifestsSparkAction checkAndApplyFilter(
+      RewriteManifestsSparkAction action, String where, Identifier ident) {
+    if (where != null) {
+      Expression expression = filterExpression(ident, where);

Review Comment:
   The reason why we did not add it in the first place is because we persist 
only lower/upper boundaries for manifest partition values. Therefore, filters 
may not necessarily benefit from that. Based on what I've seen, these 
boundaries are rarely helpful when the manifest compaction is needed. Also, we 
have done some improvements to how the manifests are rewritten (not released 
yet). From what I see, it is a matter of a few minutes to rewrite all manifests 
even for large tables because the operation is distributed.
   
   @bknbkn, did you have a chance to try the improvements for the action in 
master? How long does the rewrite take in your tables if you target all 
manifests?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to