kevinjqliu commented on code in PR #1141:
URL: https://github.com/apache/iceberg-python/pull/1141#discussion_r1759614689


##########
pyiceberg/io/pyarrow.py:
##########
@@ -1238,10 +1238,12 @@ def _task_to_record_batches(
         for batch in batches:
             next_index = next_index + len(batch)
             current_index = next_index - len(batch)
+            output_batches = iter([batch])
             if positional_deletes:
                 # Create the mask of indices that we're interested in
                 indices = _combine_positional_deletes(positional_deletes, current_index, current_index + len(batch))
                 batch = batch.take(indices)
+
                 # Apply the user filter
                 if pyarrow_filter is not None:

Review Comment:
   Thanks!
   So if there are positional_deletes, the filter is applied manually to the
   pyarrow table after the deleted rows have been dropped.
   If there are no positional_deletes, the filter is pushed down to the scanner.
   
   Arrow 17 should make this a lot cleaner.
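
   To make the two paths concrete, here is a rough pure-Python model of the
   branching described above. It is only a sketch, not the code in this PR:
   `scan`, `read_task`, `rows`, and `keep_even` are hypothetical names, plain
   lists stand in for Arrow record batches, and `scan` stands in for the
   scanner. The assumption it illustrates is that positional deletes refer to
   row positions in the file, so the filter can only be pushed down when there
   are no deletes to line up against those positions.

```python
from typing import Callable, Dict, Iterator, List, Optional, Set

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


def scan(rows: List[Row], batch_size: int, pushed_filter: Optional[Predicate] = None) -> Iterator[List[Row]]:
    """Hypothetical stand-in for a scanner: filter first (if asked), then batch."""
    selected = [row for row in rows if pushed_filter is None or pushed_filter(row)]
    for start in range(0, len(selected), batch_size):
        yield selected[start:start + batch_size]


def read_task(
    rows: List[Row],
    batch_size: int,
    deleted_positions: Set[int],
    row_filter: Optional[Predicate],
) -> List[Row]:
    """Model of the two paths: manual filtering with deletes, pushdown without."""
    # Push the filter down only when there are no positional deletes, so that
    # batch offsets still match row positions in the file.
    pushed_filter = None if deleted_positions else row_filter

    result: List[Row] = []
    next_index = 0
    for batch in scan(rows, batch_size, pushed_filter):
        current_index = next_index
        next_index += len(batch)
        if deleted_positions:
            # Drop rows whose file position is marked as deleted.
            batch = [
                row
                for offset, row in enumerate(batch)
                if current_index + offset not in deleted_positions
            ]
            # Apply the user filter manually, after the deletes.
            if row_filter is not None:
                batch = [row for row in batch if row_filter(row)]
        result.extend(batch)
    return result


rows = [{"id": i} for i in range(6)]


def keep_even(row: Row) -> bool:
    return row["id"] % 2 == 0


with_deletes = read_task(rows, 4, {2}, keep_even)
without_deletes = read_task(rows, 4, set(), keep_even)
```

   With position 2 deleted, the filter runs after the delete and `with_deletes`
   holds ids 0 and 4. With no deletes, the filter is handed to `scan` and
   `without_deletes` holds ids 0, 2, and 4.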
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


