kevinjqliu commented on code in PR #1141: URL: https://github.com/apache/iceberg-python/pull/1141#discussion_r1759614689
##########
pyiceberg/io/pyarrow.py:
##########
@@ -1238,10 +1238,12 @@ def _task_to_record_batches(
         for batch in batches:
             next_index = next_index + len(batch)
             current_index = next_index - len(batch)
+            output_batches = iter([batch])
             if positional_deletes:
                 # Create the mask of indices that we're interested in
                 indices = _combine_positional_deletes(positional_deletes, current_index, current_index + len(batch))
                 batch = batch.take(indices)
+                # Apply the user filter
                 if pyarrow_filter is not None:

Review Comment:
   Thanks! So if there are positional_deletes, we apply the filter manually to the PyArrow table; if there are no positional_deletes, we push the filter down to the scanner. Arrow 17 should make this a lot cleaner.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@iceberg.apache.org