ndrluis commented on code in PR #2881:
URL: https://github.com/apache/iceberg-python/pull/2881#discussion_r3307479381


##########
pyiceberg/io/pyarrow.py:
##########
@@ -1641,7 +1645,12 @@ def _task_to_record_batches(
                 bound_row_filter, file_schema, case_sensitive=case_sensitive, projected_field_values=projected_missing_fields
             )
             bound_file_filter = bind(file_schema, translated_row_filter, case_sensitive=case_sensitive)
-            pyarrow_filter = expression_to_pyarrow(bound_file_filter, file_schema)
+            try:
+                pyarrow_filter = expression_to_pyarrow(bound_file_filter, file_schema)
+            except pyarrow.lib.ArrowNotImplementedError as e:
+                if "arrow.uuid" in str(e):
+                    raise NotImplementedError(UUID_FILTER_NOT_SUPPORTED_ERROR_MESSAGE) from e
+                raise

Review Comment:
   I think keeping this at the PyArrow translation boundary makes more sense 
here. `Table.scan()` is lazy and only returns a `DataScan`, so the Arrow 
expression is not built there and this exception would not be raised from 
`scan()` itself.
   
   Moving it to `to_arrow()` / `to_arrow_batch_reader()` would also duplicate 
PyArrow-specific handling in the public scan layer. Since this error is 
caused specifically by translating the bound filter into a PyArrow expression, 
I’d keep the exception conversion here while still surfacing a user-facing 
`NotImplementedError`.
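
To illustrate the argument, here is a minimal, self-contained sketch of the pattern. It does not use PyIceberg or PyArrow: `BackendNotImplementedError`, `translate_filter`, `translate_at_boundary`, `LazyScan`, `scan`, and both message strings are hypothetical stand-ins, chosen only to show that a lazy scan cannot raise the error and that the conversion can sit next to the translation call.

```python
class BackendNotImplementedError(Exception):
    """Hypothetical stand-in for pyarrow.lib.ArrowNotImplementedError."""


# Placeholder text; the real constant's wording is not shown in this thread.
UUID_FILTER_NOT_SUPPORTED = "Filtering on UUID columns is not supported."


def translate_filter(column_type: str) -> str:
    """Hypothetical stand-in for expression_to_pyarrow."""
    if column_type == "uuid":
        # Made-up backend message that merely contains the marker string.
        raise BackendNotImplementedError("no kernel for arrow.uuid")
    return f"expr<{column_type}>"


def translate_at_boundary(column_type: str) -> str:
    """Translate the filter and convert the backend error where it occurs."""
    try:
        return translate_filter(column_type)
    except BackendNotImplementedError as e:
        if "arrow.uuid" in str(e):
            # User-facing error; the backend error stays attached as the cause.
            raise NotImplementedError(UUID_FILTER_NOT_SUPPORTED) from e
        raise


class LazyScan:
    """Hypothetical stand-in for a DataScan: stores the filter, builds nothing."""

    def __init__(self, column_type: str) -> None:
        self.column_type = column_type

    def to_arrow(self) -> str:
        # The scan layer needs no backend-specific handling of its own.
        return translate_at_boundary(self.column_type)


def scan(column_type: str) -> LazyScan:
    # Lazy: no translation happens here, so nothing can be raised here.
    return LazyScan(column_type)
```

In this sketch, `scan("uuid")` returns normally and only `to_arrow()` raises `NotImplementedError`, while the scan class itself contains no backend-specific `except` clause.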
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

