Re: [PR] feat: Add read support for Parquet bloom filters [iceberg-python]

via GitHub Fri, 07 Nov 2025 10:18:59 -0800


ForeverAngry commented on code in PR #2653:
URL: https://github.com/apache/iceberg-python/pull/2653#discussion_r2504888395



##########
pyiceberg/manifest.py:
##########
@@ -290,6 +290,13 @@ def __repr__(self) -> str:
             required=False,
             doc="ID representing sort order for this file",
         ),
+        NestedField(
+            field_id=146,
+            name="bloom_filter_bytes",
+            field_type=MapType(key_id=147, key_type=IntegerType(), value_id=148, value_type=BinaryType()),
+            required=False,
+            doc="Map of column id to bloom filter",
+        ),

Review Comment:
   @Fokko please take another look. I changed the scope of the PR so that it:
   
   - Doesn't modify the Iceberg specification
   - Doesn't change any existing behavior
   
   Instead, this PR only provides the initial utilities needed to read bloom
   filters from Parquet files at the file level.
   
   If merged, the next step would be integrating them into the read path.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

