kevinjqliu commented on code in PR #2167: URL: https://github.com/apache/iceberg-python/pull/2167#discussion_r2206128117
##########
tests/integration/test_writes/test_partitioned_writes.py:
##########
@@ -547,14 +552,14 @@ def test_summaries_with_null(spark: SparkSession, session_catalog: Catalog, arro
         "total-records": "6",
     }
     assert summaries[5] == {
-        "removed-files-size": "16174",
+        "removed-files-size": "15774" if under_20_arrow else "16174",

Review Comment:
   Let's just do this instead, since we're not really testing for the file size:
   ```suggestion
        "removed-files-size": summaries[5]["removed-files-size"],
   ```



##########
tests/integration/test_writes/test_partitioned_writes.py:
##########
@@ -451,6 +451,11 @@ def test_dynamic_partition_overwrite_unpartitioned_evolve_to_identity_transform(
 @pytest.mark.integration
 def test_summaries_with_null(spark: SparkSession, session_catalog: Catalog, arrow_table_with_null: pa.Table) -> None:
+    import pyarrow
+    from packaging import version
+
+    under_20_arrow = version.parse(pyarrow.__version__) < version.parse("20.0.0")
+

Review Comment:
   > Any ideas? Maybe use a range of "safe" values instead of a single file size value? I'd be happy to open another PR if there is more work for this.
   
   I think we can just parameterize the file size. We're not really testing anything related to the size of the file.
   
   > It'd be great if PyIceberg wouldn't set an upper version for Arrow if possible.
   
   Yeah, agreed. Let's see if we can remove the upper bound.
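For illustration, here is a minimal sketch of the "don't pin the exact byte count" idea from the comments above (not code from the PR). It assumes `summaries[5]` is the snapshot summary dict the test already builds, and `EXPECTED_NON_SIZE_KEYS` is a hypothetical placeholder for the exact key/value pairs the test still wants to assert; only the size-related keys are relaxed, so the assertion no longer depends on the PyArrow version:

```python
# Sketch only: compare every key except the size-dependent ones exactly, and
# require the size keys to merely look like positive integer byte counts.
SIZE_KEYS = {"removed-files-size", "added-files-size", "total-files-size"}

summary = summaries[5]  # snapshot summary dict from the existing test
# EXPECTED_NON_SIZE_KEYS is a hypothetical dict of the non-size expectations.
assert {k: v for k, v in summary.items() if k not in SIZE_KEYS} == EXPECTED_NON_SIZE_KEYS
assert all(summary[k].isdigit() and int(summary[k]) > 0 for k in SIZE_KEYS & summary.keys())
```

Compared with the self-referential `summaries[5]["removed-files-size"]` suggestion, a check along these lines still catches a missing or malformed size entry while staying independent of the Arrow version.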