Fokko commented on code in PR #1879:
URL: https://github.com/apache/iceberg-python/pull/1879#discussion_r2044268567


##########
tests/integration/test_deletes.py:
##########
@@ -467,21 +467,19 @@ def test_partitioned_table_positional_deletes_sequence_number(spark: SparkSessio
     assert snapshots[2].summary == Summary(
         Operation.OVERWRITE,

Review Comment:
   When I change it to CoW, I get the following summary for snapshot 1 (the
delete performed by Spark):
   ```json
   {
       "spark.app.id": "local-1744714815877",
       "added-data-files": "1",
       "deleted-data-files": "1",
       "added-records": "1",
       "deleted-records": "2",
       "added-files-size": "714",
       "removed-files-size": "743",
       "changed-partition-count": "1",
       "total-records": "4",
       "total-files-size": "1461",
       "total-data-files": "2",
       "total-delete-files": "0",
       "total-position-deletes": "0",
       "total-equality-deletes": "0",
       "engine-version": "3.5.1",
       "app-id": "local-1744714815877",
       "engine-name": "spark",
       "iceberg-version": "Apache Iceberg 1.8.0 (commit c277c2014a1b37fe755cfe37f173b6465bb8cb73)"
   }
   ```
   
   Which seems correct:
   ```
   (10, 100), 
   (10, 101), <- Deleted by Spark
   
   (20, 200), 
   (20, 201),
   (20, 202)
   ```
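
As a rough cross-check of the summary counts, here is a minimal sketch that simulates the copy-on-write delete on the rows above. It assumes one data file per partition before the delete (an assumption, not something the summary states), and `cow_delete` is a hypothetical helper, not a PyIceberg or Spark API. Iceberg stores summary values as strings; plain integers are used here for brevity.

```python
# Rows before the delete, grouped by partition value.
# Assumption: each partition is stored in exactly one data file.
files_before = {
    10: [(10, 100), (10, 101)],
    20: [(20, 200), (20, 201), (20, 202)],
}


def cow_delete(files, row):
    """Simulate a copy-on-write delete: rewrite every file that contains `row`."""
    summary = {
        "added-data-files": 0,
        "deleted-data-files": 0,
        "added-records": 0,
        "deleted-records": 0,
    }
    files_after = {}
    for key, rows in files.items():
        if row not in rows:
            # Untouched files are carried over as-is.
            files_after[key] = rows
            continue
        kept = [r for r in rows if r != row]
        # The whole original file is dropped, so all of its records count as deleted.
        summary["deleted-data-files"] += 1
        summary["deleted-records"] += len(rows)
        if kept:
            # The surviving rows are written to a new file.
            files_after[key] = kept
            summary["added-data-files"] += 1
            summary["added-records"] += len(kept)
    summary["total-records"] = sum(len(r) for r in files_after.values())
    summary["total-data-files"] = len(files_after)
    return files_after, summary


files_after, summary = cow_delete(files_before, (10, 101))
print(summary)
```

Under that assumption the simulated counts line up with the Spark summary: one file removed (2 records), one file added (1 record), 4 records in 2 data files in total.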
   
   PyIceberg takes a different approach: it treats this as an `Overwrite`. It
first creates a snapshot that rewrites the original data file, then appends a
new file with the updated record.
   
   To reproduce this, I just removed the `TBLPROPERTIES` that set MoR.
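
For reference, a sketch of what toggling those properties could look like. The `write.delete.mode`, `write.update.mode`, and `write.merge.mode` keys are standard Iceberg table properties; the helper `create_table_sql`, the column names, and the table identifier are hypothetical and not taken from the test itself.

```python
# Iceberg table properties that switch row-level operations to merge-on-read.
MOR_PROPERTIES = {
    "format-version": "2",
    "write.delete.mode": "merge-on-read",
    "write.update.mode": "merge-on-read",
    "write.merge.mode": "merge-on-read",
}


def create_table_sql(identifier: str, merge_on_read: bool) -> str:
    """Build a Spark SQL CREATE TABLE statement (hypothetical schema)."""
    sql = (
        f"CREATE TABLE {identifier} (number_partitioned int, number int) "
        "USING iceberg PARTITIONED BY (number_partitioned)"
    )
    if merge_on_read:
        props = ", ".join(f"'{k}'='{v}'" for k, v in MOR_PROPERTIES.items())
        sql += f" TBLPROPERTIES ({props})"
    # Without the properties, the delete mode falls back to Iceberg's default,
    # which is copy-on-write.
    return sql


print(create_table_sql("default.example_table", merge_on_read=False))
```

Passing `merge_on_read=False` yields the statement without `TBLPROPERTIES`, which corresponds to the CoW reproduction described above.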



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

