and124578963 opened a new issue, #12467:
URL: https://github.com/apache/iceberg/issues/12467

   ### Apache Iceberg version
   
   1.8.1 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   ## Description  
   When a sequence of deletes (position deletes, then equality deletes, then a 
final row-level `DELETE`) runs in Copy-on-Write (COW) mode, the equality 
deletes are not applied to the original data files, so rows that should have 
been removed remain in the table. 
   
   **Observed Context**:  
   - The issue **does not occur** when combining `UPDATE` and `MERGE` 
operations in COW mode – these work as expected.  
   - The problem is **specific to COW**; Merge-on-Read (MOR) mode handles the 
same scenario correctly.  
   
   ## Steps to Reproduce
   
   ### 1) Data Setup:
   
   - **Create two Parquet data files:**
   
     - `data-file-1.parquet`: IDs `[1, 2, 3, 4, 5]`
   
     - `data-file-2.parquet`: IDs `[6, 7, 8, 9, 10]`
   
   - Configure the table with COW semantics 
(`write.delete.mode = copy-on-write`).
   
   ### 2) Apply Initial Deletes:
   
   - Add a **position delete file** to remove:
   
      - Row 0 (ID `1`) from `data-file-1.parquet`
   
      - Row 0 (ID `6`) from `data-file-2.parquet`
   
   - Add an **equality delete file** targeting IDs `[3, 4, 5, 6, 7, 8, 9, 10]`.
   
   ### 3) Execute Final Delete Command:
   
   
   `DELETE FROM table WHERE id = 2;  -- Targets remaining ID '2'`
   
   ### Expected Result
    After all deletions:
   
    - `SELECT * FROM table` should return no rows, as:
   
      - Position deletes remove IDs `1` and `6`.
   
      - Equality deletes remove IDs `3, 4, 5` (from `data-file-1`) and 
`7, 8, 9, 10` (from `data-file-2`).
   
      - Final `DELETE WHERE id = 2` removes the last remaining ID (`2`).
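
The expected result can be checked with a small toy model of the reproduction steps. This is only a sketch in plain Python: it does not touch Iceberg or Spark, and the names `live_rows`, `after_step_2`, and `after_step_3` are made up for illustration. It assumes that both delete files apply to both data files, as described in the steps above.

```python
# Toy model of the reproduction steps; no Iceberg or Spark involved.
data_files = {
    "data-file-1.parquet": [1, 2, 3, 4, 5],
    "data-file-2.parquet": [6, 7, 8, 9, 10],
}

# Step 2: position deletes are (file, row position) pairs,
# equality deletes are a set of ID values.
position_deletes = {("data-file-1.parquet", 0), ("data-file-2.parquet", 0)}
equality_deletes = {3, 4, 5, 6, 7, 8, 9, 10}


def live_rows(files, pos_deletes, eq_deletes):
    """Return the IDs still visible after applying both kinds of delete."""
    rows = []
    for name, ids in files.items():
        for pos, row_id in enumerate(ids):
            if (name, pos) in pos_deletes:
                continue  # removed by a position delete
            if row_id in eq_deletes:
                continue  # removed by an equality delete
            rows.append(row_id)
    return rows


after_step_2 = live_rows(data_files, position_deletes, equality_deletes)

# Step 3: DELETE FROM table WHERE id = 2
after_step_3 = [row_id for row_id in after_step_2 if row_id != 2]

print(after_step_2)  # [2]
print(after_step_3)  # []
```

Under these assumptions only ID `2` survives step 2, and the final `DELETE` leaves the table empty.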
   
   ### Actual Result
   
   `SELECT * FROM table` returns IDs `3, 4, 5`.
   
   - **Observed Issues:**
   
     - The equality deletes targeting `3, 4, 5` (in `data-file-1`) are not 
applied.
   
     - The `DELETE WHERE id = 2` operation removes only ID `2`, leaving 
`3, 4, 5` intact.
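
For comparison, the following toy model reproduces the reported output. It is a sketch in plain Python, not Iceberg code, and it rests on an assumption that is not confirmed anywhere in this report: that the COW rewrite of `data-file-1` honours the position delete and the `id = 2` predicate but skips the equality delete, while `data-file-2` is not rewritten and still has both delete files applied at read time. All variable names are hypothetical.

```python
# Toy model of the reported behaviour; the mechanism is an assumption.
data_file_1 = [1, 2, 3, 4, 5]
data_file_2 = [6, 7, 8, 9, 10]
equality_deletes = {3, 4, 5, 6, 7, 8, 9, 10}

# Assumed COW rewrite of data-file-1: drops row position 0 (ID 1) and the
# row matching id = 2, but ignores the equality delete file.
rewritten_file_1 = [
    row_id
    for pos, row_id in enumerate(data_file_1)
    if pos != 0 and row_id != 2
]

# data-file-2 holds no row with id = 2, so it is assumed to stay as-is and
# to have the position and equality deletes applied when it is read.
visible_file_2 = [
    row_id
    for pos, row_id in enumerate(data_file_2)
    if pos != 0 and row_id not in equality_deletes
]

observed = rewritten_file_1 + visible_file_2
print(observed)  # [3, 4, 5]
```

The model matches the `SELECT` output above, which is consistent with the equality deletes being skipped only for the file that the COW operation rewrites. It does not establish the root cause.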
   
   ## Environment
   Apache Iceberg Versions: `1.6.1`, `1.8.1`
   
   ## Tests
   A branch with reproducing tests is available at:
   https://github.com/apache/iceberg/compare/1.8.x...and124578963:iceberg:1.8.x
   To run them:
   `./gradlew :iceberg-spark:iceberg-spark-3.5_2.12:test --tests "org.apache.iceberg.spark.TestSparkExecutionWithEqualityAndPositionDeletes"`
   
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [x] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


For additional commands, e-mail: issues-h...@iceberg.apache.org
