dsv27 opened a new issue, #12772:
URL: https://github.com/apache/iceberg/issues/12772

   ### Apache Iceberg version
   
   1.8.1 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   
   ## When using rewrite_table_path on MoR tables, the delete files necessary for table migration are not created.
   
   I tried to migrate a table that uses the MoR (Merge-on-Read) update mode, but the migration did not succeed. Below is a synthetic example that reproduces the problem.
   
   Contents of the file list generated in the staging directory (`source,target` pairs):
   
   ```csv
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/v1.metadata.json,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/v1.metadata.json
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/b76bc37f-0bd6-41b9-b8ca-5603d8d42cea-m0.avro,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/b76bc37f-0bd6-41b9-b8ca-5603d8d42cea-m0.avro
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/00000-9-e834811e-7ccb-4b23-be5f-c95b1158400d-00002-deletes.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=1/00000-9-e834811e-7ccb-4b23-be5f-c95b1158400d-00002-deletes.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/v3.metadata.json,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/v3.metadata.json
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/00000-4-ef5dc966-2145-4f41-aa56-d2f120675381-00003-deletes.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=2/00000-4-ef5dc966-2145-4f41-aa56-d2f120675381-00003-deletes.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/00000-9-e834811e-7ccb-4b23-be5f-c95b1158400d-00004-deletes.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=3/00000-9-e834811e-7ccb-4b23-be5f-c95b1158400d-00004-deletes.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/b0c509bf-ee18-48a9-a3d2-d04ee9628392-m0.avro,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/b0c509bf-ee18-48a9-a3d2-d04ee9628392-m0.avro
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/00000-4-ef5dc966-2145-4f41-aa56-d2f120675381-00001-deletes.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=0/00000-4-ef5dc966-2145-4f41-aa56-d2f120675381-00001-deletes.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/data/id_bucket=3/00001-17-3e0a0841-211a-4cbc-a66d-c696656d1156-0-00001.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=3/00001-17-3e0a0841-211a-4cbc-a66d-c696656d1156-0-00001.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/data/id_bucket=2/00002-18-3e0a0841-211a-4cbc-a66d-c696656d1156-0-00001.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=2/00002-18-3e0a0841-211a-4cbc-a66d-c696656d1156-0-00001.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/snap-8339358454957226325-1-db0cfd35-7a77-4026-822d-6aef004a2138.avro,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/snap-8339358454957226325-1-db0cfd35-7a77-4026-822d-6aef004a2138.avro
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/snap-4462110935224015115-1-b76bc37f-0bd6-41b9-b8ca-5603d8d42cea.avro,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/snap-4462110935224015115-1-b76bc37f-0bd6-41b9-b8ca-5603d8d42cea.avro
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/snap-7291782597409380817-1-b0c509bf-ee18-48a9-a3d2-d04ee9628392.avro,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/snap-7291782597409380817-1-b0c509bf-ee18-48a9-a3d2-d04ee9628392.avro
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/00000-9-e834811e-7ccb-4b23-be5f-c95b1158400d-00001-deletes.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=0/00000-9-e834811e-7ccb-4b23-be5f-c95b1158400d-00001-deletes.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/v4.metadata.json,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/v4.metadata.json
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/00000-4-ef5dc966-2145-4f41-aa56-d2f120675381-00004-deletes.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=3/00000-4-ef5dc966-2145-4f41-aa56-d2f120675381-00004-deletes.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/db0cfd35-7a77-4026-822d-6aef004a2138-m0.avro,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/db0cfd35-7a77-4026-822d-6aef004a2138-m0.avro
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/00000-4-ef5dc966-2145-4f41-aa56-d2f120675381-00002-deletes.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=1/00000-4-ef5dc966-2145-4f41-aa56-d2f120675381-00002-deletes.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/00000-9-e834811e-7ccb-4b23-be5f-c95b1158400d-00003-deletes.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=2/00000-9-e834811e-7ccb-4b23-be5f-c95b1158400d-00003-deletes.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/v2.metadata.json,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/v2.metadata.json
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/data/id_bucket=1/00000-16-3e0a0841-211a-4cbc-a66d-c696656d1156-0-00001.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=1/00000-16-3e0a0841-211a-4cbc-a66d-c696656d1156-0-00001.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/data/id_bucket=0/00003-19-3e0a0841-211a-4cbc-a66d-c696656d1156-0-00001.parquet,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/data/id_bucket=0/00003-19-3e0a0841-211a-4cbc-a66d-c696656d1156-0-00001.parquet
   
/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table/metadata/copy-table-staging-454c9095-64cd-430d-b2bb-d03ad2e8fbb3/b0c509bf-ee18-48a9-a3d2-d04ee9628392-m1.avro,/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table/metadata/b0c509bf-ee18-48a9-a3d2-d04ee9628392-m1.avro
   ```
   
   Files actually created in the staging directory:
   ```bash
    .
   ├──  00000-4-ef5dc966-2145-4f41-aa56-d2f120675381-00004-deletes.parquet
   ├──  00000-9-e834811e-7ccb-4b23-be5f-c95b1158400d-00004-deletes.parquet
   ├──  b0c509bf-ee18-48a9-a3d2-d04ee9628392-m0.avro
   ├──  b0c509bf-ee18-48a9-a3d2-d04ee9628392-m1.avro
   ├──  b76bc37f-0bd6-41b9-b8ca-5603d8d42cea-m0.avro
   ├──  db0cfd35-7a77-4026-822d-6aef004a2138-m0.avro
   ├──  file-list
   ├──  snap-4462110935224015115-1-b76bc37f-0bd6-41b9-b8ca-5603d8d42cea.avro
   ├──  snap-7291782597409380817-1-b0c509bf-ee18-48a9-a3d2-d04ee9628392.avro
   ├──  snap-8339358454957226325-1-db0cfd35-7a77-4026-822d-6aef004a2138.avro
   ├──  v1.metadata.json
   ├──  v2.metadata.json
   ├──  v3.metadata.json
   └──  v4.metadata.json
   ```
   
   As you can see, only 2 delete files were created, but the file list references 8 delete files in the staging directory. Perhaps I'm doing something wrong.
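   
   The comparison above can be automated. The following is a minimal sketch, not part of Iceberg: `StagedFileCheck`, `Entry`, `parse`, and `missingStaged` are hypothetical names, and it assumes the `source,target` rows have already been read into a sequence of strings and that paths contain no commas.
   
   ```scala
   // Sketch: report file-list entries whose staged source file does not exist.
   // `lines` holds "source,target" rows in the format shown in the file list above.
   // `exists` is a caller-supplied predicate (hypothetical), e.g. backed by
   // java.nio.file.Files.exists for a local warehouse.
   object StagedFileCheck {
     final case class Entry(source: String, target: String)
   
     // Parse non-empty rows; rows that do not have exactly two fields are skipped.
     def parse(lines: Seq[String]): Seq[Entry] =
       lines.map(_.trim).filter(_.nonEmpty).flatMap { line =>
         line.split(",", -1) match {
           case Array(src, dst) => Some(Entry(src, dst))
           case _               => None
         }
       }
   
     // Only sources under the staging directory are checked; the other
     // sources in the list point at the original table location.
     def missingStaged(
         entries: Seq[Entry],
         stagingDir: String,
         exists: String => Boolean): Seq[Entry] =
       entries.filter(e => e.source.startsWith(stagingDir) && !exists(e.source))
   }
   ```
   
   For a local warehouse, `exists` could be `p => java.nio.file.Files.exists(java.nio.file.Paths.get(p))`. Filtering the result on `_.source.endsWith("-deletes.parquet")` would then list only the delete files that the file list references but that are absent from the staging directory.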
   
   How to reproduce the issue:
   
   ```sql
   create table iceberg.stats.table(
                                       id bigint not null,
                                       l0 int not null,
                                       l1 int not null,
                                       value string not null
   ) using iceberg
       PARTITIONED BY (bucket(4, id))
       TBLPROPERTIES ('write.merge.mode'='merge-on-read','write.delete.mode'='merge-on-read')
   ```
   
   ```scala
   import java.util.concurrent.ThreadLocalRandom
   import scala.util.Random
   import org.apache.commons.codec.digest.MurmurHash3
   
   def bucket(n: Int, x: Long): Int = (MurmurHash3.hash32(x) & Integer.MAX_VALUE) % n
   val rand = new Random()
   val random = ThreadLocalRandom.current()
   
   
   (0 until 1000000).map(id => (id, random.nextInt(10, 1000), random.nextInt(10, 12), rand.alphanumeric.take(100).mkString(""))).
           toDF("id", "l0", "l1", "value").createOrReplaceTempView("tbl")
   ```
   
   ```sql
   insert into iceberg.stats.table (id, l0, l1, value)
   select id, l0, l1, value from tbl
   ```
   
   ```sql
   delete from iceberg.stats.table where l0=300;
   delete from iceberg.stats.table where l0=305;
   ```
   ```sql
   CALL iceberg.system.rewrite_table_path(
       table => 'iceberg.stats.table',
       source_prefix => '/home/dmitrii/infra/opt/cfs/iceberg/warehouse/stats/table',
       target_prefix => '/home/dmitrii/infra/opt/cfs/iceberg/warehouse2/stats/table'
   )
   ```
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

