liuchunhua opened a new issue, #42240:
URL: https://github.com/apache/doris/issues/42240

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Version
   
   v2.1.6
   
   ### What's Wrong?
   
   Iceberg dangling deletes cause incorrect row counts.
   
   test case: doris/samples/datalake/iceberg_and_paimon
   
   ``` shell
   bash start_all.sh
   bash start_doris_client.sh
   ```
   spark:
   ``` sql
   > select version();
   3.5.1 fd86f85e181fc2dc0f50a096855acf83a6cc5d9c
   
   CREATE TABLE demo.db_iceberg.tb_iceberg (
     id BIGINT NOT NULL,
     val STRING)
   USING iceberg
   LOCATION 's3://warehouse/wh/db_iceberg/tb_iceberg'
   TBLPROPERTIES (
     'current-snapshot-id' = '2047510404873857005',
     'format' = 'iceberg/parquet',
     'format-version' = '2',
     'identifier-fields' = '[id]',
     'upsert-enabled' = 'true',
     'write.delete.mode' = 'merge-on-read',
     'write.parquet.compression-codec' = 'zstd',
     'write.update.mode' = 'merge-on-read',
     'write.upsert.enabled' = 'true');
   
   
   insert into demo.db_iceberg.tb_iceberg values(1, 'abd');
   update demo.db_iceberg.tb_iceberg set val = 'def' where id = 1;
   update demo.db_iceberg.tb_iceberg set val = 'hgk' where id = 1;
   call demo.system.rewrite_data_files(table => 'demo.db_iceberg.tb_iceberg', options => map('min-input-files', '1'));
   call demo.system.expire_snapshots(table => 'demo.db_iceberg.tb_iceberg', older_than => timestamp'2024-10-22 12:41:00');
   ```
   ``` shell
   ~/mc ls minio/warehouse/wh/db_iceberg/tb_iceberg/data/
   [2024-10-22 12:38:36 CST] 1.4KiB STANDARD 00000-4-c401aec0-dab0-4476-b99e-c67022be3505-00001-deletes.parquet
   [2024-10-22 12:42:41 CST]   637B STANDARD 00000-624-9bb2caa4-0c97-4588-8f6b-68b72f970905-0-00001.parquet
   [2024-10-22 12:40:03 CST]   646B STANDARD 00000-7-d78a7a7d-a615-429b-b437-31c66d6a00b0-0-00001.parquet
   ``` 
   doris:
   ```
   mysql> select count(id) from iceberg.db_iceberg.tb_iceberg;
   +-----------+
   | count(id) |
   +-----------+
   |         2 |
   +-----------+
   1 row in set (0.10 sec)
   
   mysql> select count(*) from iceberg.db_iceberg.tb_iceberg; --- wrong
   +----------+
   | count(*) |
   +----------+
   |        1 |
   +----------+
   1 row in set (0.07 sec)
   
   mysql> select * from iceberg.db_iceberg.tb_iceberg;
   +------+------+
   | id   | val  |
   +------+------+
   |    1 | hgk  |
   |    2 | abd  |
   +------+------+
   2 rows in set (0.06 sec)
   ```
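
One possible explanation, not confirmed anywhere in this report, is that `count(*)` is answered from Iceberg snapshot summary counters (`total-records`, `total-position-deletes`, `total-equality-deletes`) instead of from a scan. A dangling delete still contributes to `total-position-deletes` even though it no longer matches any live row, so subtracting it undercounts. The Python sketch below only simulates that arithmetic. The function names, file names, and counter values are hypothetical, chosen to mirror the 1 versus 2 discrepancy above; they are not read from the table's real metadata or taken from Doris code.

```python
from dataclasses import dataclass


@dataclass
class SnapshotSummary:
    total_records: int           # rows recorded across live data files
    total_position_deletes: int  # rows recorded across live position-delete files
    total_equality_deletes: int  # rows recorded across live equality-delete files


def metadata_count(summary):
    """Hypothetical metadata-only count: assumes every recorded delete hits a live row."""
    if summary.total_equality_deletes > 0:
        return None  # the counters alone cannot give an answer in this case
    return summary.total_records - summary.total_position_deletes


def scanned_count(data_files, position_deletes):
    """Count rows by applying each position delete only to the file it references.

    data_files: {file_path: row_count}
    position_deletes: list of (file_path, row_position)
    """
    live = 0
    for path, rows in data_files.items():
        deleted = {pos for p, pos in position_deletes if p == path and pos < rows}
        live += rows - len(deleted)
    return live


# Illustrative state: two live data files with one row each, plus one
# delete entry that points at a data file no longer in the table (dangling).
data_files = {"data-a.parquet": 1, "data-b.parquet": 1}
position_deletes = [("data-removed.parquet", 0)]
summary = SnapshotSummary(total_records=2, total_position_deletes=1, total_equality_deletes=0)

print(metadata_count(summary))                      # counter arithmetic: 2 - 1
print(scanned_count(data_files, position_deletes))  # dangling delete matches nothing
```

Under these assumed values the counter-based result is 1 while the scan-based result is 2, which is the same shape of mismatch as `count(*)` versus `count(id)` in the Doris output.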
   
   ### What You Expected?
   
   Doris should handle Iceberg dangling deletes correctly.
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

