langchao209 opened a new issue, #7804:
URL: https://github.com/apache/iceberg/issues/7804

   ### Apache Iceberg version
   
   1.2.0
   
   ### Query engine
   
   None
   
   ### Please describe the bug 🐞
   
   After writing the most recent 30 days of data to an Iceberg table, I found 
that there were duplicated files like this.
   
![image](https://github.com/apache/iceberg/assets/12413139/7ca0644e-c254-46f7-92f5-6308b0800d79)
   When I read from the Iceberg table there was no duplication; however, when I 
read the parquet files directly, there was duplicated data. 
   
   However, when I tested with only the most recent 7 days of data, the 
duplication was gone.
   
![image](https://github.com/apache/iceberg/assets/12413139/7aac34b0-af27-4a3e-ac3c-cab501811263)
   
   Questions:
   1. How is the parquet file name formatted? For example:  
`00499-5849-056a590c-e5eb-4d92-99a4-1d0fdaf45e6c-00001.parquet`
   2. What does the suffix `00001` mean?
   3. How can I avoid generating duplicated files?
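
To make the first two questions concrete, here is a minimal sketch that splits a file name like the one above into its hyphen-delimited parts, which may help when grouping suspected duplicates by their leading fields. The field names (`partition_id`, `task_id`, `operation_id`, `file_count`) and the helper `parse_data_file_name` are assumptions made for illustration only; they are not confirmed by this report or by the Iceberg documentation.

```python
import re
from typing import NamedTuple, Optional


class DataFileName(NamedTuple):
    # All field meanings below are guesses based on the shape of the name.
    partition_id: str   # leading zero-padded number, e.g. "00499"
    task_id: str        # second number, e.g. "5849"
    operation_id: str   # UUID-shaped part
    file_count: str     # zero-padded suffix, e.g. "00001"
    extension: str      # e.g. "parquet"


# number - number - UUID - number . extension
_PATTERN = re.compile(
    r"^(\d+)-(\d+)-"
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})-"
    r"(\d+)\.(\w+)$"
)


def parse_data_file_name(name: str) -> Optional[DataFileName]:
    """Return the parts of a data file name, or None if it does not match."""
    match = _PATTERN.match(name)
    return DataFileName(*match.groups()) if match else None


parsed = parse_data_file_name(
    "00499-5849-056a590c-e5eb-4d92-99a4-1d0fdaf45e6c-00001.parquet"
)
print(parsed)
```

Files that share the first three parts but differ only in the last number would, under these assumptions, come from the same write and could be listed side by side for comparison.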
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

