kaushikranjan commented on issue #12704: URL: https://github.com/apache/iceberg/issues/12704#issuecomment-2779488001
FYI: we have been facing the same issue in our cluster as well.

We have an Iceberg table with the following schema:

    CREATE TABLE iceberg.user (
        customer_id VARCHAR(100) NOT NULL,
        id VARCHAR(100) NOT NULL,
        created_on TIMESTAMP(6) NOT NULL,
        updated_on TIMESTAMP(6) NOT NULL
    )
    WITH (
        format = 'PARQUET',
        format_version = 2,
        partitioning = ARRAY['bucket(customer_id, 20)'],
        sorted_by = ARRAY['id']
    );

customer_id and id are both GUID values and unique.

Here is the data distribution, which is fairly even across all partitions:

[Image: https://github.com/user-attachments/assets/9f9e312e-335f-41fa-a3f0-d0bca3f45a24]

When running compaction, we are also facing the same issue.
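For readers who want to reason about the bucket distribution without access to the cluster, here is a minimal sketch of how `bucket(customer_id, 20)` would assign string values to partitions. It assumes the bucket transform described in the Iceberg table spec (32-bit Murmur3 over the UTF-8 bytes, seed 0, masked to a non-negative value, modulo the bucket count). The helper names `murmur3_32`, `iceberg_bucket` and `bucket_histogram` are made up for this illustration, and the GUIDs are synthetic `uuid4` values, not data from the table above.

```python
import uuid
from collections import Counter


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3 (x86 variant); returns an unsigned 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    nblocks = len(data) // 4

    # Body: mix each 4-byte little-endian block into the hash state.
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
        h = ((h << 13) | (h >> 19)) & 0xFFFFFFFF
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF

    # Tail: mix in the remaining 1 to 3 bytes, if any.
    tail = data[4 * nblocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & 0xFFFFFFFF
        k = ((k << 15) | (k >> 17)) & 0xFFFFFFFF
        k = (k * c2) & 0xFFFFFFFF
        h ^= k

    # Finalization: fold in the length and avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def iceberg_bucket(value: str, num_buckets: int) -> int:
    """Bucket index for a string value, assuming the Iceberg spec's transform."""
    return (murmur3_32(value.encode("utf-8")) & 0x7FFFFFFF) % num_buckets


def bucket_histogram(values, num_buckets: int) -> list:
    """Number of values that land in each bucket, indexed by bucket number."""
    counts = Counter(iceberg_bucket(v, num_buckets) for v in values)
    return [counts.get(b, 0) for b in range(num_buckets)]


if __name__ == "__main__":
    # Synthetic GUIDs standing in for customer_id values (an assumption).
    sample_ids = [str(uuid.uuid4()) for _ in range(100_000)]
    for bucket, count in enumerate(bucket_histogram(sample_ids, 20)):
        print(f"bucket {bucket:2d}: {count}")
```

With unique, random GUIDs the per-bucket counts should come out close to one another, which would match the even distribution reported above. The sketch says nothing about file sizes or about how compaction behaves.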