manuzhang opened a new issue, #11397: URL: https://github.com/apache/iceberg/issues/11397
### Apache Iceberg version

main (development)

### Query engine

Flink

### Please describe the bug 🐞

https://github.com/apache/iceberg/actions/runs/11525609495/job/32088339013

```
java.lang.AssertionError: Expecting size of: [GenericDataFile{content=data, file_path=file:/tmp/junit5_hadoop_catalog-14525356505085993747/fc6fcd29-95ad-4477-8066-cd11f59528ba/default/t/data/ts_hour=2024-10-25-21/uuid_bucket=3/00002-0-2c5978f8-a440-4c4f-be50-86c438ff63b7-00017.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{ts_hour=480525, uuid_bucket=3}, record_count=11, file_size_in_bytes=1278, column_sizes=org.apache.iceberg.util.SerializableMap@184, value_counts=org.apache.iceberg.util.SerializableMap@1b, null_value_counts=org.apache.iceberg.util.SerializableMap@6, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@cb1481a6, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@d4f5ff3d, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=0, data_sequence_number=7, file_sequence_number=7}, GenericDataFile{content=data, file_path=file:/tmp/junit5_hadoop_catalog-14525356505085993747/fc6fcd29-95ad-4477-8066-cd11f59528ba/default/t/data/ts_hour=2024-10-25-21/uuid_bucket=2/00002-0-2c5978f8-a440-4c4f-be50-86c438ff63b7-00018.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{ts_hour=480525, uuid_bucket=2}, record_count=4, file_size_in_bytes=1138, column_sizes=org.apache.iceberg.util.SerializableMap@fe, value_counts=org.apache.iceberg.util.SerializableMap@12, null_value_counts=org.apache.iceberg.util.SerializableMap@6, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@969add9a, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@76639efe, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=0, data_sequence_number=7, file_sequence_number=7}, GenericDataFile{content=data, 
file_path=file:/tmp/junit5_hadoop_catalog-14525356505085993747/fc6fcd29-95ad-4477-8066-cd11f59528ba/default/t/data/ts_hour=2024-10-25-21/uuid_bucket=1/00000-0-0861575b-99f3-410e-bf09-19007260ee22-00017.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{ts_hour=480525, uuid_bucket=1}, record_count=7, file_size_in_bytes=1197, column_sizes=org.apache.iceberg.util.SerializableMap@137, value_counts=org.apache.iceberg.util.SerializableMap@f, null_value_counts=org.apache.iceberg.util.SerializableMap@6, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@8edc3f28, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@36d4a917, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=0, data_sequence_number=7, file_sequence_number=7}, GenericDataFile{content=data, file_path=file:/tmp/junit5_hadoop_catalog-14525356505085993747/fc6fcd29-95ad-4477-8066-cd11f59528ba/default/t/data/ts_hour=2024-10-25-21/uuid_bucket=0/00000-0-0861575b-99f3-410e-bf09-19007260ee22-00018.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{ts_hour=480525, uuid_bucket=0}, record_count=11, file_size_in_bytes=1280, column_sizes=org.apache.iceberg.util.SerializableMap@18a, value_counts=org.apache.iceberg.util.SerializableMap@1b, null_value_counts=org.apache.iceberg.util.SerializableMap@6, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@1c065f83, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@575e24b0, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=0, data_sequence_number=7, file_sequence_number=7}, GenericDataFile{content=data, file_path=file:/tmp/junit5_hadoop_catalog-14525356505085993747/fc6fcd29-95ad-4477-8066-cd11f59528ba/default/t/data/ts_hour=2024-10-25-21/uuid_bucket=2/00001-0-c258fabc-b05b-4d5a-9930-714bdb254951-00018.parquet, file_format=PARQUET, spec_id=0, 
partition=PartitionData{ts_hour=480525, uuid_bucket=2}, record_count=8, file_size_in_bytes=1218, column_sizes=org.apache.iceberg.util.SerializableMap@14e, value_counts=org.apache.iceberg.util.SerializableMap@1e, null_value_counts=org.apache.iceberg.util.SerializableMap@6, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@4f1d4eba, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@f741ceee, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=0, data_sequence_number=7, file_sequence_number=7}, GenericDataFile{content=data, file_path=file:/tmp/junit5_hadoop_catalog-14525356505085993747/fc6fcd29-95ad-4477-8066-cd11f59528ba/default/t/data/ts_hour=2024-10-25-21/uuid_bucket=1/00001-0-c258fabc-b05b-4d5a-9930-714bdb254951-00017.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{ts_hour=480525, uuid_bucket=1}, record_count=10, file_size_in_bytes=1266, column_sizes=org.apache.iceberg.util.SerializableMap@178, value_counts=org.apache.iceberg.util.SerializableMap@1c, null_value_counts=org.apache.iceberg.util.SerializableMap@6, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@ca85bf2b, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@123a5e0e, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=0, data_sequence_number=7, file_sequence_number=7}, GenericDataFile{content=data, file_path=file:/tmp/junit5_hadoop_catalog-14525356505085993747/fc6fcd29-95ad-4477-8066-cd11f59528ba/default/t/data/ts_hour=2024-10-25-21/uuid_bucket=1/00000-0-0861575b-99f3-410e-bf09-19007260ee22-00019.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{ts_hour=480525, uuid_bucket=1}, record_count=1, file_size_in_bytes=1084, column_sizes=org.apache.iceberg.util.SerializableMap@a1, value_counts=org.apache.iceberg.util.SerializableMap@5, 
null_value_counts=org.apache.iceberg.util.SerializableMap@6, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@b38647c, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@b38647c, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=0, data_sequence_number=7, file_sequence_number=7}, GenericDataFile{content=data, file_path=file:/tmp/junit5_hadoop_catalog-14525356505085993747/fc6fcd29-95ad-4477-8066-cd11f59528ba/default/t/data/ts_hour=2024-10-25-21/uuid_bucket=1/00001-0-c258fabc-b05b-4d5a-9930-714bdb254951-00019.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{ts_hour=480525, uuid_bucket=1}, record_count=2, file_size_in_bytes=1099, column_sizes=org.apache.iceberg.util.SerializableMap@db, value_counts=org.apache.iceberg.util.SerializableMap@4, null_value_counts=org.apache.iceberg.util.SerializableMap@6, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@961f8456, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@b4913cb9, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=0, data_sequence_number=7, file_sequence_number=7}, GenericDataFile{content=data, file_path=file:/tmp/junit5_hadoop_catalog-14525356505085993747/fc6fcd29-95ad-4477-8066-cd11f59528ba/default/t/data/ts_hour=2024-10-25-21/uuid_bucket=3/00002-0-2c5978f8-a440-4c4f-be50-86c438ff63b7-00019.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{ts_hour=480525, uuid_bucket=3}, record_count=1, file_size_in_bytes=1084, column_sizes=org.apache.iceberg.util.SerializableMap@a1, value_counts=org.apache.iceberg.util.SerializableMap@5, null_value_counts=org.apache.iceberg.util.SerializableMap@6, nan_value_counts=org.apache.iceberg.util.SerializableMap@0, lower_bounds=org.apache.iceberg.SerializableByteBufferMap@49f6be09, upper_bounds=org.apache.iceberg.SerializableByteBufferMap@49f6be09, key_metadata=null, 
split_offsets=[4], equality_ids=null, sort_order_id=0, data_sequence_number=7, file_sequence_number=7}] to be less than or equal to 7 but was 9
	at org.apache.iceberg.flink.sink.TestFlinkIcebergSinkRangeDistributionBucketing.testParallelism(TestFlinkIcebergSinkRangeDistributionBucketing.java:222)
	at org.apache.iceberg.flink.sink.TestFlinkIcebergSinkRangeDistributionBucketing.testBucketNumberHigherThanWriterParallelismNotDivisible(TestFlinkIcebergSinkRangeDistributionBucketing.java:166)
[TaskExecutorFileMergingManager shutdown hook] INFO org.apache.flink.runtime.state.TaskExecutorFileMergingManager - Shutting down TaskExecutorFileMergingManager.
```

### Willingness to contribute

- [ ] I can contribute a fix for this bug independently
- [X] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org