nothing-go-reade commented on issue #13973: URL: https://github.com/apache/iceberg/issues/13973#issuecomment-3247912244
The write scenario at the time was as follows:

0. Prerequisites: the table has non-null (required) field constraints, but some of the fields I wrote were null.
1. I converted the data into `GenericRecord`s and called `PartitionedFanoutWriter`, which uses `RollingFileWriter` to roll data files. During this process an exception occurred, with no hint about where the problem arose; the caller only noticed that those rows were never written.
2. After checking the underlying logic, the problem appears to be in `BaseRollingWriter` during file rolling: `this.currentRows++` increments the current file's row count, but the data never actually lands in the file because of the null fields.
3. I also tried reloading the Parquet data files locally with `call local.system.add_files`, but I still could not scan the whole table.

In summary, I believe the issue is in my write path: I was not checking whether the data satisfies the table constraints. I should perform the necessary validation before writing, or consider another approach.

<img width="506" height="687" alt="Image" src="https://github.com/user-attachments/assets/2cabc54b-7d53-4854-8470-670535dca64f" /> <img width="536" height="606" alt="Image" src="https://github.com/user-attachments/assets/f296ac12-7d4a-4385-a5f5-e774528043e4" />

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
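A minimal, library-free sketch of the pre-write validation suggested above. The schema map and field names here are illustrative assumptions, not Iceberg API; against Iceberg itself the idea would be to iterate the table schema's columns and reject a record whose required fields are null before it ever reaches the fanout writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a pre-write constraint check. With the real
// Iceberg API, the schema would come from the table, and the loop would walk
// its required columns and compare each against the record's field value.
public class RequiredFieldCheck {

    // Illustrative schema: field name -> required (non-null) flag.
    static final Map<String, Boolean> SCHEMA = Map.of(
            "id", true,
            "name", true,
            "note", false);

    // Returns the names of required fields that are null or absent,
    // so the caller can reject the row before handing it to the writer
    // (and before currentRows is incremented for data that never lands).
    public static List<String> missingRequired(Map<String, Object> record) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Boolean> field : SCHEMA.entrySet()) {
            if (field.getValue() && record.get(field.getKey()) == null) {
                missing.add(field.getKey());
            }
        }
        return missing;
    }
}
```

A non-empty result means the row violates a non-null constraint and should be routed to an error sink (or fail fast with a clear message) instead of being silently counted and dropped.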
