Re: [I] Failed to assign splits due to the serialized split size [iceberg]

via GitHub Fri, 05 Jan 2024 23:21:42 -0800


pvary commented on issue #9410:
URL: https://github.com/apache/iceberg/issues/9410#issuecomment-1879580902


   @javrasya: A table with 14 columns should not cause any issues, and the 
default stats should not cause issues either.
   
   I misread the code earlier: combined splits cannot be the cause either, 
because we serialize them one by one in a loop, and the problem is with just 
one of them.
   
   My current theory is that we need to check `FileScanTaskParser.toJson` to 
understand what is happening:
   
https://github.com/apache/iceberg/blob/2101ac2e5528049688d4ce3ea2b4db861ea3c78b/core/src/main/java/org/apache/iceberg/FileScanTaskParser.java#L48
   
   Could it be that you have multiple deletes for the specific split, which 
makes the serialized split too big?
   
https://github.com/apache/iceberg/blob/2101ac2e5528049688d4ce3ea2b4db861ea3c78b/core/src/main/java/org/apache/iceberg/FileScanTaskParser.java#L69-L75
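   To test this theory, one option is to measure the serialized size of each 
task on its own and see which one stands out. The sketch below is only an 
illustration and uses no Iceberg classes: `fakeTaskJson` is a hypothetical 
stand-in for the string that `FileScanTaskParser.toJson` would produce, its 
JSON layout is made up, and `ASSUMED_LIMIT_BYTES` is an arbitrary placeholder 
rather than the real limit of any serializer.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

public class SplitSizeCheck {
    // Assumption: placeholder limit for illustration only, not taken from Iceberg or Flink.
    static final int ASSUMED_LIMIT_BYTES = 64 * 1024;

    /** Returns the UTF-8 encoded size, in bytes, of one serialized task. */
    static int serializedSize(String taskJson) {
        return taskJson.getBytes(StandardCharsets.UTF_8).length;
    }

    /**
     * Checks every task separately (mirroring a one-by-one serialization loop)
     * and returns the index of the first task above the limit, or -1 if none is.
     */
    static int firstOversized(List<String> taskJsons, int limitBytes) {
        for (int i = 0; i < taskJsons.size(); i++) {
            if (serializedSize(taskJsons.get(i)) > limitBytes) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Hypothetical stand-in for a serialized task: one data file plus
     * `deleteFileCount` delete-file entries. This is NOT the real JSON format.
     */
    static String fakeTaskJson(int deleteFileCount) {
        StringBuilder sb = new StringBuilder("{\"data-file\":\"data.parquet\",\"delete-files\":[");
        for (int i = 0; i < deleteFileCount; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"file-path\":\"delete-").append(i).append(".parquet\"}");
        }
        return sb.append("]}").toString();
    }

    public static void main(String[] args) {
        // Three made-up tasks: no deletes, many deletes, a couple of deletes.
        List<String> tasks = List.of(fakeTaskJson(0), fakeTaskJson(5000), fakeTaskJson(2));
        for (int i = 0; i < tasks.size(); i++) {
            System.out.println("task " + i + ": " + serializedSize(tasks.get(i)) + " bytes");
        }
        System.out.println("first oversized task index: "
            + firstOversized(tasks, ASSUMED_LIMIT_BYTES));
    }
}
```

   In a real job, the strings would come from the actual tasks of the failing 
split instead of `fakeTaskJson`. If one task is far larger than the rest and it 
carries many delete entries, that would support the theory above.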


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

