Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

via GitHub Fri, 12 Jan 2024 22:03:52 -0800


pvary commented on PR #9464:
URL: https://github.com/apache/iceberg/pull/9464#issuecomment-1890335313


   Let's take a step back before rushing to a solution. Here are the problems 
we have to solve:
   - Serializing long Strings
   - Serializing non-ASCII characters, such as Chinese characters
   - Backward compatibility, so that old serialized splits can still be read
   - Performance, which becomes even more important because we will have long 
buffers and potentially many splits
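
   To make the first two items concrete, here is a minimal sketch, not the 
PR's implementation. It assumes the size limit in question is the one in 
`java.io.DataOutputStream.writeUTF`, which rejects strings whose encoded form 
exceeds 65535 bytes. The class `LongStringCodec` and its methods are 
hypothetical names used only for illustration; the sketch writes a 4-byte 
length followed by plain UTF-8 bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
public class LongStringCodec {

  // Writes a 4-byte length followed by the standard UTF-8 bytes of the value.
  public static byte[] write(String value) {
    byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
    ByteArrayOutputStream buffer = new ByteArrayOutputStream(Integer.BYTES + utf8.length);
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      out.writeInt(utf8.length);
      out.write(utf8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  // Reads back a value produced by write().
  public static String read(byte[] serialized) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(serialized))) {
      byte[] utf8 = new byte[in.readInt()];
      in.readFully(utf8);
      return new String(utf8, StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Returns false when writeUTF rejects the value (it throws
  // UTFDataFormatException, a subclass of IOException, for oversized input).
  public static boolean writeUtfSucceeds(String value) {
    try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
      out.writeUTF(value);
      return true;
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // 100,000 Chinese characters, each 3 bytes in UTF-8.
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < 100_000; i++) {
      builder.append('中');
    }
    String longText = builder.toString();

    System.out.println("writeUTF accepts it: " + writeUtfSucceeds(longText));
    System.out.println("round trip matches:  " + read(write(longText)).equals(longText));
  }
}
```

   The 4-byte prefix removes the 65535-byte ceiling, and encoding through 
`StandardCharsets.UTF_8` handles any character without a hand-written encoder. 
It does change the wire format, which is where the compatibility item comes in.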
   
   My current ideas:
   - Compatibility: We might have to introduce SerializerV3
   - Non-ASCII characters: Are 2 bytes enough for every character? I think this 
needs some research
   - Performance: Reuse buffers where possible
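
   As a starting point for the 2-byte question, here is a small sketch that 
can be run to check character widths. A Java `char` is one 16-bit UTF-16 code 
unit, so it covers the Basic Multilingual Plane, which includes the most common 
Chinese characters. Code points above U+FFFF (for example CJK Extension B) take 
two `char`s, and UTF-8 uses 1 to 4 bytes per code point. The class 
`CharWidths` is a hypothetical name, and which encoding the serializer should 
use is left open here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
public class CharWidths {

  // Number of 16-bit UTF-16 code units (Java chars) in the string.
  public static int utf16Units(String s) {
    return s.length();
  }

  // Number of Unicode code points in the string.
  public static int codePoints(String s) {
    return s.codePointCount(0, s.length());
  }

  // Number of bytes the string occupies in standard UTF-8.
  public static int utf8Bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    String ascii = "a";
    String commonCjk = "中"; // U+4E2D, inside the Basic Multilingual Plane
    String cjkExtensionB = new String(Character.toChars(0x20000)); // outside the BMP

    for (String s : new String[] {ascii, commonCjk, cjkExtensionB}) {
      System.out.printf(
          "U+%X: %d code point(s), %d UTF-16 unit(s), %d UTF-8 byte(s)%n",
          s.codePointAt(0), codePoints(s), utf16Units(s), utf8Bytes(s));
    }
  }
}
```

   So 2 bytes per character is enough for the common Chinese characters but not 
for every code point; a fixed 2-byte encoding would need surrogate pairs for 
the rest.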
   
   Thanks for starting the work on this @javrasya !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

