Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

via GitHub Sat, 13 Jan 2024 01:13:54 -0800


javrasya commented on PR #9464:
URL: https://github.com/apache/iceberg/pull/9464#issuecomment-1890390386

Good catches @pvary, thank you. What if we take full inspiration from
writeUTF and write our own writer that supports longer JSON? Btw, the reason
writeUTF limits the size to 65K is that the first 2 bytes of the serialized
value hold the encoded length as an unsigned short, which can be at most
65,535. I have introduced my own variant of writeUTF and called it
writeLongUTF/readLongUTF. It writes the length prefix as an int, which is
4 bytes, instead of an unsigned short.
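
To make the idea concrete, here is a minimal sketch of a 4-byte length prefix. It is not the code from the linked branch: the class name `LongUtfSketch` and the method bodies are assumptions for illustration, and it encodes with standard UTF-8 rather than the modified UTF-8 that `writeUTF` uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LongUtfSketch {

  // Hypothetical helper: writes a 4-byte int length, then the UTF-8 bytes.
  // Assumption: standard UTF-8 is used here, not writeUTF's modified UTF-8.
  public static void writeLongUTF(DataOutputStream out, String value) throws IOException {
    byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
    out.writeInt(bytes.length);
    out.write(bytes);
  }

  // Hypothetical counterpart: reads the 4-byte length, then that many bytes.
  public static String readLongUTF(DataInputStream in) throws IOException {
    int length = in.readInt();
    if (length < 0) {
      throw new IOException("Negative length prefix: " + length);
    }
    byte[] bytes = new byte[length];
    in.readFully(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Builds a string of n copies of c, standing in for a large JSON payload.
  public static String repeat(char c, int n) {
    char[] chars = new char[n];
    Arrays.fill(chars, c);
    return new String(chars);
  }

  // Returns true if DataOutputStream.writeUTF rejects the value as too long.
  public static boolean writeUtfRejects(String value) throws IOException {
    try {
      new DataOutputStream(new ByteArrayOutputStream()).writeUTF(value);
      return false;
    } catch (UTFDataFormatException e) {
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    String big = repeat('x', 70_000); // longer than the 65,535-byte writeUTF limit
    System.out.println("writeUTF rejects it: " + writeUtfRejects(big));

    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    writeLongUTF(new DataOutputStream(buffer), big);
    String back =
        readLongUTF(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
    System.out.println("round trip ok: " + back.equals(big));
  }
}
```

The only structural difference from `writeUTF` in this sketch is the width of the length prefix, so the maximum payload grows from 65,535 bytes to whatever fits in a Java `int`.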

Do you mind taking a look at [those changes
here](https://github.com/apache/iceberg/compare/main...javrasya:iceberg:issue-9410-implement-custom-utf-serde)
and letting me know what you think? I didn't want to update this PR directly
without talking to you about it first. If you think it is a good idea, I can
merge it into this branch first and we can continue the discussion here.

However, it is not compatible with V2, since V2 uses the initial 2 bytes to
indicate the length. Introducing V3 is a good idea, as you suggested. But I am
not really sure how we would distinguish a split serialized with V2 from one
serialized with V3 🤔 Do you know how this was done from V1 to V2? Can you
help me there?
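
As a small illustration of the incompatibility, the sketch below writes a string with the standard `writeUTF` (2-byte prefix) and then reads the prefix back as a 4-byte int, the way a reader expecting the longer prefix would. The class and method names are hypothetical; only `java.io` calls are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class PrefixMismatchSketch {

  // Writes value with the 2-byte-prefixed writeUTF, then returns the "length"
  // a reader would see if it consumed the first 4 bytes as an int instead.
  // The value must encode to at least 2 bytes so that 4 bytes are available.
  public static int misreadLength(String value) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      out.writeUTF(value);
    }
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
      return in.readInt();
    }
  }

  public static void main(String[] args) throws IOException {
    // "abc" is serialized as 00 03 61 62 63; read as an int, the first four
    // bytes are 0x00036162, which is nowhere near the real length of 3.
    System.out.println(misreadLength("abc"));
  }
}
```

The reader ends up treating part of the payload as length bytes, which is why the two layouts need to be told apart by something outside the payload itself.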

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
