bryanck opened a new pull request, #8834:
URL: https://github.com/apache/iceberg/pull/8834

   This PR fixes a critical bug in reading split offsets from manifests. A 
change in https://github.com/apache/iceberg/pull/8336 added caching of the 
offsets collection in `BaseFile` to avoid reallocation. However, the Avro 
reader reuses the same `BaseFile` object across entries, so the cached 
collection is built only from the first entry's offsets and is then returned 
for all other entries.
   
   cc @aokolnychyi @RussellSpitzer @rdblue @danielcweeks This can result in 
corrupted metadata, as rewrite manifests will read in the invalid offsets and 
then write them back.
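
   The reuse problem described above can be illustrated with a minimal, 
self-contained sketch. Everything here is hypothetical: `ReusableFile`, 
`readAll`, and the `invalidateCache` flag are stand-ins invented for 
illustration and are not Iceberg's actual `BaseFile` or Avro reader code. 
Clearing the cache when the field is overwritten is one possible way to avoid 
the stale result, and not necessarily the change made in this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class SplitOffsetsReuseSketch {

  /** Hypothetical stand-in for a file record that a reader reuses across entries. */
  static class ReusableFile {
    private long[] splitOffsets;
    // Cached read-only view, built lazily to avoid reallocating on every call.
    private List<Long> cachedOffsets;

    /** Called by the (hypothetical) reader each time it populates the reused object. */
    void setSplitOffsets(long[] offsets, boolean invalidateCache) {
      this.splitOffsets = offsets;
      if (invalidateCache) {
        // Assumed mitigation: drop the cached view whenever the raw field changes.
        this.cachedOffsets = null;
      }
    }

    List<Long> splitOffsets() {
      if (cachedOffsets == null && splitOffsets != null) {
        cachedOffsets = Collections.unmodifiableList(
            Arrays.stream(splitOffsets).boxed().collect(Collectors.toList()));
      }
      return cachedOffsets;
    }
  }

  /** Simulates a reader that fills one reused object per entry and records what it sees. */
  static List<List<Long>> readAll(long[][] entries, boolean invalidateCache) {
    ReusableFile reused = new ReusableFile();
    List<List<Long>> seen = new ArrayList<>();
    for (long[] entry : entries) {
      reused.setSplitOffsets(entry, invalidateCache);
      seen.add(reused.splitOffsets());
    }
    return seen;
  }

  public static void main(String[] args) {
    long[][] entries = {{4L, 100L}, {4L, 250L, 900L}};
    // Without invalidation, the second entry reports the first entry's offsets.
    System.out.println(readAll(entries, false)); // [[4, 100], [4, 100]]
    // With invalidation, each entry reports its own offsets.
    System.out.println(readAll(entries, true)); // [[4, 100], [4, 250, 900]]
  }
}
```

   In this sketch the stale list is returned silently, which mirrors why a 
later rewrite could persist the wrong offsets without any error being raised.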
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

