Fokko commented on PR #8625:
URL: https://github.com/apache/iceberg/pull/8625#issuecomment-1813465479

   > do we need any changes in readers to benefit from this? If not, can we run 
some existing benchmarks to showcase the read improvement is as we anticipate?
   
   Since we use the decoders from Avro itself, we don't need any changes. The 
relevant code is here: 
https://github.com/apache/avro/blob/main/lang/java/avro/src/main/java/org/apache/avro/io/BinaryDecoder.java#L398-L424
   
   This speeds up reads considerably whenever we don't need the 
`map[int, bytes]` fields that we use to store statistics. With the block size 
written, the decoder can jump over a whole block in one step instead of 
skipping each key-value pair individually.
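
   To illustrate why the block size matters, here is a minimal sketch that 
follows the block layout from the Avro specification (a negative item count 
followed by the block's byte size). It does not use the Avro library, and it 
is not the code from this PR: the class `BlockSkipSketch` and the methods 
`encode` and `skip` are hypothetical names, and the items are simplified to an 
int key plus a bytes value.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical illustration only; mirrors the Avro block layout, not Avro's classes.
class BlockSkipSketch {

    // Zig-zag varint encoding, as Avro uses for int and long values.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    // Writes all items as a single block, followed by the 0 end marker.
    // With withBlockSize, the count is negated and the byte size follows it.
    static byte[] encode(int[] keys, byte[][] values, boolean withBlockSize) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (int i = 0; i < keys.length; i++) {
            writeLong(body, keys[i]);
            writeLong(body, values[i].length);
            body.write(values[i], 0, values[i].length);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (keys.length > 0) {
            if (withBlockSize) {
                writeLong(out, -keys.length);
                writeLong(out, body.size());
            } else {
                writeLong(out, keys.length);
            }
            out.write(body.toByteArray(), 0, body.size());
        }
        writeLong(out, 0);
        return out.toByteArray();
    }

    // Skips the whole collection and returns how many items had to be
    // walked one by one because their block carried no byte size.
    static int skip(ByteBuffer in) {
        int itemsWalked = 0;
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {
                // Block size is present: jump over the block in one step.
                long size = readLong(in);
                in.position(in.position() + (int) size);
            } else {
                // No block size: every key and value must be stepped over.
                for (long i = 0; i < count; i++) {
                    readLong(in);                  // key
                    int len = (int) readLong(in);  // value length
                    in.position(in.position() + len);
                    itemsWalked++;
                }
            }
        }
        return itemsWalked;
    }
}
```

   In this sketch, `skip` walks every item when the data was encoded without a 
block size, and walks none when the block size is present. In both cases it 
ends up at the same position.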
   
   > Question. Aren't we using DataFileWriter from Avro in our 
AvroFileAppender? If so, how is this PR affecting it? Won't we still use direct 
encoders there?
   
   This is a good question. The goal of this PR is to write the block sizes for 
the manifests. @rustyconover any thoughts on this?
   
   >  Also, nice work on a new encoder in Avro, @Fokko! Do you know when will 
that be available?
   
   Thanks! I can check in with the Avro community to see if we can do a release.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

