felixscherz commented on PR #650:
URL: https://github.com/apache/iceberg-python/pull/650#issuecomment-2094148537

   Hi, I finally had some time to continue working on this.
   
   Based on your suggestions, @geruh, I added a `tell` method to the 
`OutputStream` protocol that returns the number of bytes written to the stream.
   I then added `__len__` to `AvroOutputFile`, which delegates to either 
`OutputFile` or `OutputStream` to get the number of bytes written, depending on 
whether the stream is closed or not.
   Finally, I extended `ManifestWriter` with a `__len__` method that delegates to 
`AvroOutputFile`.
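
   As a rough illustration of that delegation, here is a minimal sketch. It uses simplified stand-in classes, not the code in this PR: `InMemoryOutputFile`, `SketchAvroWriter`, and `record_size` are hypothetical names, and the sketch assumes the file-level length is the one used once the stream has been closed.

```python
import io
from typing import Protocol


class OutputStream(Protocol):
    """Simplified stand-in for the stream protocol (only what is used here)."""

    def write(self, b: bytes) -> int: ...

    def tell(self) -> int: ...

    def close(self) -> None: ...


class InMemoryOutputFile:
    """Hypothetical output file that knows its size once writing is done."""

    def __init__(self) -> None:
        self._size = 0

    def create(self) -> OutputStream:
        # io.BytesIO structurally satisfies the protocol above.
        return io.BytesIO()

    def record_size(self, size: int) -> None:
        # Hypothetical helper: a real file system would report this itself.
        self._size = size

    def __len__(self) -> int:
        return self._size


class SketchAvroWriter:
    """Hypothetical writer whose __len__ picks the source based on stream state."""

    def __init__(self, output_file: InMemoryOutputFile) -> None:
        self.output_file = output_file
        self.stream = output_file.create()
        self.closed = False

    def write(self, payload: bytes) -> None:
        self.stream.write(payload)

    def close(self) -> None:
        # Capture the final position before the stream becomes unusable.
        self.output_file.record_size(self.stream.tell())
        self.stream.close()
        self.closed = True

    def __len__(self) -> int:
        if self.closed:
            # Assumption: a closed stream can no longer answer tell().
            return len(self.output_file)
        return self.stream.tell()


writer = SketchAvroWriter(InMemoryOutputFile())
writer.write(b"abc")
size_while_open = len(writer)
writer.close()
size_after_close = len(writer)
```

   The branch in `__len__` matters because `tell` on a closed `io.BytesIO` raises `ValueError`, so the size has to come from somewhere else after closing.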
   
   I initially tried to extend `OutputStream` with `__len__`, until I realized 
that both `FileIO` implementations, `fsspec` and `pyarrow`, provide 
`OutputStream` implementations that support `tell` but not `__len__`.
   
   If we wanted to go with `__len__` instead of simply using `tell`, we might 
have to implement custom `FsspecOutputStream` and `PyarrowOutputStream` classes 
that implement `__len__`. This might well be the cleaner approach, but it would 
introduce a bit more abstraction.
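
   To give a feel for the extra abstraction, a wrapper along these lines could add `__len__` on top of any stream that already has `tell`. This is only a sketch: `LenOutputStream` is a hypothetical name, and `io.BytesIO` stands in for the real `fsspec` or `pyarrow` stream.

```python
import io


class LenOutputStream:
    """Hypothetical wrapper that adds __len__ to a stream exposing tell()."""

    def __init__(self, inner) -> None:
        self._inner = inner

    def write(self, b: bytes) -> int:
        return self._inner.write(b)

    def tell(self) -> int:
        return self._inner.tell()

    def close(self) -> None:
        self._inner.close()

    def __len__(self) -> int:
        # For a sequentially written stream, the current position
        # equals the number of bytes written so far.
        return self._inner.tell()

    def __enter__(self) -> "LenOutputStream":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        self.close()


# io.BytesIO is a stand-in for the stream a FileIO implementation returns.
with LenOutputStream(io.BytesIO()) as stream:
    stream.write(b"hello")
    bytes_written = len(stream)
```

   One such class per `FileIO` implementation (or a single generic wrapper like this one) would be needed, which is the added layer mentioned above.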
   
   What do you think?
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

