Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

via GitHub Fri, 26 Jan 2024 06:11:16 -0800


Fokko commented on issue #172:
URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1912130317


   Thanks @tustvold for raising this and please don't hesitate to open an issue 
or PR. 
   
   > For example Spark has had a very hard time getting a performant S3 
integration, with proper vectored IO only being added to OSS Spark 
https://github.com/apache/arrow-datafusion/issues/2205#issuecomment-1100069800.
   
   This is why the Iceberg Java implementation ships with its own vectorized 
Parquet reader :)
   
   It looks to me like `object_store` and FileIO aim to solve the same problem. 
Iceberg was designed from the start to work on object stores rather than on 
filesystems. Like `object_store`, the FileIO concept is very opinionated. 
Many people are still on HDFS, so that is supported as well, which works 
because filesystems offer stronger guarantees than object stores. If you want 
to learn more about FileIO, 
[this](https://tabular.io/blog/iceberg-fileio-cloud-native-tables/) is a good 
primer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
