Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-24 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2496527112 Hi, @BlakeOrth Thanks for trying this, and yes it's quite close to what's in my mind. So the question is that we can't keep the return type of `reader`/`writer` function? I

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-24 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2496515292 > I _think_ that should work, the DataFusion wrapper can just hook the iceberg metadata operations into via that StorageProvider trait, and then use the DataFusion machinery

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-19 Thread via GitHub
BlakeOrth commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2486497892 @liurenjie1024 I have taken some time to explore an implementation based on your suggestion above, just as I did for the user extensible `Storage` proposed earlier. Unfortunatel

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-18 Thread via GitHub
tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2483162776 I _think_ that should work, the DataFusion wrapper can just hook the iceberg metadata operations into via that StorageProvider trait, and then use the DataFusion machinery direct

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-18 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2483125592 > > point about the existing Datafusion machinery > > DataFusion provides an [ObjectStoreRegistry](https://docs.rs/datafusion/latest/datafusion/datasource/object_store

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-18 Thread via GitHub
tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2482458320 > point about the existing Datafusion machinery DataFusion provides an [ObjectStoreRegistry](https://docs.rs/datafusion/latest/datafusion/datasource/object_store/trait.Obje

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-17 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2481838237 > Sounds like a good compromise, did you have any thoughts on how this might integrate with the existing Datafusion machinery? I'm mainly thinking for configuration, so user

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-17 Thread via GitHub
tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2481271106 Sounds like a good compromise, did you have any thoughts on how this might integrate with the existing Datafusion machinery? I'm mainly thinking for configuration, so users get a

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-17 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2481267603 Thanks for everyone joining the discussion here. I think we have reached some conclusions here: 1. We need to support different storages, like s3, google cloud storage.

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
BlakeOrth commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2474390321 Wow, I did not expect my comment digging up a nearly year-old issue to result in this much discussion! I likely have less skin in the game than any of the maintainers here

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
alamb commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2473264890 One potential API that would not cause disruption to the current consumers would be to add an extension trait like we do with [UserDefinedLogicalNode](https://docs.rs/datafusion/late

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
Xuanwo commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2473869889 Hi, the `FileIOProvider`, `FileIOExtension`, and `FileIO` trait all look good to me. We can initiate a design to gradually implement them. I can imagine having crates like `iceberg

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
JanKaul commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2473732060 To increase the ergonomics one could introduce an additional trait that would be implemented on top of `ObjectStore`, like so: ```rust #[async_trait] pub trait IcebergStor

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
gruuya commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2473675530 +1 for making `FileIO` a trait (if not using `ObjectStore` directly). There's an analogy here with delta-rs, in that it also has its own high-level abstraction called [LogSt

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
JanKaul commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2473365948 My two cents: I think it would make sense to list all requirements. The main requirements I can think of are: 1. Abstraction over different object stores 2. Harmon

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2473277943 > while making FileIO trait quite limited due to object safe in rust The transformation from the current abstraction to an object safe version _could_ be done largely mecha

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2473145048 I agree that we should make `FileIO` pluggable and allow users to choose. But I have a concern about making `FileIO` a trait, since this forces users to use `Arc` everywhere while

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2472934593 > present FileIO as a struct capable of parsing configurations and being used universally without concerns about lifecycle or generic parameters What about adding a FileIOP

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2472807387 > instance that implements the FileIO trait? At the moment FileIO is not a trait but a struct - https://docs.rs/iceberg/latest/iceberg/io/struct.FileIO.html I person

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
Fokko commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2472808925 To add some context here. As mentioned, `FileIO` should be pluggable. Right now, I've noticed that the `FileIO` is an `impl`, where I would [expect a `trait`](https://github.com/apa

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
Xuanwo commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2472846826 Yes, I think it should be a trait as well. Its current form has historical reasons, and I would be happy to refine it into the shape we envisioned. If someone wants to work in this

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
Xuanwo commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2472855355 By the way, we will need to present `FileIO` as a struct capable of parsing configurations and being used universally without concerns about lifecycle or generic parameters. The tr

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-13 Thread via GitHub
alamb commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2472613139 @BlakeOrth I don't fully understand what you are trying to accomplish. If the goal is to use iceberg-rust without OpenDAL, I thought it was possible to implement an adapter/

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-12 Thread via GitHub
Xuanwo commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2472488509 Hi, thank you, @BlakeOrth, for bringing this up. It's part of our community's philosophy not to choose a winner. All implementations of Iceberg don't directly expose the und

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-11-12 Thread via GitHub
BlakeOrth commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-2471199556 Hi, I'm a relatively new user of the `iceberg-rust` crate(s) and was hoping I could bring this discussion back to get some movement here. While Iceberg is a relatively new ecosys

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-26 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1912898101 > Would you be open to a PR to allow using either OpenDAL or object_store, along with corresponding feature flags, or would you prefer to not complicate matters at this time

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-26 Thread via GitHub
tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1912160916 > It looks to me that object_store and FileIO aim to solve the same problem That's awesome, thank you for the link. That is exactly what object_store is, an opinionated abs

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-26 Thread via GitHub
Fokko commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1912130317 Thanks @tustvold for raising this and please don't hesitate to open an issue or PR. > For example Spark has had a very hard time getting a performant S3 integration, with pr

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-26 Thread via GitHub
tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1911993011 > I think if users are judicious and provide sufficient hints, and buffer the reads the performance difference will be negligible. If primarily performing sequential IO I

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-26 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1911945629 Thanks everyone for this very nice discussion. > I'd be happy to help out with this, if you're open to contributions, both myself and my employer are very interested i

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-26 Thread via GitHub
alamb commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1911845190 Thank you all -- this is a great conversation. > I entirely agree, I guess I was more suggesting that the IO abstraction mirror object_store as this is what both the upstream

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-26 Thread via GitHub
tustvold commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1911697303 Thank you both for the responses. > In iceberg's design, all file ios are hidden under the FileIO interface, and the backends, i.e. OpenDAL or object_store are not directly

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-25 Thread via GitHub
Xuanwo commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1911446953 Hi @tustvold, thank you for initiating this discussion! I will do my best to offer a multifaceted response with different hats. ## Put iceberg-rust developers hat on As

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-25 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1911328155 cc @Xuanwo -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-25 Thread via GitHub
liurenjie1024 commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1911328016 Hi, @tustvold @alamb Thanks for this proposal and write up, [object_store](https://crates.io/crates/object_store) looks great to me! In iceberg's design, all file ios

Re: [I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-25 Thread via GitHub
alamb commented on issue #172: URL: https://github.com/apache/iceberg-rust/issues/172#issuecomment-1910847728 cc @liurenjie1024

[I] Consider Using object_store as IO Abstraction [iceberg-rust]

2024-01-25 Thread via GitHub
tustvold opened a new issue, #172: URL: https://github.com/apache/iceberg-rust/issues/172 I have debated filing this ticket for a while, but largely held off as I wasn't sure how well it would be received, especially as I am acutely aware that this crate currently makes use of OpenDAL and @