tustvold commented on issue #1314:
URL: https://github.com/apache/iceberg-rust/issues/1314#issuecomment-2906690237

   > Also another concern is that, if we delegate a core abstraction like 
FileIO to object_store, we may experience unnecessary breaking changes 
introduced when object_store evolves.
   
   FWIW we try very hard to avoid these now, aiming for ~2 breaking releases 
per year. I would eventually like to release a v1.0.0, but that is unlikely to 
be this year.
   
   > We should define small traits rather than a large trait. For example, we 
could define traits like FileIO, SupportsBulkOperations, etc., so that 
concrete implementations could choose which traits to implement according to 
their capability.
   
   At least historically this sort of trait composition hasn't worked very well 
with trait objects. In particular, you can't write `Box<dyn A + B>`; only 
auto traits can be added as extra bounds. You _could_ get around this by 
defining `trait AB: A + B`, and Rust versions >= 1.86 can then 
[upcast](https://blog.rust-lang.org/2025/04/03/Rust-1.86.0/#trait-upcasting) 
from `dyn AB` to `dyn A` or `dyn B`, but this would quickly get unmanageable 
as the number of traits grows.
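
   A minimal sketch of the supertrait workaround described above. The trait and type names (`FileRead`, `BulkDelete`, `FileReadBulkDelete`, `MemoryStore`) are hypothetical and are not taken from iceberg-rust or object_store; the upcast in `main` assumes Rust 1.86 or newer.

```rust
// Hypothetical capability traits, named only for illustration.
trait FileRead {
    fn read(&self, path: &str) -> String;
}

trait BulkDelete {
    fn delete_all(&self, paths: &[&str]) -> usize;
}

// `Box<dyn FileRead + BulkDelete>` does not compile: only auto traits such
// as `Send` and `Sync` may be added to a trait object. A named supertrait
// that combines both capabilities is the usual workaround.
trait FileReadBulkDelete: FileRead + BulkDelete {}

// Blanket impl: any type with both capabilities gets the combined trait.
impl<T: FileRead + BulkDelete> FileReadBulkDelete for T {}

struct MemoryStore;

impl FileRead for MemoryStore {
    fn read(&self, path: &str) -> String {
        format!("contents of {path}")
    }
}

impl BulkDelete for MemoryStore {
    fn delete_all(&self, paths: &[&str]) -> usize {
        paths.len()
    }
}

// A consumer that only needs the narrower capability.
fn read_one(store: &dyn FileRead, path: &str) -> String {
    store.read(path)
}

fn main() {
    let store: Box<dyn FileReadBulkDelete> = Box::new(MemoryStore);
    // Supertrait methods are callable directly on the combined object.
    println!("deleted {}", store.delete_all(&["a", "b"]));
    // Coercing `&dyn FileReadBulkDelete` to `&dyn FileRead` is trait
    // upcasting, which needs Rust 1.86 or newer.
    println!("{}", read_one(&*store, "data/file.parquet"));
}
```

   Every combination of capabilities that callers want to hold as a trait object needs its own named trait like `FileReadBulkDelete`, which is where the growth problem comes from.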
   
   Whilst returning `Err(Error::Unimplemented)` is not as nice as getting a 
compile error, it is the best approach we've been able to devise.
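
   As a rough illustration of that trade-off, here is a sketch of a single object-safe trait where an optional capability reports a runtime error. The `Storage` trait, the `Error` enum, and `ReadOnlyStore` are hypothetical, and using a default method body is an assumption of this sketch rather than a description of how object_store is implemented.

```rust
// Hypothetical error type; only the variant mentioned above is modelled.
#[derive(Debug, PartialEq)]
enum Error {
    Unimplemented,
}

// One trait covering every operation, so `Box<dyn Storage>` just works.
trait Storage {
    // Required operation.
    fn read(&self, path: &str) -> Result<String, Error>;

    // Optional operation: implementations that cannot support it inherit
    // this body and fail at runtime rather than at compile time.
    fn delete_all(&self, _paths: &[&str]) -> Result<usize, Error> {
        Err(Error::Unimplemented)
    }
}

struct ReadOnlyStore;

impl Storage for ReadOnlyStore {
    fn read(&self, path: &str) -> Result<String, Error> {
        Ok(format!("contents of {path}"))
    }
}

fn main() {
    let store: Box<dyn Storage> = Box::new(ReadOnlyStore);
    println!("{:?}", store.read("data/file.parquet"));
    // The caller has to handle the unsupported case explicitly.
    match store.delete_all(&["a", "b"]) {
        Ok(n) => println!("deleted {n} files"),
        Err(e) => println!("bulk delete unavailable: {e:?}"),
    }
}
```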
   
   > We could have an erased trait object definition for traits defined in 1, 
say DynFileIO, which are object safe.
   
   My experience with ObjectStore is that this interface will likely grow 
over time. Ultimately ObjectStore started out as precisely this, a very 
targeted IO abstraction for InfluxDB IOx. It was then donated to arrow-rs, and 
over time, as more people have used it, the interface has grown to its current 
state to accommodate their requirements. I suspect such a `DynFileIO` would 
likely have to go through the same process, especially if the goal is for 
people to use it as more than just a shim to this crate.
   
   > then it would be confusing for people who want to define their own FileIO 
implementation, which part should they implement, and which part they should 
not?
   
   I had been viewing FileIO purely as a mechanism to shim into whatever IO 
abstraction is used by the broader system they're integrating with and not 
really as something people would generally be implementing or interacting with 
themselves.
   
   However, I think this is the key perspective difference that is making 
consensus hard to come by, as I think there are two possible goals being 
discussed here and on the other linked tickets:
   
   1. Provide an extensible IO interface, inspired by iceberg-java, that people 
can use across their applications
   2. Provide a way to integrate their existing ObjectStore-based systems with 
this crate
   
   Ultimately there are a number of people with a demonstrable need for the 
latter, whereas I believe the former is largely theoretical at this stage. 
Further, as both @roeap and @Sl1mb0 allude to, people want to integrate iceberg 
into a broader system, and so even if iceberg devised a FileIO interface, other 
systems will likely continue to use something more expressive.
   
   In the interests of making some progress perhaps we might find some way to 
decouple these objectives?
   
   Some suggestions from the various tickets:
   
   * Add an ObjectStore variant to the storage enum - 
https://github.com/apache/iceberg-rust/issues/1314#issuecomment-2879956729
   * Add an extension variant to the storage enum - 
https://github.com/apache/iceberg-rust/issues/172#issuecomment-2473264890
   * Make Storage a trait - 
https://github.com/apache/iceberg-rust/issues/1314#issuecomment-2880758072
   * Replace Storage with ObjectStore - 
https://github.com/apache/iceberg-rust/issues/1314#issuecomment-2893032985
   
   All of these would provide a way to solve the pain point that people are 
currently running into without requiring complex design work, and I'd be happy 
to help out once we achieve consensus on what it is we would like to do.
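
   To make the "extension variant" suggestion concrete, here is a rough sketch of the shape it could take. Everything in it is hypothetical: `Storage`, `CustomStorage`, and `ConstantStore` are stand-ins invented for illustration and do not reflect the actual definitions in iceberg-rust or object_store.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical trait that an external IO layer would implement, for
// example as a thin adapter over whatever store the host system uses.
trait CustomStorage: Send + Sync {
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}

// Hypothetical stand-in for a storage enum with one built-in backend
// and one extension variant that defers to user-supplied code.
#[derive(Clone)]
enum Storage {
    Memory(Arc<HashMap<String, Vec<u8>>>),
    Custom(Arc<dyn CustomStorage>),
}

impl Storage {
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        match self {
            Storage::Memory(files) => files.get(path).cloned(),
            Storage::Custom(inner) => inner.read(path),
        }
    }
}

// A trivial external implementation that returns the same bytes for
// every path.
struct ConstantStore;

impl CustomStorage for ConstantStore {
    fn read(&self, _path: &str) -> Option<Vec<u8>> {
        Some(b"external".to_vec())
    }
}

fn main() {
    let storage = Storage::Custom(Arc::new(ConstantStore));
    println!("{:?}", storage.read("any/path"));
}
```

   The built-in variants stay untouched in this shape, and only the extension variant pays for dynamic dispatch.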


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

