Xuanwo opened a new issue, #1036:
URL: https://github.com/apache/iceberg-rust/issues/1036

   Hi, I'm starting this thread to discuss whether it is a good idea to split 
iceberg-rust's high-level API out into a mini engine. 
   
   In iceberg-rust, we seem to have two different levels of API. One is the 
low-level API, which exposes `TableMetadata` and `Schema` directly; the other is 
the high-level API, which exposes features like `Scan`. We have this design to 
make it easier for our users to get started.
   
   However, after using `iceberg-rust` in a production environment, I found 
that this design is not ideal for an external query engine that wants to optimize 
performance. We either do too much on the `iceberg-rust` side, such as 
implementing a hard in-memory cache in our code, or we don't expose enough of 
the APIs that query engines need, like `ArrowReader`, which prevents us 
from using or tuning the `Reader` from `arrow-rs` directly.
   
   I'm wondering if it would be a good idea to split the current high-level 
APIs into a mini engine. This way, users who want a quick start or need to run 
workloads on a single machine can use the mini engine instead. 
   
   This split would greatly help us design the API: If an API is used in both 
`mini` and `datafusion`, it should probably belong in our core. However, if an 
API is only used in `mini`, it should remain within `mini`.
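
   To make the rule above concrete, here is a minimal sketch of how the 
layering could look. Every name in it (`core_api`, `mini_engine`, `FileTask`, 
`TableSnapshot`, `plan_files`, `count_rows`) is hypothetical and is not an 
existing `iceberg-rust` API; it only illustrates where an API would live 
under this rule.

```rust
// Hypothetical sketch: none of these names exist in iceberg-rust.

/// "Core" layer: describes what to read, but does not read it.
mod core_api {
    /// One unit of scan work, as a query engine would want to receive it.
    #[derive(Debug, Clone, PartialEq)]
    pub struct FileTask {
        pub path: String,
        pub partition: String,
        pub record_count: u64,
    }

    /// Simplified stand-in for the table state that planning works from.
    pub struct TableSnapshot {
        pub files: Vec<FileTask>,
    }

    impl TableSnapshot {
        /// Needed by both the mini engine and an external engine,
        /// so under the proposed rule it belongs in core.
        pub fn plan_files(&self, partition: &str) -> Vec<FileTask> {
            self.files
                .iter()
                .filter(|f| f.partition == partition)
                .cloned()
                .collect()
        }
    }
}

/// "Mini" layer: single-machine convenience built only on core.
mod mini_engine {
    use super::core_api::{FileTask, TableSnapshot};

    /// Only the mini engine needs this helper, so it stays here.
    /// An external engine would take the planned tasks and run them itself.
    pub fn count_rows(snapshot: &TableSnapshot, partition: &str) -> u64 {
        snapshot
            .plan_files(partition)
            .iter()
            .map(|task: &FileTask| task.record_count)
            .sum()
    }
}
```

   In this sketch an external engine depends only on `core_api` and decides 
for itself how to read and cache the planned files, while `mini_engine` 
keeps the quick-start path for single-machine users.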
   
   What do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

