Xuanwo opened a new issue, #1036: URL: https://github.com/apache/iceberg-rust/issues/1036
Hi, I'm starting this thread to discuss whether it is a good idea to split the iceberg-rust high-level API out into a mini engine.

In iceberg-rust, we seem to have two levels of API. One is the low-level API, which exposes `TableMetadata` and `Schema` directly; the other is the high-level API, which exposes features like `Scan`. We designed it this way to make it easier for our users to get started.

However, after using `iceberg-rust` in a production environment, I found that this design is not ideal for an external query engine that wants to optimize performance. We either do too much on the `iceberg-rust` side, such as implementing a hard in-memory cache in our code, or we don't expose enough of the APIs that query engines need, such as `ArrowReader`, which prevents us from using or tuning the `Reader` from `arrow-rs` directly.

I'm wondering if it would be a good idea to split the current high-level APIs into a mini engine. This way, users who want a quick start or need to run workloads on a single machine can use the mini engine instead.

This split would also greatly help us design the API: if an API is used in both `mini` and `datafusion`, it should probably belong in our core. However, if an API is only used in `mini`, it should remain within `mini`.

What do you think?
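To make the proposed placement rule concrete, here is a minimal sketch of the layering. Every name in it (`DataFile`, `plan_files`, `MiniEngine`, `plan_for_engine`) is hypothetical and chosen only for illustration; none of them are existing `iceberg-rust` APIs, and the cache is deliberately a toy.

```rust
use std::collections::HashMap;

// ---- "core" layer (hypothetical): building blocks every engine needs ----

/// Stand-in for a data file entry resolved from table metadata.
#[derive(Debug, Clone, PartialEq)]
struct DataFile {
    path: String,
    record_count: u64,
}

/// File planning is called by both engines below, so under the rule
/// proposed above it would belong in core.
fn plan_files(files: &[DataFile], min_records: u64) -> Vec<DataFile> {
    files
        .iter()
        .filter(|f| f.record_count >= min_records)
        .cloned()
        .collect()
}

// ---- "mini" layer (hypothetical): single-machine convenience engine ----

/// Caching is an engine policy, so it lives here rather than in core.
/// Toy cache: keyed only by the filter value, ignoring the file list.
#[derive(Default)]
struct MiniEngine {
    cache: HashMap<u64, Vec<DataFile>>,
    planning_calls: usize,
}

impl MiniEngine {
    fn scan(&mut self, files: &[DataFile], min_records: u64) -> Vec<DataFile> {
        if let Some(hit) = self.cache.get(&min_records) {
            return hit.clone();
        }
        self.planning_calls += 1;
        let planned = plan_files(files, min_records);
        self.cache.insert(min_records, planned.clone());
        planned
    }
}

// ---- external engine integration (hypothetical stand-in) ----

/// An external engine reuses the same core planning call but applies
/// its own policies: no built-in cache, only the paths it needs.
fn plan_for_engine(files: &[DataFile], min_records: u64) -> Vec<String> {
    plan_files(files, min_records)
        .into_iter()
        .map(|f| f.path)
        .collect()
}
```

In this sketch `plan_files` is shared by both consumers and so sits in the core layer, while the cache is used only by `MiniEngine` and stays there. An external engine is then free to bring its own caching and reader tuning.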