gruuya commented on PR #880: URL: https://github.com/apache/iceberg-rust/pull/880#issuecomment-2646104599
> The existing get batch stream is designed for simple workloads and I'm guessing query engines need to build its own part distribution logic instead.

Got it, makes sense. I'm wondering whether it would be beneficial and desirable for `IcebergTableScan`, being a DataFusion-oriented scanning/planning API, to have a special implementation that relies on DataFusion primitives (i.e. `ParquetExec`) to squeeze out as much performance as possible. Basically something along the lines of [table_scan](https://github.com/JanKaul/iceberg-rust/blob/ff9c6d84e5cfec67440f09537ac37914ec1cf96a/datafusion_iceberg/src/table.rs#L266) from JanKaul/iceberg-rust.