Re: [PR] feat(datafusion): Expose DataFusion statistics on an IcebergTableScan [iceberg-rust]

via GitHub Sat, 08 Feb 2025 23:19:54 -0800


gruuya commented on PR #880:
URL: https://github.com/apache/iceberg-rust/pull/880#issuecomment-2646104599


   > The existing get batch stream is designed for simple workloads and I'm 
guessing query engines need to build its own part distribution logic instead.
   
   Got it, makes sense. I'm wondering whether it would be beneficial and 
desirable if `IcebergTableScan`, being a DataFusion-oriented scanning/planning 
API, had a special implementation relying on DataFusion primitives (i.e. 
`ParquetExec`), to squeeze out as much perf as possible. Basically something 
along the lines of 
[table_scan](https://github.com/JanKaul/iceberg-rust/blob/ff9c6d84e5cfec67440f09537ac37914ec1cf96a/datafusion_iceberg/src/table.rs#L266) 
from JanKaul/iceberg-rust.
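
   To make "part distribution logic" a bit more concrete, here is a minimal, 
engine-agnostic sketch of what a query engine might do with the planned data 
files before handing them to per-partition readers. Everything in it is an 
assumption for illustration: `FileTask` and `distribute` are hypothetical names, 
not iceberg-rust or DataFusion APIs, and the greedy size-based balancing is just 
one possible strategy, not what either project actually does.

```rust
/// Hypothetical stand-in for one planned data file.
/// This is NOT an iceberg-rust or DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct FileTask {
    path: String,
    size_bytes: u64,
}

/// Greedily spread files over `n` partitions: take files largest first and
/// always place the next one into the partition holding the fewest bytes.
fn distribute(mut tasks: Vec<FileTask>, n: usize) -> Vec<Vec<FileTask>> {
    assert!(n > 0, "need at least one partition");

    // Largest files first, so the big ones get balanced before the small ones.
    tasks.sort_by(|a, b| b.size_bytes.cmp(&a.size_bytes));

    let mut partitions: Vec<Vec<FileTask>> = vec![Vec::new(); n];
    let mut totals: Vec<u64> = vec![0; n];

    for task in tasks {
        // Index of the currently lightest partition (first one wins on ties).
        let lightest = totals
            .iter()
            .enumerate()
            .min_by_key(|(_, total)| **total)
            .map(|(idx, _)| idx)
            .expect("n > 0 guarantees at least one partition");

        totals[lightest] += task.size_bytes;
        partitions[lightest].push(task);
    }

    partitions
}
```

   Each resulting group could then back one output partition of the scan, so the 
engine reads the groups in parallel instead of consuming a single batch stream.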


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


