toutane commented on issue #2220: URL: https://github.com/apache/iceberg-rust/issues/2220#issuecomment-4162176372
Hey 👋, I have a working POC for this, drafted in https://github.com/apache/iceberg-rust/pull/2298.

The implementation differs slightly from the approach described here: rather than modifying `IcebergTableScan`, it introduces a new `IcebergPartitionedScan` execution plan and a dedicated `IcebergPartitionedTableProvider`. The provider collects all `FileScanTask`s at plan time, and `IcebergPartitionedScan` then exposes one DataFusion partition per task, each executing independently via `ArrowReaderBuilder`. This keeps the existing `IcebergTableScan` untouched and lets users opt into the parallel path explicitly by registering the partitioned provider.

Let me know if this direction sounds good, thanks!
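To make the "one partition per task" idea concrete, here is a rough, std-only sketch of the shape. It is not the code from the PR and does not use the real iceberg-rust or DataFusion APIs: `FileTask`, `PartitionedScan`, `read_task`, and `run_all` are hypothetical stand-ins, and the "reader" just echoes a precomputed row count instead of decoding a file.

```rust
use std::thread;

/// Hypothetical stand-in for an Iceberg file scan task (not the real `FileScanTask`).
#[derive(Clone, Debug)]
struct FileTask {
    path: String,
    row_count: usize,
}

/// Hypothetical stand-in for the partitioned scan: one output partition per task.
struct PartitionedScan {
    tasks: Vec<FileTask>,
}

impl PartitionedScan {
    /// Tasks are collected up front (at "plan time") and handed to the scan.
    fn new(tasks: Vec<FileTask>) -> Self {
        Self { tasks }
    }

    /// The number of partitions equals the number of tasks.
    fn partition_count(&self) -> usize {
        self.tasks.len()
    }

    /// Execute a single partition: read only the task stored at that index.
    /// Returns `None` when the partition index is out of range.
    fn execute(&self, partition: usize) -> Option<(String, usize)> {
        self.tasks.get(partition).map(read_task)
    }
}

/// Placeholder reader: pretends to read the file and reports (path, rows read).
/// A real reader would yield record batches instead.
fn read_task(task: &FileTask) -> (String, usize) {
    (task.path.clone(), task.row_count)
}

/// Run every partition on its own scoped thread and sum the rows read.
fn run_all(scan: &PartitionedScan) -> usize {
    thread::scope(|s| {
        let handles: Vec<_> = (0..scan.partition_count())
            .map(|p| s.spawn(move || scan.execute(p).map_or(0, |(_, rows)| rows)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("partition thread panicked"))
            .sum::<usize>()
    })
}
```

Because each partition owns exactly one task, partitions share no state, so the engine (here, plain threads) can schedule them independently. The trade-off in a sketch like this is that a table with many small files yields many small partitions.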
