Re: [I] Logic to determine the partitions [iceberg-rust]

via GitHub Wed, 27 Nov 2024 22:41:29 -0800


Fokko commented on issue #728:
URL: https://github.com/apache/iceberg-rust/issues/728#issuecomment-2505367587


   > I think you mean arrow record batch?
   
   Yes :)
   
   This could be a broader discussion on where the responsibilities lie between 
iceberg-rust and the query engine. 
   
   On the read side, [`Tasks` are passed to the query 
engine](https://github.com/apache/iceberg-rust/blob/main/crates/iceberg/src/scan.rs).
 I think this is a nice and clean boundary between the engine and the library. 
I would love to move to a similar API for writes. For example, a table is 
passed in, and iceberg-rust does all the checks to make sure that the input is 
compatible. It could even make the table compatible, e.g. by applying schema 
evolution when needed, which is very easy using 
[UnionByName](https://github.com/apache/iceberg-rust/issues/698). Based on the 
input (Arrow Table or equivalent), and similar to the read path, the library 
comes up with a set of write tasks that are passed back to the query engine, 
which writes out the files and returns the `DataFile` with all the statistics 
and such. Thoughts?
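
To make the proposed write-side boundary concrete, here is a minimal sketch. Every name in it (`WriteTask`, `DataFileSummary`, `plan_write_tasks`, `execute_write_task`) is hypothetical and is not part of the iceberg-rust API. It also assumes a single string partition value per row, with no Arrow types and no schema checks. The library side groups rows into tasks; the engine side "writes" each task and reports back a summary standing in for a `DataFile`.

```rust
use std::collections::BTreeMap;

/// Hypothetical: one unit of write work the library hands to the engine.
#[derive(Debug, Clone, PartialEq)]
pub struct WriteTask {
    /// Partition value covered by this task (simplified to one string key).
    pub partition: String,
    /// Indices of the input rows that fall into this partition.
    pub row_indices: Vec<usize>,
}

/// Hypothetical stand-in for the `DataFile` metadata the engine returns.
#[derive(Debug, Clone, PartialEq)]
pub struct DataFileSummary {
    pub partition: String,
    pub record_count: usize,
}

/// Library side (assumed): group rows by partition value and emit
/// one task per distinct partition, ordered by partition value.
pub fn plan_write_tasks(partition_values: &[&str]) -> Vec<WriteTask> {
    let mut groups: BTreeMap<String, Vec<usize>> = BTreeMap::new();
    for (row, value) in partition_values.iter().enumerate() {
        groups.entry(value.to_string()).or_default().push(row);
    }
    groups
        .into_iter()
        .map(|(partition, row_indices)| WriteTask { partition, row_indices })
        .collect()
}

/// Engine side (assumed): a real engine would write a file here; this
/// sketch only reports the statistics the library would need back.
pub fn execute_write_task(task: &WriteTask) -> DataFileSummary {
    DataFileSummary {
        partition: task.partition.clone(),
        record_count: task.row_indices.len(),
    }
}
```

In this shape, the library keeps ownership of partitioning and compatibility decisions, while the engine only needs to know how to execute a task and hand back file metadata, mirroring how scan tasks work on the read path.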


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

