CTTY commented on issue #1650:
URL: https://github.com/apache/iceberg-rust/issues/1650#issuecomment-3259415564
Hi @liurenjie1024, thanks for the input! Your idea sounds good to me, and I
agree that we should take smaller steps if possible. Next I'll try to put
together a draft based on it!
One thing I'm not sure about with the `PartitioningWriter` interface is that
the incoming `batch` may still contain rows from different partitions (e.g.
when the user has a partitioned table and wants to use round-robin
partitioning to avoid partition skew):
```rust
pub trait PartitioningWriter {
    // If `batch` here contains data from multiple partitions,
    // the entire batch would still be written to the partition of
    // `partition_key`.
    fn write(&self, partition_key: PartitionKey, batch: RecordBatch);
}
```
I'm thinking of something like this:
```rust
pub trait PartitioningWriter {
    // Use the record batch splitter to split the incoming batch first.
    fn write(&self, batch: RecordBatch);
    // The `batch` here should already be split, i.e. contain rows from a
    // single partition only. Technically this shouldn't be publicly accessible.
    fn do_write(&self, partition_key: PartitionKey, split_batch: RecordBatch);
}
```
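To make the split-then-dispatch flow concrete, here is a rough, self-contained sketch. Everything in it is a stand-in and an assumption on my side: `PartitionKey`, `Row`, and `RecordBatch` are toy types (not the Arrow or iceberg-rust ones), and `split_by_partition` and `InMemoryWriter` are hypothetical helpers that only illustrate the idea, not the actual record batch splitter.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

// Toy stand-ins for the real types (assumptions, not the iceberg-rust API).
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct PartitionKey(pub String);

#[derive(Clone, Debug, PartialEq)]
pub struct Row {
    pub partition: String,
    pub value: i64,
}

pub type RecordBatch = Vec<Row>;

/// Hypothetical splitter: groups rows by their partition value.
pub fn split_by_partition(batch: RecordBatch) -> BTreeMap<PartitionKey, RecordBatch> {
    let mut groups: BTreeMap<PartitionKey, RecordBatch> = BTreeMap::new();
    for row in batch {
        let key = PartitionKey(row.partition.clone());
        groups.entry(key).or_default().push(row);
    }
    groups
}

pub trait PartitioningWriter {
    /// Accepts a batch that may mix partitions, splits it first,
    /// then hands each single-partition piece to `do_write`.
    fn write(&self, batch: RecordBatch) {
        for (partition_key, split_batch) in split_by_partition(batch) {
            self.do_write(partition_key, split_batch);
        }
    }

    /// `split_batch` must only contain rows belonging to `partition_key`.
    fn do_write(&self, partition_key: PartitionKey, split_batch: RecordBatch);
}

/// Hypothetical writer that just records what it was asked to write.
#[derive(Default)]
pub struct InMemoryWriter {
    pub written: RefCell<BTreeMap<PartitionKey, RecordBatch>>,
}

impl PartitioningWriter for InMemoryWriter {
    fn do_write(&self, partition_key: PartitionKey, split_batch: RecordBatch) {
        self.written
            .borrow_mut()
            .entry(partition_key)
            .or_default()
            .extend(split_batch);
    }
}
```

With this shape, a caller only ever goes through `write`, so a mixed batch can't end up under a single `partition_key` by accident. Keeping `do_write` on the public trait is just the simplest way to sketch it here; hiding it would need a different structure.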
Please lmk your thoughts!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]