CTTY opened a new issue, #1540:
URL: https://github.com/apache/iceberg-rust/issues/1540
### Is your feature request related to a problem or challenge?
As part of #1382, we need to implement `insert_into` for
`IcebergTableProvider` to support `INSERT INTO` queries in DataFusion:
```
insert into t values (1, 'a');
```
### Physical Plans
Within `insert_into`, we need to add a few nodes (DataFusion physical
plans) to complete the write process. The entire write process is
described by the flowchart below:
```mermaid
flowchart TD
A(["Input Node"]) --> F["Project Node"]
F --> B["Repartition Node"]
B --> C["Sort Node"]
C --> D["Writer Node"]
D --> E["Commit Node"]
```
- Input Node: Input physical plan that represents the input data
- [ ] Project Node: Calculate the partition value
- [ ] Repartition Node: Decide on the partitioning mode that gives the best
parallelism
- [ ] Sort Node: Sort the input data
- [ ] Writer Node: Spawn Iceberg writers and write the input data
- [ ] Commit Node: Commit the data written using Iceberg Tx API
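
To make the data flow between the nodes above concrete, here is a minimal, std-only sketch of the five steps as plain functions. It does not use the DataFusion or Iceberg APIs: `Row`, `ProjectedRow`, `DataFile`, `insert_into_sketch`, and the "id modulo bucket count" partition transform are all hypothetical stand-ins chosen for illustration, and the table is modeled as a plain `Vec<DataFile>`. The real nodes would be `ExecutionPlan` implementations operating on Arrow record batches.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one input row, e.g. `(1, 'a')`.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub id: i64,
    pub name: String,
}

/// Output of the Project step: a row plus its computed partition value.
#[derive(Debug, Clone, PartialEq)]
pub struct ProjectedRow {
    pub partition: i64,
    pub row: Row,
}

/// Hypothetical summary of one written data file.
#[derive(Debug, Clone, PartialEq)]
pub struct DataFile {
    pub partition: i64,
    pub record_count: usize,
}

/// Project: compute the partition value for every row.
/// Assumption: a bucket-like transform, `id mod buckets` (`buckets` must be > 0).
pub fn project(rows: Vec<Row>, buckets: i64) -> Vec<ProjectedRow> {
    rows.into_iter()
        .map(|row| ProjectedRow { partition: row.id.rem_euclid(buckets), row })
        .collect()
}

/// Repartition: route rows so that each partition value ends up in one group.
pub fn repartition(rows: Vec<ProjectedRow>) -> BTreeMap<i64, Vec<ProjectedRow>> {
    let mut groups: BTreeMap<i64, Vec<ProjectedRow>> = BTreeMap::new();
    for r in rows {
        groups.entry(r.partition).or_default().push(r);
    }
    groups
}

/// Sort: order the rows inside each group (here by `id`).
pub fn sort(groups: &mut BTreeMap<i64, Vec<ProjectedRow>>) {
    for rows in groups.values_mut() {
        rows.sort_by_key(|r| r.row.id);
    }
}

/// Write: emit one data file description per group.
pub fn write(groups: &BTreeMap<i64, Vec<ProjectedRow>>) -> Vec<DataFile> {
    groups
        .iter()
        .map(|(partition, rows)| DataFile { partition: *partition, record_count: rows.len() })
        .collect()
}

/// Commit: append all written files to the table at once and
/// return the number of rows inserted.
pub fn commit(table: &mut Vec<DataFile>, files: Vec<DataFile>) -> usize {
    let inserted = files.iter().map(|f| f.record_count).sum();
    table.extend(files);
    inserted
}

/// Chains the steps in the same order as the flowchart.
pub fn insert_into_sketch(table: &mut Vec<DataFile>, input: Vec<Row>, buckets: i64) -> usize {
    let projected = project(input, buckets);
    let mut groups = repartition(projected);
    sort(&mut groups);
    let files = write(&groups);
    commit(table, files)
}
```

Calling `insert_into_sketch` with rows whose ids are 3, 1, and 2 and two buckets yields one file for partition 0 and one for partition 1. Keeping the commit as a separate final step mirrors the flowchart: writers only produce files, and nothing becomes visible in the table until the commit runs.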
### Writer Extension
Besides the writers mentioned in the writer path of #1382, there are other
writers that can be useful:
- [ ] Implement `RollingFileWriter`: Helps split incoming data into multiple
files
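
As a rough illustration of the rolling idea, the sketch below splits incoming batches into multiple in-memory "files" once a target size would be exceeded. It is not the proposed `RollingFileWriter` API: `RollingWriterSketch`, `target_file_size`, and the byte-slice batches are hypothetical, and a real implementation would wrap an Iceberg file writer and roll on written file size rather than on buffered bytes.

```rust
/// Hypothetical in-memory rolling writer. Each "file" is a `Vec<u8>`.
#[derive(Debug)]
pub struct RollingWriterSketch {
    /// Size threshold (in bytes) after which a new file is started.
    target_file_size: usize,
    /// Bytes of the file currently being written.
    current: Vec<u8>,
    /// Files that have already been closed.
    closed: Vec<Vec<u8>>,
}

impl RollingWriterSketch {
    pub fn new(target_file_size: usize) -> Self {
        Self { target_file_size, current: Vec::new(), closed: Vec::new() }
    }

    /// Append one batch. If the current file is non-empty and the batch
    /// would push it past the target size, close it and start a new file
    /// first. A batch is never split, so a single oversized batch still
    /// produces one (oversized) file.
    pub fn write(&mut self, batch: &[u8]) {
        let would_exceed = self.current.len() + batch.len() > self.target_file_size;
        if !self.current.is_empty() && would_exceed {
            self.roll();
        }
        self.current.extend_from_slice(batch);
    }

    /// Close the current file and begin an empty one.
    fn roll(&mut self) {
        let finished = std::mem::take(&mut self.current);
        self.closed.push(finished);
    }

    /// Flush the last file (if it holds any data) and return all files.
    pub fn close(mut self) -> Vec<Vec<u8>> {
        if !self.current.is_empty() {
            self.roll();
        }
        self.closed
    }
}
```

Rolling only at batch boundaries keeps the sketch simple. Whether the real writer should split within a batch is a design decision this sketch does not address.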
### Describe the solution you'd like
_No response_
### Willingness to contribute
None