jackye1995 commented on code in PR #6571:
URL: https://github.com/apache/iceberg/pull/6571#discussion_r1081542409
##########
docs/java-api.md:
##########
@@ -147,6 +147,53 @@ t.newAppend().appendFile(data).commit();
 t.commitTransaction();
 ```
+### WriteData
+
+The Java API can write data into an Iceberg table.
+
+First write the records to a data file, then commit that data file to the table; the data becomes visible once the commit succeeds.
+
+For example, to append 1000 records to the table:
+
+```java
+GenericAppenderFactory appenderFactory = new GenericAppenderFactory(table.schema(), table.spec());
+
+int partitionId = 1, taskId = 1;
+OutputFileFactory outputFileFactory = OutputFileFactory.builderFor(table, partitionId, taskId).format(FileFormat.PARQUET).build();
+final PartitionKey partitionKey = new PartitionKey(table.spec(), table.spec().schema());

Review Comment:
   Thanks for adding this. I actually get asked fairly often how to write Iceberg data without using a compute engine, and this answers exactly that question.

   However, it feels quite hacky to present this as the solution for people to use, especially the part where we have to use `InternalRecordWrapper` and then create a new subclass of the abstract `PartitionedFanoutWriter` on the fly. In that case, could we just add that as an actual implementation of `PartitionedFanoutWriter` and contribute it to the codebase? Also, I believe there is something similar, like `TestTaskWriter`, in the tests. Maybe what we should do is bring that out of the tests so people can actually use it.
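For context, the part of the diff the comment refers to is not quoted above; it roughly follows the pattern sketched below: wrap each `Record` with `InternalRecordWrapper` so `PartitionKey` can evaluate the partition values, subclass the abstract `PartitionedFanoutWriter` inline, and commit the resulting data files. This is only an illustrative sketch that continues the variables from the quoted hunk (`table`, `appenderFactory`, `outputFileFactory`, `partitionKey`); the target file size, the `"id"` field name, and the omission of error handling are assumptions, not the exact code from the PR.

```java
// Sketch of the "on the fly" writer the review comment questions.
// Relevant classes: org.apache.iceberg.io.PartitionedFanoutWriter,
// org.apache.iceberg.data.InternalRecordWrapper, org.apache.iceberg.data.GenericRecord.
// Assumes an enclosing method that declares `throws IOException`.

// PartitionKey reads Iceberg's internal representation, so generic Records are wrapped first.
InternalRecordWrapper wrapper = new InternalRecordWrapper(table.schema().asStruct());

PartitionedFanoutWriter<Record> writer =
    new PartitionedFanoutWriter<Record>(
        table.spec(), FileFormat.PARQUET, appenderFactory, outputFileFactory,
        table.io(), 128 * 1024 * 1024 /* target file size in bytes, illustrative */) {
      @Override
      protected PartitionKey partition(Record record) {
        partitionKey.partition(wrapper.wrap(record));
        return partitionKey;
      }
    };

// Write 1000 records; "id" stands in for whatever field the table schema actually defines.
GenericRecord template = GenericRecord.create(table.schema());
for (long i = 0; i < 1000; i++) {
  writer.write(template.copy("id", i));
}

// Closing the writer finalizes the data files, which are then committed in a single append.
writer.close();
AppendFiles append = table.newAppend();
for (DataFile dataFile : writer.dataFiles()) {
  append.appendFile(dataFile);
}
append.commit();
```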