ZENOTME commented on code in PR #135:
URL: https://github.com/apache/iceberg-rust/pull/135#discussion_r1445854906


##########
crates/iceberg/src/writer/file_writer/mod.rs:
##########
@@ -0,0 +1,51 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Iceberg File Writer
+
+use super::{CurrentFileStatus, IcebergWriteResult};
+use crate::Result;
+use arrow_array::RecordBatch;
+use arrow_schema::SchemaRef;
+
+/// File writer builder trait.
+#[async_trait::async_trait]
+pub trait FileWriterBuilder: Send + Clone + 'static {
+    /// The associated file writer type.
+    type R: FileWriter;
+    /// Build file writer.
+    async fn build(self, schema: &SchemaRef) -> Result<Self::R>;
+}
+
+/// File writer focus on writing record batch to different physical file format.(Such as parquet. orc)
+#[async_trait::async_trait]
+pub trait FileWriter: Send + 'static + CurrentFileStatus {
+    /// The associated file write result type.
+    type R: FileWriteResult;
+    /// Write record batch to file.
+    async fn write(&mut self, batch: &RecordBatch) -> Result<()>;

Review Comment:
   We make `IcebergWriter<I>` generic over `I` because some writers need a 
specific input format rather than a RecordBatch. E.g. the 
PositionDeleteWriter needs a write interface like `write(&mut self, file: &str, 
position: usize)`.
   
   But every IcebergWriter ultimately writes its data to a file through a 
FileWriter. At that point, the IcebergWriter should convert the data into a 
RecordBatch and write it using the FileWriter.
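   
   A minimal sketch of this layering, not the PR's actual code: the traits 
here are synchronous, `RecordBatch` is a simplified stand-in for 
`arrow_array::RecordBatch`, and `DataFileWriter` and `InMemoryFileWriter` are 
hypothetical names used only for illustration. It shows the case where `I` is 
already a RecordBatch, so the IcebergWriter just forwards it.
   
   ```rust
   // Simplified stand-in for arrow_array::RecordBatch (assumption).
   #[derive(Debug, Clone, PartialEq)]
   struct RecordBatch {
       rows: Vec<Vec<String>>,
   }

   // Physical layer: only ever sees record batches.
   trait FileWriter {
       fn write(&mut self, batch: &RecordBatch);
   }

   // Logical layer: generic over the input type `I`.
   trait IcebergWriter<I> {
       fn write(&mut self, input: I);
   }

   // Hypothetical data file writer: logical input == physical input.
   struct DataFileWriter<F: FileWriter> {
       inner: F,
   }

   impl<F: FileWriter> IcebergWriter<RecordBatch> for DataFileWriter<F> {
       fn write(&mut self, input: RecordBatch) {
           // No conversion needed, forward the batch as-is.
           self.inner.write(&input);
       }
   }

   // Hypothetical file writer that keeps batches in memory for inspection.
   #[derive(Default)]
   struct InMemoryFileWriter {
       batches: Vec<RecordBatch>,
   }

   impl FileWriter for InMemoryFileWriter {
       fn write(&mut self, batch: &RecordBatch) {
           self.batches.push(batch.clone());
       }
   }
   ```
   
   A writer with a different `I` would implement the same trait but convert 
its input into a RecordBatch before calling `inner.write`.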
   
   E.g. the PositionDeleteWriter exposes `write(&mut self, file: &str, 
position: usize)` and buffers these rows like 
   ```
   ---
   file, position,
   file, position,
   ...
   ---
   ```
   Once it has accumulated enough data, it converts the buffered rows into a 
RecordBatch and writes it using the FileWriter.
   
   To me, RecordBatch is the physical data representation, and `I` is the 
logical representation. For the data file writer, the logical and physical 
representations are the same. For the position delete writer, they are 
different.
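   
   The buffering described above could be sketched roughly as follows. This 
is an illustration under assumptions, not the PR's implementation: 
`RecordBatch` is a simplified two-column stand-in for the Arrow type, the 
traits are synchronous, and `flush_threshold`, `flush`, and 
`InMemoryFileWriter` are hypothetical names.
   
   ```rust
   // Simplified stand-in for an Arrow RecordBatch with two columns (assumption).
   #[derive(Debug, Clone, PartialEq)]
   struct RecordBatch {
       files: Vec<String>,
       positions: Vec<u64>,
   }

   // Physical layer: only ever sees record batches.
   trait FileWriter {
       fn write(&mut self, batch: &RecordBatch);
   }

   // Hypothetical file writer that keeps batches in memory for inspection.
   #[derive(Default)]
   struct InMemoryFileWriter {
       batches: Vec<RecordBatch>,
   }

   impl FileWriter for InMemoryFileWriter {
       fn write(&mut self, batch: &RecordBatch) {
           self.batches.push(batch.clone());
       }
   }

   // Logical layer: accepts (file, position) pairs and buffers them.
   struct PositionDeleteWriter<F: FileWriter> {
       inner: F,
       buffer: Vec<(String, u64)>,
       flush_threshold: usize, // hypothetical knob: rows to buffer before flushing
   }

   impl<F: FileWriter> PositionDeleteWriter<F> {
       fn new(inner: F, flush_threshold: usize) -> Self {
           Self { inner, buffer: Vec::new(), flush_threshold }
       }

       // Logical write interface, as in the comment above.
       fn write(&mut self, file: &str, position: usize) {
           self.buffer.push((file.to_string(), position as u64));
           if self.buffer.len() >= self.flush_threshold {
               self.flush();
           }
       }

       // Convert the buffered rows into one batch and hand it to the FileWriter.
       fn flush(&mut self) {
           if self.buffer.is_empty() {
               return;
           }
           let (files, positions): (Vec<String>, Vec<u64>) =
               self.buffer.drain(..).unzip();
           self.inner.write(&RecordBatch { files, positions });
       }
   }
   ```
   
   With this shape the FileWriter stays unaware of position deletes: only the 
PositionDeleteWriter knows how to turn its logical input into a batch.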



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
