Re: [PR] Core, Data, Spark: Moving Spark to use the new FormatModel API [iceberg]

via GitHub Wed, 18 Feb 2026 07:38:02 -0800


pvary commented on code in PR #15328:
URL: https://github.com/apache/iceberg/pull/15328#discussion_r2822960079



##########
spark/v4.1/spark/src/main/java/org/apache/iceberg/spark/data/SparkParquetWriters.java:
##########
@@ -75,10 +77,27 @@
 public class SparkParquetWriters {
   private SparkParquetWriters() {}
 
-  @SuppressWarnings("unchecked")
   public static <T> ParquetValueWriter<T> buildWriter(StructType dfSchema, MessageType type) {
+    return buildWriter(null, type, dfSchema);
+  }
+
+  @SuppressWarnings("unchecked")
+  public static <T> ParquetValueWriter<T> buildWriter(
+      Schema icebergSchema, MessageType type, StructType dfSchema) {
+    return (ParquetValueWriter<T>)
+        ParquetWithSparkSchemaVisitor.visit(
+            dfSchema != null ? dfSchema : SparkSchemaUtil.convert(icebergSchema),
+            type,
+            new WriteBuilder(type));
+  }
+
+  public static <T> ParquetValueWriter<T> buildWriter(
+      StructType dfSchema, MessageType type, Schema icebergSchema) {
     return (ParquetValueWriter<T>)
-        ParquetWithSparkSchemaVisitor.visit(dfSchema, type, new WriteBuilder(type));
+        ParquetWithSparkSchemaVisitor.visit(
+            dfSchema != null ? dfSchema : SparkSchemaUtil.convert(icebergSchema),
+            type,
+            new WriteBuilder(type));
   }

Review Comment:
   I’ve given this quite a bit of thought. On the caller side we use the 
order `icebergSchema`, `fileSchema`, `engineSchema`, and I believe this is the 
most logical ordering. If anyone feels strongly otherwise, I’m happy to adjust 
it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


