Fokko commented on code in PR #1482:
URL: https://github.com/apache/iceberg-rust/pull/1482#discussion_r2176483200


##########
crates/iceberg/src/spec/manifest/entry.rs:
##########
@@ -563,6 +563,16 @@ pub(super) fn manifest_schema_v2(partition_type: &StructType) -> Result<AvroSche
 
 fn data_file_fields_v1(partition_type: &StructType) -> Vec<NestedFieldRef> {
     vec![
+        // Content is always 1.
+        Arc::new(NestedField::builder()
+            .id(134)
+            .name("content")
+            .required(false)
+            .field_type(Type::Primitive(PrimitiveType::Int))
+            .initial_default(Some(serde_json::Value::Number(1.into())))
+            .write_default(Some(serde_json::Value::Number(1.into())))
+            .build()
+        ),

Review Comment:
   We don't want to write this value for V1, as that would violate the spec. 
Instead, when reading a V1 entry as V2, we want to project the missing fields. 
For `content`, we want to set an `initial_default` of `data` here: 
https://github.com/apache/iceberg-rust/blob/69686ba8a02c9d3fa11087aa81409afc1ea348fa/crates/iceberg/src/spec/manifest/entry.rs#L233-L241
   
   When the Avro reader cannot find the field (because the file was written by 
a V1 writer), it will automatically set it to `data`.
   
   It would also be good to have a round-trip test in which V1 metadata is 
read by a V2 reader.
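
A minimal sketch of the projection idea described above, assuming a heavily simplified model: a written record is a map from field ID to value, and the read schema carries an optional `initial_default` per field. The names `ReadField`, `Record`, `write_v1`, `v2_read_schema`, and `project` are hypothetical and are not the iceberg-rust API. The field IDs (134 for `content`, 103 for `record_count`) and the value 0 for `data` follow the Iceberg table spec.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified read-schema field: an ID, a name, and an
/// optional default applied when the written data lacks the field.
struct ReadField {
    id: i32,
    name: &'static str,
    initial_default: Option<i64>,
}

/// Hypothetical stand-in for a decoded Avro record: field ID -> value.
/// (`content` is an int in the spec; i64 is used here only for brevity.)
type Record = HashMap<i32, i64>;

const CONTENT_FIELD_ID: i32 = 134;
const RECORD_COUNT_FIELD_ID: i32 = 103;
/// Per the Iceberg spec: 0 = data, 1 = position deletes, 2 = equality deletes.
const CONTENT_DATA: i64 = 0;

/// V2 read schema: `content` has an initial default of `data`,
/// `record_count` has no default and must be present in the file.
fn v2_read_schema() -> Vec<ReadField> {
    vec![
        ReadField { id: CONTENT_FIELD_ID, name: "content", initial_default: Some(CONTENT_DATA) },
        ReadField { id: RECORD_COUNT_FIELD_ID, name: "record_count", initial_default: None },
    ]
}

/// A V1 writer never emits `content`, so the record only has `record_count`.
fn write_v1(record_count: i64) -> Record {
    let mut record = Record::new();
    record.insert(RECORD_COUNT_FIELD_ID, record_count);
    record
}

/// Project a written record onto the read schema: take the written value if
/// present, otherwise the field's initial default, otherwise fail.
fn project(written: &Record, schema: &[ReadField]) -> Result<Record, String> {
    let mut out = Record::new();
    for field in schema {
        let value = written
            .get(&field.id)
            .copied()
            .or(field.initial_default)
            .ok_or_else(|| format!("missing field {} ({})", field.name, field.id))?;
        out.insert(field.id, value);
    }
    Ok(out)
}
```

A round-trip check in this model would call `write_v1`, confirm the record has no `content` entry, run `project` with `v2_read_schema()`, and assert that `content` comes back as `CONTENT_DATA`. A real test would do the same against actual V1 manifest bytes and the crate's own reader.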



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org