paleolimbot commented on code in PR #8350:
URL: https://github.com/apache/arrow-rs/pull/8350#discussion_r2353095562


##########
arrow-schema/src/field.rs:
##########
@@ -391,6 +398,49 @@ impl Field {
     /// ```
     pub fn with_data_type(mut self, data_type: DataType) -> Self {
         self.set_data_type(data_type);
+        self.without_extension_type()

Review Comment:
   Ah, I missed this.
   
   This will probably obliterate metadata in some places. There's a bunch of 
code in datafusion that looks like this:
   
   ```
               let new_fields = fields
                   .into_iter()
                   .zip(new_data_types)
                   .map(|(f, d)| f.as_ref().clone().with_data_type(d))
                   .map(Arc::new)
                   .collect::<Vec<FieldRef>>();
   ```
   
   ...where `d` is often unchanged but the outer field may have contained 
metadata. Sometimes the data type change is compatible with an extension type 
(e.g., reconciling String and StringViews), sometimes it is not (e.g., if there 
is any code that reconciles FixedSizeBinary and Binary inputs, which there may 
not be).
   
   Using a `Field` as a data type is a constant whack-a-mole of these types of 
issues but I think this might be safer propagating the extension metadata until 
downstream projects have a chance to move away from that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to