paleolimbot commented on code in PR #8350:
URL: https://github.com/apache/arrow-rs/pull/8350#discussion_r2353095562
##########
arrow-schema/src/field.rs:
##########
@@ -391,6 +398,49 @@ impl Field {
/// ```
pub fn with_data_type(mut self, data_type: DataType) -> Self {
self.set_data_type(data_type);
+ self.without_extension_type()
Review Comment:
Ah, I missed this.
This will probably obliterate metadata in some places. There's a bunch of
code in datafusion that looks like this:
```
let new_fields = fields
.into_iter()
.zip(new_data_types)
.map(|(f, d)| f.as_ref().clone().with_data_type(d))
.map(Arc::new)
.collect::<Vec<FieldRef>>();
```
...where `d` is often unchanged but the outer field may have contained
metadata. Sometimes the data type change is compatible with an extension type
(e.g., reconciling String and StringViews), sometimes it is not (e.g., if there
is any code that reconciles FixedSizeBinary and Binary inputs, which there may
not be).
Using a `Field` as a data type is a constant whack-a-mole of these types of
issues but I think this might be safer propagating the extension metadata until
downstream projects have a chance to move away from that.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]