rocco408 commented on PR #11419: URL: https://github.com/apache/iceberg/pull/11419#issuecomment-2466841180
I may be missing something, but after several variations I don't see a way to cleanly use `TypeUtil.isPromotionAllowed` from `UnionByNameVisitor`. Even after adding the long-to-int and double-to-float case in `TypeUtil.isPromotionAllowed` like this ```shell --- a/api/src/main/java/org/apache/iceberg/types/TypeUtil.java +++ b/api/src/main/java/org/apache/iceberg/types/TypeUtil.java @@ -406,9 +406,15 @@ public class TypeUtil { case INTEGER: return to.typeId() == Type.TypeID.LONG; + case LONG: + return to.typeId() != Type.TypeID.INTEGER; + case FLOAT: return to.typeId() == Type.TypeID.DOUBLE; + case DOUBLE: + return to.typeId() != Type.TypeID.FLOAT; + ``` and adding special handling for non-primitive types in `UnionByNameVisitor` like this ```shell --- a/core/src/main/java/org/apache/iceberg/schema/UnionByNameVisitor.java +++ b/core/src/main/java/org/apache/iceberg/schema/UnionByNameVisitor.java @@ -164,7 +164,7 @@ public class UnionByNameVisitor extends SchemaWithPartnerVisitor<Integer, Boolea String fullName = partnerSchema.findColumnName(existingField.fieldId()); boolean needsOptionalUpdate = field.isOptional() && existingField.isRequired(); - boolean needsTypeUpdate = !compatibleType(field.type(), existingField.type()); + boolean needsTypeUpdate = needsTypeUpdate(field.type(), existingField.type()); boolean needsDocUpdate = field.doc() != null && !field.doc().equals(existingField.doc()); if (needsOptionalUpdate) { @@ -180,8 +180,11 @@ public class UnionByNameVisitor extends SchemaWithPartnerVisitor<Integer, Boolea } } - private boolean compatibleType(Type newType, Type existingType) { - return existingType.isPrimitiveType() && TypeUtil.isPromotionAllowed(newType, existingType.asPrimitiveType()); + private boolean needsTypeUpdate(Type newType, Type existingType) { + if (newType.isPrimitiveType() && !existingType.isPrimitiveType()) { + return true; + } + return newType.isPrimitiveType() && !newType.equals(existingType) && TypeUtil.isPromotionAllowed(existingType, newType.asPrimitiveType()); } ``` I'm hitting failures like with [TestSchemaUnionByFieldName.testDetectInvalidTopLevelList](https://github.com/rocco408/iceberg/blob/fff9ec3bbc322080da6363b657415b039c0e92a0/core/src/test/java/org/apache/iceberg/TestSchemaUnionByFieldName.java#L242-L250) where it attempts to cast a list of strings to list of longs. By using `TypeUtil.isPromotionAllowed` in `UnionByNameVisitor` we end up ignoring this type of casting because it is not allowed. However in the existing flow, [needsTypeUpdate](https://github.com/rocco408/iceberg/blob/fff9ec3bbc322080da6363b657415b039c0e92a0/core/src/main/java/org/apache/iceberg/schema/UnionByNameVisitor.java#L174-L176) ends up getting set to `true` where we expect an exception to eventually be thrown when we use `TypeUtil.isPromotionAllowed` [here](https://github.com/rocco408/iceberg/blob/fff9ec3bbc322080da6363b657415b039c0e92a0/core/src/main/java/org/apache/iceberg/SchemaUpdate.java#L287-L292). I think this ignore behavior needs to be isolated to `UnionByNameVisitor`. As always I'm open to other ideas if I missed something. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org