rocco408 commented on PR #11419:
URL: https://github.com/apache/iceberg/pull/11419#issuecomment-2466841180

   I may be missing something, but after several variations I don't see a way 
to cleanly use `TypeUtil.isPromotionAllowed` from `UnionByNameVisitor`. Even 
after adding the long-to-int and double-to-float case in 
`TypeUtil.isPromotionAllowed` like this
   ```shell
   --- a/api/src/main/java/org/apache/iceberg/types/TypeUtil.java
   +++ b/api/src/main/java/org/apache/iceberg/types/TypeUtil.java
   @@ -406,9 +406,15 @@ public class TypeUtil {
          case INTEGER:
            return to.typeId() == Type.TypeID.LONG;
   
   +      case LONG:
   +        return to.typeId() != Type.TypeID.INTEGER;
   +
          case FLOAT:
            return to.typeId() == Type.TypeID.DOUBLE;
   
   +      case DOUBLE:
   +        return to.typeId() != Type.TypeID.FLOAT;
   +
   ```
   and adding special handling for non-primitive types in `UnionByNameVisitor` 
like this
   ```shell
   --- a/core/src/main/java/org/apache/iceberg/schema/UnionByNameVisitor.java
   +++ b/core/src/main/java/org/apache/iceberg/schema/UnionByNameVisitor.java
   @@ -164,7 +164,7 @@ public class UnionByNameVisitor extends 
SchemaWithPartnerVisitor<Integer, Boolea
        String fullName = partnerSchema.findColumnName(existingField.fieldId());
   
        boolean needsOptionalUpdate = field.isOptional() && 
existingField.isRequired();
   -    boolean needsTypeUpdate = !compatibleType(field.type(), 
existingField.type());
   +    boolean needsTypeUpdate = needsTypeUpdate(field.type(), 
existingField.type());
        boolean needsDocUpdate = field.doc() != null && 
!field.doc().equals(existingField.doc());
   
        if (needsOptionalUpdate) {
   @@ -180,8 +180,11 @@ public class UnionByNameVisitor extends 
SchemaWithPartnerVisitor<Integer, Boolea
        }
      }
   
   -  private boolean compatibleType(Type newType, Type existingType) {
   -    return existingType.isPrimitiveType() && 
TypeUtil.isPromotionAllowed(newType, existingType.asPrimitiveType());
   +  private boolean needsTypeUpdate(Type newType, Type existingType) {
   +    if (newType.isPrimitiveType() && !existingType.isPrimitiveType()) {
   +      return true;
   +    }
   +    return newType.isPrimitiveType() && !newType.equals(existingType) && 
TypeUtil.isPromotionAllowed(existingType, newType.asPrimitiveType());
      }
   ```
   I'm hitting failures like with 
[TestSchemaUnionByFieldName.testDetectInvalidTopLevelList](https://github.com/rocco408/iceberg/blob/fff9ec3bbc322080da6363b657415b039c0e92a0/core/src/test/java/org/apache/iceberg/TestSchemaUnionByFieldName.java#L242-L250)
 where it attempts to cast a list of strings to list of longs. By using 
`TypeUtil.isPromotionAllowed` in `UnionByNameVisitor` we end up ignoring this 
type of casting because it is not allowed. However in the existing flow, 
[needsTypeUpdate](https://github.com/rocco408/iceberg/blob/fff9ec3bbc322080da6363b657415b039c0e92a0/core/src/main/java/org/apache/iceberg/schema/UnionByNameVisitor.java#L174-L176)
 ends up getting set to `true` where we expect an exception to eventually be 
thrown when we use `TypeUtil.isPromotionAllowed` 
[here](https://github.com/rocco408/iceberg/blob/fff9ec3bbc322080da6363b657415b039c0e92a0/core/src/main/java/org/apache/iceberg/SchemaUpdate.java#L287-L292).
   
   I think this ignore behavior needs to be isolated to `UnionByNameVisitor`. 
As always I'm open to other ideas if I missed something.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to