ajantha-bhat commented on code in PR #12897: URL: https://github.com/apache/iceberg/pull/12897#discussion_r2060142013
########## api/src/main/java/org/apache/iceberg/PartitionField.java: ########## @@ -70,14 +100,14 @@ public boolean equals(Object other) { } PartitionField that = (PartitionField) other; - return sourceId == that.sourceId + return sourceIds.toString().equals(that.sourceIds.toString()) Review Comment: Just wondering why not `Objects.equals(sourceIds, that.sourceIds);` do we need the extra string conversion? ########## .palantir/revapi.yml: ########## @@ -1177,6 +1177,10 @@ acceptedBreaks: old: "class org.apache.iceberg.Metrics" new: "class org.apache.iceberg.Metrics" justification: "Java serialization across versions is not guaranteed" + - code: "java.class.defaultSerializationChanged" Review Comment: Doesn't this breaks backward compatibility? and this is not allowed for minor versions. We need to implement custom `writeObject` and `readObject` to ensure we can serialize the field if it has only `sourceId`? ``` private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException { in.defaultReadObject(); // read all default fields // Ensure backward compatibility if (sourceIds == null && sourceId != null) { sourceIds = Collections.singletonList(sourceId); } } private void writeObject(ObjectOutputStream out) throws IOException { out.defaultWriteObject(); // only write sourceIds } ``` Note: I know we have handled the JSON serialization properly in `PartitionSpecParser`. Just wondering about default java serializer here. ########## core/src/main/java/org/apache/iceberg/PartitionSpecParser.java: ########## @@ -98,7 +101,16 @@ static void toJsonFields(UnboundPartitionSpec spec, JsonGenerator generator) thr generator.writeStartObject(); generator.writeStringField(NAME, field.name()); generator.writeStringField(TRANSFORM, field.transformAsString()); - generator.writeNumberField(SOURCE_ID, field.sourceId()); + if (field.sourceIds().size() == 1) { + generator.writeNumberField(SOURCE_ID, field.sourceId()); + } else { + generator.writeFieldName(SOURCE_IDS); + generator.writeStartArray(); + for (Integer value : field.sourceIds()) { + generator.writeNumber(value); + } + generator.writeEndArray(); + } Review Comment: nit: Iceberg follows the new line spacing for code blocks https://iceberg.apache.org/contribute/?h=contribut#block-spacing ########## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ########## @@ -601,17 +601,26 @@ public Builder alwaysNull(String sourceName) { // add a partition field with an auto-increment partition field id starting from // PARTITION_DATA_ID_START + Builder add(List<Integer> sourceIds, String name, Transform<?, ?> transform) { + return add(sourceIds, nextFieldId(), name, transform); + } + Builder add(int sourceId, String name, Transform<?, ?> transform) { - return add(sourceId, nextFieldId(), name, transform); + return add(List.of(sourceId), name, transform); } - Builder add(int sourceId, int fieldId, String name, Transform<?, ?> transform) { - checkAndAddPartitionName(name, sourceId); - fields.add(new PartitionField(sourceId, fieldId, name, transform)); + Builder add(List<Integer> sourceIds, int fieldId, String name, Transform<?, ?> transform) { + // we use the first entry in the source-ids list here + checkAndAddPartitionName(name, sourceIds.get(0)); Review Comment: How does this work for multi arg partition field? Do we need a logic that accepts the `sourceIds` and resolve the name of multi arg transform? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org