rdblue commented on PR #10678: URL: https://github.com/apache/iceberg/pull/10678#issuecomment-2257156417
I think that the API proposed here is the right approach. @amogh-jahagirdar makes a good point about `compatibleWith` and possibly other methods that we use to reason about partition specs. I think that it is important that we don't introduce case insensitivity as a property of the `PartitionSpec` itself, which would require a spec change, but still support the convenience that this PR introduces. To be more clear, I like this builder method because it can be used to make it easier for engines to configure tables. If the engine is case insensitive, the integration can pass that flag and the user's input to the `PARTITIONED BY` clause doesn't need to be pre-processed. But I think that we do need to ensure that we have consistent behavior for code paths that use the partition name that's based on the original column name. To do that, we just need to make sure that the partition name is derived from the column name in the schema, not the column name passed to the builder methods. That's not what the code does currently. For example, see [`identity(String)`](https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/PartitionSpec.java#L454). I think this PR will need to refactor the builder methods to use a consistent name if this introduces a code path where `sourceName` and `sourceField.name()` may not match. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org