Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

via GitHub Mon, 29 Jul 2024 16:05:17 -0700


rdblue commented on PR #10678:
URL: https://github.com/apache/iceberg/pull/10678#issuecomment-2257156417

I think that the API proposed here is the right approach.

@amogh-jahagirdar makes a good point about `compatibleWith` and possibly
other methods that we use to reason about partition specs. I think that it is
important that we don't introduce case insensitivity as a property of the
`PartitionSpec` itself, which would require a spec change, but still support
the convenience that this PR introduces.

To be more clear, I like this builder method because it can be used to make
it easier for engines to configure tables. If the engine is case insensitive,
the integration can pass that flag and the user's input to the `PARTITIONED BY`
clause doesn't need to be pre-processed. But I think that we do need to ensure
that we have consistent behavior for code paths that use the partition name
that's based on the original column name. To do that, we just need to make sure
that the partition name is derived from the column name in the schema, not the
column name passed to the builder methods.

That's not what the code does currently. For example, see
[`identity(String)`](https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/PartitionSpec.java#L454).
I think this PR will need to refactor the builder methods to use a consistent
name if this introduces a code path where `sourceName` and `sourceField.name()`
may not match.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] #10668 - Support case-insensitivity for column names in PartitionSpec [iceberg]

Reply via email to