dramaticlly commented on PR #10678: URL: https://github.com/apache/iceberg/pull/10678#issuecomment-2226008338
> > @sl255051 appreciate you taking a stab at the PR.
> >
> > But I am wondering why you think column name case insensitivity is the right behavior when building a PartitionSpec? I think an Iceberg schema can have columns named both `data` and `DATA`, each assigned a different field ID, like below:
> >
> > ```
> > table {
> >   1: id: required int
> >   2: data: required string
> >   3: DATA: required string
> > }
> > ```
> >
> > Would this change introduce additional ambiguity when resolving a column name in a case-insensitive way?
>
> Thanks for taking the time to review my PR. I did notice that the Schema object uses a simple `Map<String, Integer>` for column names, which means the schema is case sensitive. But I wonder if that is a bug too. I believe partition columns should be case-insensitive based on issue #83, which says to make Iceberg case-insensitive. I can see lots of work was done to enable case-insensitivity in Iceberg. Several objects even have multiple methods to enable it. Take the Schema object as an example: if case-insensitivity is not a feature of Iceberg, why would that class have both `findField` and `caseInsensitiveFindField`?
>
> In summary, I believe case-insensitivity is the correct path forward. I can accept that I may not have implemented it in the best way. If that is the case, I would appreciate some pointers on how best to implement case-insensitivity.

I am not fully aware of the current status of case-sensitivity support in Iceberg, as it is not documented in the spec. Maybe we can ask whether any of the experts want to chime in, @rdblue or @RussellSpitzer.

But as you mentioned, if the current schema supports case sensitivity, I don't think it is correct to find columns by name in a case-insensitive manner when building a partition spec, as it introduces additional ambiguity, per my example above.
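To make the ambiguity concrete, here is a minimal sketch in plain Java. It is not Iceberg code: the class `CaseLookupSketch` and its methods are hypothetical names invented for illustration, and the only thing taken from the discussion is the idea of a `Map<String, Integer>` from column name to field ID, populated with the `id`/`data`/`DATA` example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; this is not Iceberg's Schema or PartitionSpec code.
public class CaseLookupSketch {

  // Name-to-field-id map for the example table: id=1, data=2, DATA=3.
  public static Map<String, Integer> sampleSchema() {
    Map<String, Integer> nameToId = new LinkedHashMap<>();
    nameToId.put("id", 1);
    nameToId.put("data", 2);
    nameToId.put("DATA", 3);
    return nameToId;
  }

  // Case-sensitive lookup: returns the field id for an exact name match, or null.
  public static Integer findExact(Map<String, Integer> nameToId, String name) {
    return nameToId.get(name);
  }

  // Case-insensitive lookup: returns every field id whose name matches ignoring case.
  public static List<Integer> findIgnoreCase(Map<String, Integer> nameToId, String name) {
    String wanted = name.toLowerCase(Locale.ROOT);
    List<Integer> matches = new ArrayList<>();
    for (Map.Entry<String, Integer> entry : nameToId.entrySet()) {
      if (entry.getKey().toLowerCase(Locale.ROOT).equals(wanted)) {
        matches.add(entry.getValue());
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    Map<String, Integer> schema = sampleSchema();
    System.out.println(findExact(schema, "data"));      // 2
    System.out.println(findExact(schema, "DATA"));      // 3
    System.out.println(findIgnoreCase(schema, "data")); // [2, 3]
    System.out.println(findIgnoreCase(schema, "id"));   // [1]
  }
}
```

With exact matching, each name resolves to one field ID. With case-insensitive matching, `data` resolves to two candidates, so a partition spec builder would need some rule to pick one or reject the name.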
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org