bryanck commented on PR #12224: URL: https://github.com/apache/iceberg/pull/12224#issuecomment-2651697343
> @bryanck I didn't quite get the partition summary field names. were you referring to `PartitionFieldSummaryParser`? it seems to have just 4 field names.
>
> String.intern can be helpful for some use cases while harmful for some (like the one you encountered). Disabling interning seems to be a safer option considering diverse scenarios that the code can be used (like REST catalog server).

The information for each partition key has a field name unique to the partition (with the prefix `partition.`). There is some discussion around intern [here](https://github.com/FasterXML/jackson-core/issues/332) with more links. TL;DR is that intern was disabled by default for Jackson 3 (whenever that is released).

> I definitely understand the situation you described. maybe reach out to the Jackson authors too according to the doc? https://github.com/fasterxml/jackson-core/wiki/JsonFactory-Features

Sure sounds good, I'll reach out.

> The doc also mentioned that hash collision check is `Only relevant if canonicalization is enabled`. wondering if `CANONICALIZE_FIELD_NAMES` should be disabled too. I imagined it can cause similar memory footprint issue as String interning.

Canonicalization can help when field names are reused within a single metadata file, so that seemed helpful still.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
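
To illustrate the distinction drawn above between interning and canonicalization, here is a minimal stdlib-only sketch. It is not Jackson's implementation: `LocalCanonicalizer`, `decode`, and `parseNames` are hypothetical names made up for this example, and the real behavior behind `CANONICALIZE_FIELD_NAMES` is more involved. The point is only that a table scoped to one parse can deduplicate repeated field names and then be garbage collected with the parse, whereas `String.intern` would place every unique `partition.`-prefixed name into the JVM-wide string pool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldNameScopes {

    /** Hypothetical per-parse table: deduplicates names for one document only. */
    static final class LocalCanonicalizer {
        private final Map<String, String> seen = new HashMap<>();

        /** Returns the first instance seen for this name, without calling String.intern. */
        String canonicalize(String name) {
            String existing = seen.putIfAbsent(name, name);
            return existing != null ? existing : name;
        }

        int size() {
            return seen.size();
        }
    }

    /** Simulates decoding a field name from input bytes: always a fresh String instance. */
    static String decode(String raw) {
        return new String(raw.toCharArray());
    }

    /** Decodes each raw name and routes it through the per-parse table. */
    static List<String> parseNames(List<String> rawNames, LocalCanonicalizer table) {
        List<String> out = new ArrayList<>();
        for (String raw : rawNames) {
            out.add(table.canonicalize(decode(raw)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed sample input: one name repeated within a single file.
        List<String> file = Arrays.asList("partition.a", "partition.b", "partition.a");
        LocalCanonicalizer table = new LocalCanonicalizer();
        List<String> names = parseNames(file, table);

        // The repeated name resolves to the same instance within this parse.
        System.out.println(names.get(0) == names.get(2));
        // Only the distinct names are retained, and only for the table's lifetime.
        System.out.println(table.size());
    }
}
```

Once `table` goes out of scope, the names it holds become collectable, which is the property that matters when each metadata file carries partition-specific names that are unlikely to recur elsewhere.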