rdblue commented on code in PR #12346: URL: https://github.com/apache/iceberg/pull/12346#discussion_r1994421527
##########
api/src/main/java/org/apache/iceberg/types/Types.java:
##########
@@ -58,9 +58,15 @@ private Types() {}
           .put(BinaryType.get().toString(), BinaryType.get())
           .put(UnknownType.get().toString(), UnknownType.get())
           .put(VariantType.get().toString(), VariantType.get())
+          .put(GeometryType.get().toString(), GeometryType.get())
+          .put(GeographyType.get().toString(), GeographyType.get())
           .buildOrThrow();
 
   private static final Pattern FIXED = Pattern.compile("fixed\\[\\s*(\\d+)\\s*\\]");
+  private static final Pattern GEOMETRY_PARAMETERS =
+      Pattern.compile("(?:\\(\\s*([^, ]+)?\\s*\\))?");
+  private static final Pattern GEOGRAPHY_PARAMETERS =
+      Pattern.compile("(?:\\(\\s*([^, ]+)?\\s*(?:,\\s*(\\w*)\\s*)?\\))?");

Review Comment:
   I think the intent with the first capture group is to avoid consuming trailing spaces between the group and the comma, but the current pattern would fail when there are spaces in the CRS name, like `EPSG: 4326`. We may want to support names with spaces because they are unambiguous, and we don't have anything that disallows spaces in those names in the specs.

   Instead of matching non-space characters, I recommend matching any character other than comma using a non-greedy `+` by adding `?`: `[^,]+?`

   The full regex is `Pattern.compile("(?:\\(\\s*([^,]+?)?\\s*(?:,\\s*(\\w*)\\s*)?\\))?");` and that works:

   ```java
   Pattern p2 = Pattern.compile("(?:\\(\\s*([^,]+?)?\\s*(?:,\\s*(\\w*)\\s*)?\\))?");
   Matcher m2 = p2.matcher("(EPSG: 4326 , vincenty)");
   assertThat(m2.matches()).isTrue();
   assertThat(m2.group(1)).isEqualTo("EPSG: 4326");
   assertThat(m2.group(2)).isEqualTo("vincenty");
   ```
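   To make the difference concrete, here is a minimal standalone sketch that runs both the pattern from the diff and the suggested pattern against the same inputs. The class name `GeographyParamsDemo` and the helper `parse` are hypothetical and exist only for this illustration; they are not part of the PR or of `Types.java`.

   ```java
   import java.util.Arrays;
   import java.util.regex.Matcher;
   import java.util.regex.Pattern;

   // Hypothetical demo class, not part of Iceberg.
   class GeographyParamsDemo {
     // Pattern as written in the diff: the CRS group excludes commas AND spaces.
     static final Pattern ORIGINAL =
         Pattern.compile("(?:\\(\\s*([^, ]+)?\\s*(?:,\\s*(\\w*)\\s*)?\\))?");

     // Suggested pattern: the CRS group excludes only commas and is reluctant,
     // so trailing whitespace is left for the following \s* to consume.
     static final Pattern PROPOSED =
         Pattern.compile("(?:\\(\\s*([^,]+?)?\\s*(?:,\\s*(\\w*)\\s*)?\\))?");

     // Returns {crs, algorithm} when the whole input matches, or null otherwise.
     // Either element may be null when the corresponding group did not participate.
     static String[] parse(Pattern pattern, String params) {
       Matcher m = pattern.matcher(params);
       if (!m.matches()) {
         return null;
       }
       return new String[] {m.group(1), m.group(2)};
     }

     public static void main(String[] args) {
       String spaced = "(EPSG: 4326 , vincenty)";
       // The original pattern rejects a CRS name that contains a space.
       System.out.println(Arrays.toString(parse(ORIGINAL, spaced))); // null
       // The proposed pattern accepts it and trims the space before the comma.
       System.out.println(Arrays.toString(parse(PROPOSED, spaced))); // [EPSG: 4326, vincenty]
       // Both patterns agree on a CRS name without spaces.
       System.out.println(Arrays.toString(parse(ORIGINAL, "(EPSG:4326, vincenty)")));
       System.out.println(Arrays.toString(parse(PROPOSED, "(EPSG:4326, vincenty)")));
     }
   }
   ```

   The reluctant quantifier matters here: a greedy `[^,]+` would also accept the input, but it would first swallow the space before the comma and only give it back if the rest of the match failed, so group 1 would come back as `EPSG: 4326 ` with a trailing space.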