bryanck commented on code in PR #11294: URL: https://github.com/apache/iceberg/pull/11294#discussion_r1803520775
########## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java: ########## @@ -50,27 +59,23 @@ class ADLSLocation { Preconditions.checkArgument(location != null, "Invalid location: null"); Matcher matcher = URI_PATTERN.matcher(location); - - ValidationException.check(matcher.matches(), "Invalid ADLS URI: %s", location); - - String authority = matcher.group(1); - String[] parts = authority.split("@", -1); - if (parts.length > 1) { - this.container = parts[0]; - this.storageAccount = parts[1]; - } else { - this.container = null; - this.storageAccount = authority; + if (!matcher.matches()) { + throw new IllegalArgumentException(String.format("Invalid ADLS URI: %s", location)); } - String uriPath = matcher.group(2); - uriPath = uriPath == null ? "" : uriPath.startsWith("/") ? uriPath.substring(1) : uriPath; - this.path = uriPath.split("\\?", -1)[0].split("#", -1)[0]; + try { + URI uri = new URI(location); Review Comment: Not using `java.net.URI` before was intentional, in the past we have avoided using `java.net.URI` for parsing as it can cause some unexpected results. For example, underscores in host names... ``` jshell> new java.net.URI("abfs://my_endpoint/path").getHost() $1 ==> null ``` There are some other quirks also IIRC. (cc @rdblue since you've pointed this out to me) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org