ashvina commented on code in PR #11294:
URL: https://github.com/apache/iceberg/pull/11294#discussion_r1796121693


##########
azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java:
##########
@@ -53,19 +63,17 @@ class ADLSLocation {
 
     ValidationException.check(matcher.matches(), "Invalid ADLS URI: %s", location);
 
-    String authority = matcher.group(1);
-    String[] parts = authority.split("@", -1);
-    if (parts.length > 1) {
-      this.container = parts[0];
-      this.storageAccount = parts[1];
-    } else {
-      this.container = null;
-      this.storageAccount = authority;
+    try {
+      URI uri = new URI(location);
+      this.container = uri.getUserInfo();
+      // storage account name is the first part of the host
+      int accountSplit = uri.getHost().indexOf('.');
+      String storageAccountName = uri.getHost().substring(0, accountSplit);
+      this.storageAccount = String.format("%s.dfs.core.windows.net", storageAccountName);

Review Comment:
   For `wasb`, the original host is a `blob.core.windows.net` endpoint 
(see the sample above). Could you clarify whether rewriting it from `blob` 
to `dfs` is necessary?
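
   To make the question concrete, here is a minimal sketch (not code from the PR) of what the new parsing appears to do with a `wasb` location. The class name `WasbHostSketch`, the helper `toDfsHost`, and the sample URI are hypothetical; only the `java.net.URI` calls and the `String.format` rewrite mirror the diff.

```java
import java.net.URI;

public class WasbHostSketch {
  // Hypothetical helper mirroring the host rewrite in the diff:
  // keep the first label of the host and append the dfs suffix.
  static String toDfsHost(String location) {
    URI uri = URI.create(location);
    String host = uri.getHost();
    int accountSplit = host.indexOf('.');
    String storageAccountName = host.substring(0, accountSplit);
    return String.format("%s.dfs.core.windows.net", storageAccountName);
  }

  public static void main(String[] args) {
    // Assumed sample location; the container and account names are made up.
    String location = "wasb://container@account.blob.core.windows.net/path/to/file";
    URI uri = URI.create(location);

    System.out.println(uri.getUserInfo()); // container
    System.out.println(uri.getHost());     // account.blob.core.windows.net
    System.out.println(toDfsHost(location)); // account.dfs.core.windows.net
  }
}
```

   With this input, the `blob` endpoint in the original host no longer appears in the stored `storageAccount`, which is the behavior the question is about.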



##########
azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSLocation.java:
##########
@@ -18,24 +18,34 @@
  */
 package org.apache.iceberg.azure.adlsv2;
 
+import java.net.URI;
+import java.net.URISyntaxException;
 import java.util.Optional;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;
 import org.apache.iceberg.exceptions.ValidationException;
 import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
 
 /**
- * This class represents a fully qualified location in Azure expressed as a URI.
+ * This class represents a fully qualified location in Azure Data Lake Storage, expressed as a URI.
  *
  * <p>Locations follow the conventions used by Hadoop's Azure support, i.e.
  *
- * <pre>{@code abfs[s]://[<container>@]<storage account host>/<file path>}</pre>
+ * <pre>{@code abfs[s]://[<container>@]<storageAccount>.dfs.core.windows.net/<path>}</pre>

Review Comment:
   I think `storageAccount` is more commonly used than `storageHost`. 
([ref](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-overview#standard-endpoints))
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

