ThomasMarquardt commented on a change in pull request #2246:
URL: https://github.com/apache/hadoop/pull/2246#discussion_r489647598



##########
File path: 
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
##########
@@ -263,10 +265,102 @@ public AbfsRestOperation deleteFilesystem() throws 
AzureBlobFileSystemException
     return op;
   }
 
-  public AbfsRestOperation createPath(final String path, final boolean isFile, 
final boolean overwrite,
-                                      final String permission, final String 
umask,
-                                      final boolean isAppendBlob) throws 
AzureBlobFileSystemException {
+  public AbfsRestOperation createPath(final String path,
+      final boolean isFile,
+      final boolean overwrite,
+      final String permission,
+      final String umask,
+      final boolean isAppendBlob) throws AzureBlobFileSystemException {
+    String operation = isFile
+        ? SASTokenProvider.CREATE_FILE_OPERATION
+        : SASTokenProvider.CREATE_DIRECTORY_OPERATION;
+
+    // if "fs.azure.enable.conditional.create.overwrite" is enabled,
+    // trigger a create with overwrite=false first so that eTag fetch can be
+    // avoided for cases when no pre-existing file is present (which is the
+    // case with most part of create traffic)
+    boolean isFirstAttemptToCreateWithoutOverwrite = false;
+    if (isFile && overwrite
+        && abfsConfiguration.isConditionalCreateOverwriteEnabled()) {
+      isFirstAttemptToCreateWithoutOverwrite = true;
+    }
+
+    AbfsRestOperation op = null;
+    // Query builder
+    final AbfsUriQueryBuilder abfsUriQueryBuilder = 
createDefaultUriQueryBuilder();
+    abfsUriQueryBuilder.addQuery(QUERY_PARAM_RESOURCE,
+        operation.equals(SASTokenProvider.CREATE_FILE_OPERATION)
+            ? FILE
+            : DIRECTORY);
+
+    if (isAppendBlob) {
+      abfsUriQueryBuilder.addQuery(QUERY_PARAM_BLOBTYPE, APPEND_BLOB_TYPE);
+    }
+
+    appendSASTokenToQuery(path, operation, abfsUriQueryBuilder);
+
+    try {
+      op = createPathImpl(path, abfsUriQueryBuilder,
+          (isFirstAttemptToCreateWithoutOverwrite ? false : overwrite),
+          permission, umask, null);
+    } catch (AbfsRestOperationException e) {
+      if ((e.getStatusCode() == HttpURLConnection.HTTP_CONFLICT)
+          && isFirstAttemptToCreateWithoutOverwrite) {
+        // Was the first attempt made to create file without overwrite which
+        // failed because there is a pre-existing file.
+        // resetting the first attempt flag for readabiltiy
+        isFirstAttemptToCreateWithoutOverwrite = false;
+
+        // Fetch eTag
+        try {
+          op = getPathStatus(path, false);
+        } catch (AbfsRestOperationException ex) {
+          if (ex.getStatusCode() == HttpURLConnection.HTTP_NOT_FOUND) {
+            // Is a parallel access case, as file which was found to be
+            // present went missing by this request.
+            throw new ConcurrentWriteOperationDetectedException(
+                "Parallel access to the create path detected. Failing request "
+                    + "to honor single writer semantics");
+          } else {
+            throw ex;
+          }
+        }

Review comment:
       The original design was such that AbfsClient is a thin client over the 
REST API, and AzureBlobFileSystemStore is where the app logic lived, such as 
handling continuation tokens or making multiple calls such as getting the etag 
to use it in a conditional request.  I think we should stick to the original 
design and move this fancy logic to AzureBlobFileSystemStore and expose an 
option on AbfsClient to take an optional condition (the etag) when creating a 
file.  In this way, the update to the AbfsClient would be a single line to add 
the optional "If-Match: E-Tag" request header, but there would be a new method 
in AzureBlobFileSystemStore that implements the new conditional overwrite 
behavior.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to