[
https://issues.apache.org/jira/browse/HADOOP-18706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18031284#comment-18031284
]
ASF GitHub Bot commented on HADOOP-18706:
-----------------------------------------
github-actions[bot] closed pull request #5771: HADOOP-18706:
S3ABlockOutputStream recovery, and downgrade syncable will call flush rather
than no-op.
URL: https://github.com/apache/hadoop/pull/5771
> Improve S3ABlockOutputStream recovery
> -------------------------------------
>
> Key: HADOOP-18706
> URL: https://issues.apache.org/jira/browse/HADOOP-18706
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Reporter: Chris Bevard
> Assignee: Chris Bevard
> Priority: Minor
> Labels: pull-request-available
>
> If an application crashes during an S3ABlockOutputStream upload, it's
> possible to complete the upload if fast.upload.buffer is set to disk by
> uploading the s3ablock file with putObject as the final part of the multipart
> upload. If the application has multiple uploads running in parallel though
> and they're on the same part number when the application fails, then there is
> no way to determine which file belongs to which object, and recovery of
> either upload is impossible.
> If the temporary file name for disk buffering included the s3 key, then every
> partial upload would be recoverable.
> h3. Important disclaimer
> This change does not directly add Syncable semantics: applications which
> require {{Syncable.hsync()}} to only return after all pending data has been
> durably written to the destination path are still not supported. S3 is not a
> filesystem and this change does not make it so.
> What it does do is assist anyone trying to implement a post-crash recovery
> process which
> # interrogates S3 to identify pending uploads to a specific path and get a
> list of uploaded blocks yet to be committed
> # scans the local fs.s3a.buffer.dir directories to identify in-progress-write
> blocks for the same target destination; that is, those which were being
> uploaded, those queued for upload, and the single "new data being written to"
> block for an output stream
> # uploads all those pending blocks
> # generates a new POST to complete a multipart upload with all the blocks in
> the correct order
> All this patch does is ensure the buffered block filenames include the final
> path and block ID, to aid in identifying which blocks need to be uploaded and
> in what order.
> h2. warning
> causes HADOOP-18744 -always include the relevant fix when backporting
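The quoted description argues that embedding the destination key in the disk-buffer filename is what makes per-object recovery possible when parallel uploads crash on the same part number. The sketch below illustrates that idea only; the filename pattern (`s3ablock-<key>-part<NNNN>.tmp`) and helper names are hypothetical, not the actual S3A naming scheme:

```python
import re

# Hypothetical naming scheme for illustration: embed a sanitized copy of the
# destination key plus the part number in each buffered-block filename, so a
# recovery tool can group blocks by destination object and order them.
_BLOCK = re.compile(r"^s3ablock-(?P<key>.+)-part(?P<part>\d{4})\.tmp$")


def block_filename(key: str, part: int) -> str:
    """Name a disk-buffer block after its S3 key and part number."""
    # Replace path separators so the key is safe in a flat filename.
    return "s3ablock-%s-part%04d.tmp" % (key.replace("/", "_"), part)


def recoverable_parts(filenames, key):
    """Return the buffered part numbers for `key`, in upload order."""
    wanted = key.replace("/", "_")
    parts = []
    for name in filenames:
        m = _BLOCK.match(name)
        if m and m.group("key") == wanted:
            parts.append(int(m.group("part")))
    return sorted(parts)
```

With key-less names (e.g. just `part0001.tmp`), two parallel uploads stuck on part 1 are indistinguishable; with the key embedded, a recovery process can select exactly the blocks for one destination and feed them, in part order, into a multipart-upload completion request.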
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]