[ https://issues.apache.org/jira/browse/HADOOP-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752759#comment-17752759 ]

Syed Shameerur Rahman commented on HADOOP-18842:
------------------------------------------------

[[email protected]]  Thanks a lot for the pointers.

The following are some of my observations with regard to your comments:
 # Yes, this is similar to the staging committer's partitioned overwrite. But 
what I could see is that, during pre-commit in the commitJob operation, the 
staging committer clears all the directories/partitions if the conflict 
resolution is REPLACE. The issue with this approach is that, in the worst-case 
scenario where the job fails after pre-commit, the whole of the previous data 
is lost, which may not be desirable.
 # I agree that storing all the SinglePendingCommit instances in memory puts 
extra memory pressure on the driver. For instance, in my setup, holding ~1400 
pendingset files in memory took an extra 7 MB (this number will vary with your 
S3 bucket and destination name lengths). So I guess the overhead is not that 
much.
 # For highly write-intensive jobs which commit tens of thousands of files, 
the memory pressure will be higher, but for such cases it is recommended to 
configure a larger driver memory size anyway.
 # Streaming the SinglePendingCommit data to the local filesystem is a great 
idea, but it adds serialization/deserialization latency and extra overhead to 
read and write files, which may not be desirable in all cases.
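To make the memory observation in (2) concrete, here is a back-of-envelope sketch; the ~5 KB per-pendingset figure is an assumption chosen to match the ~7 MB I observed, not a measured Hadoop constant:

```java
// Back-of-envelope estimate of the driver memory needed to hold pending
// commit data in memory. The per-pendingset size is an assumed figure;
// real sizes vary with S3 bucket and destination name lengths.
public class PendingCommitMemoryEstimate {

    /** Total bytes to keep 'pendingSetCount' pendingset files in memory. */
    static long estimateBytes(long pendingSetCount, long avgBytesPerPendingSet) {
        return pendingSetCount * avgBytesPerPendingSet;
    }

    public static void main(String[] args) {
        // ~1400 pendingsets at an assumed ~5 KB each is on the order of the
        // 7 MB observed above.
        long bytes = estimateBytes(1400, 5 * 1024);
        System.out.println((bytes / 1_000_000) + " MB"); // prints "7 MB"
    }
}
```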

 

*Proposal*
 # Stream the SinglePendingCommit data to the local filesystem only when there 
is a large number of pending files. We can use an approximation: one pendingset 
is roughly 'x' bytes, and the user is willing to hold 'y' such 'x'-byte 
pendingsets in memory.
 # Otherwise, store the commits in memory.
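The proposed spill decision could look roughly like this; the class, method, and parameter names here are hypothetical illustrations of the 'x'/'y' approximation, not existing S3A committer options:

```java
// Illustrative decision between in-memory and local-filesystem buffering of
// pending commits. Both the per-pendingset size 'x' and the count 'y' of
// pendingsets the user will tolerate in memory are user-supplied
// approximations, as proposed above.
public class CommitBufferPolicy {

    enum Buffer { MEMORY, LOCAL_FILESYSTEM }

    /**
     * @param pendingSetCount  number of pendingset files in the job
     * @param approxBytesEach  approximate size 'x' of one pendingset
     * @param maxSetsInMemory  how many 'x'-byte pendingsets ('y') the user
     *                         is willing to hold in driver memory
     */
    static Buffer choose(long pendingSetCount, long approxBytesEach,
                         long maxSetsInMemory) {
        long estimated = pendingSetCount * approxBytesEach;
        long budget = maxSetsInMemory * approxBytesEach;
        return estimated <= budget ? Buffer.MEMORY : Buffer.LOCAL_FILESYSTEM;
    }
}
```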

 

For (1), i.e. streaming the commits to the local filesystem:
 # Read the pendingset files in a multi-threaded way.
 # For each pendingset, extract the single commits and their corresponding 
destination directories, and store the destinations in memory.
 # Stream the single commits to a file under a unique path per destination 
directory.
 # For each destination directory:
 ## Delete the destination directory.
 ## Read the commits back from the unique path and call commit.

** A unique path per destination directory will help us limit the number of 
directories/partitions which will be lost in case of failure.
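The steps above could be sketched as follows; every type and method name here (`commitAll`, `deleteRecursively`, `commitOne`, the demo paths) is a hypothetical placeholder, not the actual S3A committer API:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Sketch of step 4: for each destination directory, delete it and then replay
// the commits previously streamed to that directory's unique local spill file.
// All names are illustrative placeholders, not the real S3A committer API.
public class OverwriteCommitSketch {

    /** Runs steps 4.1/4.2 for every directory; returns the number of commits replayed. */
    static int commitAll(Map<String, Path> spillFileByDestDir) throws IOException {
        int committed = 0;
        for (Map.Entry<String, Path> e : spillFileByDestDir.entrySet()) {
            // 4.1: clear the destination directory first, so at most the
            // directories currently being processed can be lost on failure.
            deleteRecursively(e.getKey());
            // 4.2: read the streamed commits back and complete each one.
            for (String serializedCommit : Files.readAllLines(e.getValue())) {
                commitOne(serializedCommit);
                committed++;
            }
        }
        return committed;
    }

    static void deleteRecursively(String destDir) {
        // Placeholder: the real committer would issue a recursive S3A delete.
        System.out.println("deleting " + destDir);
    }

    static void commitOne(String serializedCommit) {
        // Placeholder: complete the multipart upload described by the commit.
        System.out.println("committing " + serializedCommit);
    }

    /** Self-contained demo: two commits spilled for one destination directory. */
    static int demo() {
        try {
            Path spill = Files.createTempFile("pendingset-", ".spill");
            Files.write(spill, Arrays.asList("commit-1", "commit-2"));
            Map<String, Path> byDir = new HashMap<>();
            byDir.put("s3a://bucket/table/part=1/", spill);
            return commitAll(byDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("commits replayed: " + demo());
    }
}
```

Deleting a directory only immediately before replaying its own spill file is what bounds the blast radius: a failure mid-job loses at most the directories in flight, not every partition in the job.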

 

[[email protected]] Any thoughts on this?

Thanks

> Support Overwrite Directory On Commit For S3A Committers
> --------------------------------------------------------
>
>                 Key: HADOOP-18842
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18842
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Syed Shameerur Rahman
>            Priority: Major
>              Labels: pull-request-available
>
> The goal is to add a new kind of commit mechanism in which the destination 
> directory is cleared off before committing the file.
> *Use Case*
> In the case of dynamic-partition insert overwrite queries, the destination 
> directories which need to be overwritten are not known before execution, and 
> hence it becomes a challenge to clear them.
>  
> One approach to handle this is for the underlying engines/clients to clear 
> all the destination directories before calling the commitJob operation, but 
> the issue with this approach is that, in case of failures while committing 
> the files, we might end up with the whole of the previous data being 
> deleted, making the recovery process difficult or time-consuming.
>  
> *Solution*
> Based on the mode of the commit operation, either *INSERT* or *OVERWRITE*: 
> during the commitJob operation, the committer will map each destination 
> directory to the commits which need to be added to it, and if the mode is 
> *OVERWRITE*, the committer will delete the directory recursively and then 
> commit each of the files into it. So in case of failure (worst case), the 
> number of destination directories deleted will equal the number of threads, 
> if done in a multi-threaded way, as compared to the whole of the data if it 
> were done on the engine side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
