dannycjones commented on code in PR #4478: URL: https://github.com/apache/hadoop/pull/4478#discussion_r904893346
##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committers.md:
##########
@@ -492,18 +484,19 @@ was written. With the policy of `append`, the new file would be added to the existing set of files.
-### Notes
+### Notes on using Staging Committers
 1. A deep partition tree can itself be a performance problem in S3 and the s3a client,
-or, more specifically. a problem with applications which use recursive directory tree
+or more specifically a problem with applications which use recursive directory tree
 walks to work with data.
 1. The outcome if you have more than one job trying simultaneously to write data to the same destination with any policy other than "append" is undefined.
 1. In the `append` operation, there is no check for conflict with file names.
-If, in the example above, the file `log-20170228.avro` already existed,
-it would be overridden. Set `fs.s3a.committer.staging.unique-filenames` to `true`
+If the file `log-20170228.avro` in the example above already existed, it would be overwritten.
+
+   Set `fs.s3a.committer.staging.unique-filenames` to `true`

Review Comment:
I believe indenting the line like this lets you put the sentence on a new line while keeping it part of the previous list item. That said, this is not obvious from the markdown, and I cannot test the output HTML, so I'll revert.
