mukund-thakur commented on code in PR #6716:
URL: https://github.com/apache/hadoop/pull/6716#discussion_r1566388267
##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/committer/manifest/stages/AbstractJobOrTaskStage.java:
##########
@@ -582,19 +605,46 @@ protected final Path directoryMustExist(
* Save a task manifest or summary. This will be done by
* writing to a temp path and then renaming.
* If the destination path exists: Delete it.
+ * This will retry so that a rename failure from abfs load or IO errors
+ * will not fail the task.
* @param manifestData the manifest/success file
* @param tempPath temp path for the initial save
* @param finalPath final path for rename.
- * @throws IOException failure to load/parse
+ * @throws IOException failure to rename after retries.
*/
@SuppressWarnings("unchecked")
protected final <T extends AbstractManifestData> void save(T manifestData,
final Path tempPath,
final Path finalPath) throws IOException {
-    LOG.trace("{}: save('{}, {}, {}')", getName(), manifestData, tempPath, finalPath);
- trackDurationOfInvocation(getIOStatistics(), OP_SAVE_TASK_MANIFEST, () ->
- operations.save(manifestData, tempPath, true));
- renameFile(tempPath, finalPath);
+ boolean success = false;
+ int failures = 0;
+ while (!success) {
+ try {
+ LOG.trace("{}: attempt {} save('{}, {}, {}')",
+ getName(), failures, manifestData, tempPath, finalPath);
+
+      trackDurationOfInvocation(getIOStatistics(), OP_SAVE_TASK_MANIFEST, () ->
+ operations.save(manifestData, tempPath, true));
+ renameFile(tempPath, finalPath);
Review Comment:
The rename may fail on the second retry with a SourcePathNotFound exception
even though it actually succeeded on the first attempt. How are we recovering
from that?
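One way to recover from that case is sketched below. All names here (`Store`, `renameWithRecovery`) are illustrative stand-ins, not the committer's actual API: after a failed rename, if the source is gone but the destination exists, treat it as evidence that an earlier attempt already committed.

```java
import java.io.IOException;

// Hypothetical sketch -- Store and renameWithRecovery are illustrative
// names, not the committer's actual API.
public class RenameRecoverySketch {

  /** Minimal store abstraction for the sketch. */
  public interface Store {
    boolean exists(String path) throws IOException;
    void rename(String src, String dest) throws IOException;
  }

  /**
   * Rename src to dest, treating "source missing but destination
   * present" as success: a retried rename whose first attempt actually
   * committed would otherwise surface a source-not-found failure.
   */
  public static boolean renameWithRecovery(Store store, String src,
      String dest) throws IOException {
    try {
      store.rename(src, dest);
      return true;
    } catch (IOException e) {
      // If the source is gone but the destination exists, assume an
      // earlier attempt succeeded and swallow the spurious failure.
      if (!store.exists(src) && store.exists(dest)) {
        return true;
      }
      throw e;
    }
  }
}
```

The probe adds two existence checks on the failure path only, so the happy path stays a single rename call.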
##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/committer/manifest/stages/AbstractJobOrTaskStage.java:
##########
@@ -445,9 +448,29 @@ protected Boolean delete(
final boolean recursive,
final String statistic)
throws IOException {
- return trackDuration(getIOStatistics(), statistic, () -> {
- return operations.delete(path, recursive);
- });
+ if (recursive) {
Review Comment:
I'm unable to understand this. How does the recursive flag determine whether
the path is a directory or a file?
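If the branch does what it appears to, the flag encodes the caller's intent rather than the path's actual type. A minimal sketch of that dispatch pattern, with hypothetical `Store`, `deleteDir`, and `deleteFile` names:

```java
import java.io.IOException;

// Sketch of the dispatch the diff appears to perform; all names here
// (Store, deleteDir, deleteFile) are hypothetical.
public class DeleteDispatchSketch {

  public interface Store {
    boolean deleteDir(String path) throws IOException;
    boolean deleteFile(String path) throws IOException;
  }

  /**
   * The recursive flag encodes the caller's intent: recursive deletes
   * are directory cleanup, non-recursive deletes are single files.
   * Nothing here probes the actual path type.
   */
  public static boolean delete(Store store, String path, boolean recursive)
      throws IOException {
    return recursive ? store.deleteDir(path) : store.deleteFile(path);
  }
}
```

Under this reading the flag is a routing hint, which is exactly why it deserves the clarification asked for above.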
##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/committer/manifest/impl/ManifestStoreOperations.java:
##########
@@ -97,6 +97,36 @@ public boolean isFile(Path path) throws IOException {
public abstract boolean delete(Path path, boolean recursive)
throws IOException;
+ /**
+ * Forward to {@code delete(Path, true)}
+ * unless overridden.
+ * <p>
+ * If it returns without an error: there is no file at
+ * the end of the path.
+ * @param path path
+ * @return outcome
+ * @throws IOException failure.
+ */
+ public boolean deleteFile(Path path)
+ throws IOException {
+ return delete(path, false);
+ }
+
+ /**
+   * Acquire the delete capacity then call {@code FileSystem#delete(Path, true)}
Review Comment:
I don't see the rate limiter actually acquiring the delete capacity here.
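For reference, the pattern the javadoc describes would look roughly like this. The `RateLimiting` and `Store` interfaces below are simplified stand-ins for illustration, not the actual Hadoop types:

```java
import java.io.IOException;

// Simplified sketch of "acquire delete capacity, then delete".
// RateLimiting and Store are stand-in interfaces, not Hadoop's own.
public class RateLimitedDeleteSketch {

  /** Minimal limiter: blocks until capacity is granted, returns wait in ms. */
  public interface RateLimiting {
    long acquire(int capacity);
  }

  public interface Store {
    boolean delete(String path, boolean recursive) throws IOException;
  }

  public static boolean deleteWithRateLimiting(RateLimiting limiter,
      Store store, String path, boolean recursive) throws IOException {
    // The capacity must be acquired *before* the store call; if this
    // line is missing, the limiter never throttles the delete, which
    // is the gap the review comment is pointing at.
    limiter.acquire(1);
    return store.delete(path, recursive);
  }
}
```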
##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/committer/manifest/ManifestCommitterConstants.java:
##########
@@ -143,6 +145,20 @@ public final class ManifestCommitterConstants {
*/
public static final boolean OPT_CLEANUP_PARALLEL_DELETE_DIRS_DEFAULT = true;
+ /**
+   * Should parallel cleanup try to delete the base first?
+ * Best for azure as it skips the task attempt deletions unless
+ * the toplevel delete fails.
+ * Value: {@value}.
+ */
+ public static final String OPT_CLEANUP_PARALLEL_DELETE_BASE_FIRST =
+ OPT_PREFIX + "cleanup.parallel.delete.base.first";
+
+ /**
+   * Default value of option {@link #OPT_CLEANUP_PARALLEL_DELETE_BASE_FIRST}: {@value}.
+ */
+  public static final boolean OPT_CLEANUP_PARALLEL_DELETE_BASE_FIRST_DEFAULT = true;
Review Comment:
As it is bad for GCS, shouldn't the default be false?
##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md:
##########
@@ -505,7 +491,7 @@ The core set of Azure-optimized options becomes
<property>
<name>spark.hadoop.fs.azure.io.rate.limit</name>
- <value>10000</value>
+ <value>1000</value>
Review Comment:
I guess you reduced this for testing?
##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md:
##########
@@ -523,7 +509,7 @@ And optional settings for debugging/performance analysis
```
spark.hadoop.mapreduce.outputcommitter.factory.scheme.abfs
org.apache.hadoop.fs.azurebfs.commit.AzureManifestCommitterFactory
-spark.hadoop.fs.azure.io.rate.limit 10000
+spark.hadoop.fs.azure.io.rate.limit 1000
Review Comment:
Isn't 1000 too low?
##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/committer/manifest/impl/ManifestStoreOperations.java:
##########
@@ -97,6 +97,36 @@ public boolean isFile(Path path) throws IOException {
public abstract boolean delete(Path path, boolean recursive)
throws IOException;
+ /**
+ * Forward to {@code delete(Path, true)}
+ * unless overridden.
+ * <p>
+ * If it returns without an error: there is no file at
+ * the end of the path.
+ * @param path path
+ * @return outcome
+ * @throws IOException failure.
+ */
+ public boolean deleteFile(Path path)
+ throws IOException {
+ return delete(path, false);
Review Comment:
What if the path is a directory? Will there be a case like that?
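If directories can ever reach `deleteFile`, one defensive option is to probe before deleting. The guard below is a hypothetical sketch (names `Store`, `isFile`, and the guard itself are illustrative), not the committer's actual behaviour:

```java
import java.io.IOException;

// Hypothetical defensive variant of deleteFile; Store, isFile and the
// guard itself are illustrative, not the committer's actual code.
public class DeleteFileGuardSketch {

  public interface Store {
    boolean isFile(String path) throws IOException;
    boolean delete(String path, boolean recursive) throws IOException;
  }

  /**
   * Delete only if the path is a regular file; a directory (or a
   * missing path) is reported as "nothing deleted" rather than risking
   * a non-recursive delete failing on a non-empty directory.
   */
  public static boolean deleteFile(Store store, String path)
      throws IOException {
    if (!store.isFile(path)) {
      return false;
    }
    return store.delete(path, false);
  }
}
```

The extra `isFile` probe costs a round trip per call, which is why a committer hot path might reasonably skip it and rely on callers only passing files.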
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
For additional commands, e-mail: [email protected]