mkuchenbecker commented on code in PR #5322:
URL: https://github.com/apache/hadoop/pull/5322#discussion_r1114855356
##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java:
##########
@@ -197,6 +197,15 @@ private void clearLocalDeadNodes() {
deadNodes.clear();
}
+ /**
+ * Clear list of ignored nodes used for hedged reads.
+ */
+ private void clearIgnoredNodes(Collection<DatanodeInfo> ignoredNodes) {
Review Comment:
Nit. Param documentation.
##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java:
##########
@@ -1337,8 +1352,12 @@ private void hedgedFetchBlockByteRange(LocatedBlock
block, long start,
} catch (InterruptedException ie) {
// Ignore and retry
}
- if (refetch) {
- refetchLocations(block, ignored);
+ // If refetch is true, then all nodes are in deadNodes or ignoredNodes.
+ // We should loop through all futures and remove them, so we do not
+ // have concurrent requests to the same node.
+ // Once all futures are cleared, we can clear the ignoredNodes and
retry.
Review Comment:
Ignored and dead nodes are cleared, correct?
##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java:
##########
@@ -224,7 +233,7 @@ boolean deadNodesContain(DatanodeInfo nodeInfo) {
}
/**
- * Grab the open-file info from namenode
+ * Grab the open-file info from namenode.
Review Comment:
Is this change needed?
##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java:
##########
@@ -197,6 +197,15 @@ private void clearLocalDeadNodes() {
deadNodes.clear();
}
+ /**
+ * Clear list of ignored nodes used for hedged reads.
+ */
+ private void clearIgnoredNodes(Collection<DatanodeInfo> ignoredNodes) {
Review Comment:
Does it make more sense to clean up ignored / dead in the same function so
we don't need to worry about calling both?
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java:
##########
@@ -603,7 +603,9 @@ public Void answer(InvocationOnMock invocation) throws
Throwable {
input.read(0, buffer, 0, 1024);
Assert.fail("Reading the block should have thrown
BlockMissingException");
} catch (BlockMissingException e) {
- assertEquals(3, input.getHedgedReadOpsLoopNumForTesting());
+ // The result of 9 is due to 2 blocks by 4 iterations plus one because
+ // hedgedReadOpsLoopNumForTesting is incremented at start of the loop.
+ assertEquals(9, input.getHedgedReadOpsLoopNumForTesting());
Review Comment:
We are tripling the IO per hedged request?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]