Copilot commented on code in PR #7835:
URL: https://github.com/apache/hadoop/pull/7835#discussion_r2244433207
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestAbfsInputStream.java:
##########
@@ -477,11 +494,14 @@ public void testSuccessfulReadAhead() throws Exception {
AbfsInputStream inputStream = getAbfsInputStream(client,
"testSuccessfulReadAhead.txt");
int beforeReadCompletedListSize =
getBufferManager().getCompletedReadListSize();
- // First read request that triggers readAheads.
+ // First read request always bypasses readahead. This will succeed
inputStream.read(new byte[ONE_KB]);
- // Only the 3 readAhead threads should have triggered client.read
- verifyReadCallCount(client, 3);
+ // This will start triggering readahead requests.
+ inputStream.read((new byte[1024]));
Review Comment:
Inconsistent array size usage. The code uses `new byte[1024]` here while
using `new byte[ONE_KB]` elsewhere in the same file. Using the constant
`ONE_KB` would be more consistent and maintainable.
```suggestion
inputStream.read((new byte[ONE_KB]));
```
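For reference, a minimal sketch of the constant being recommended, assuming it is declared in the test class the way size constants usually are in these tests (the exact declaration may differ):
```java
// Hypothetical declaration; the test file presumably defines an equivalent constant.
private static final int ONE_KB = 1024;
```
Using the named constant keeps the buffer size defined in one place if the test data sizes ever change.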
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java:
##########
@@ -348,14 +348,20 @@ private int readOneBlock(final byte[] b, final int off, final int len) throws IO
if (alwaysReadBufferSize) {
bytesRead = readInternal(fCursor, buffer, 0, bufferSize, false);
} else {
- // Enable readAhead when reading sequentially
- if (-1 == fCursorAfterLastRead || fCursorAfterLastRead == fCursor || b.length >= bufferSize) {
- LOG.debug("Sequential read with read ahead size of {}", bufferSize);
+ // Switch between enabling and disabling read ahead based on the workload read pattern.
+ // First Read on input stream should always bypass read ahead.
+ if (firstRead) {
Review Comment:
The `firstRead` variable is referenced but neither defined nor updated in this
diff: there is no logic shown to set it to `false` after the first read, so
every read could bypass readahead indefinitely.
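To make the concern concrete, here is a minimal sketch of the state management that would address it, assuming a boolean field on `AbfsInputStream` and the `readInternal` call shown above (names and placement are hypothetical, not taken from this diff):
```java
// Hypothetical field: defaults to true so the very first read bypasses readahead.
private boolean firstRead = true;

// Inside readOneBlock(...):
if (firstRead) {
  // Bypass readahead on the initial read, then clear the flag so later
  // sequential reads can re-enable readahead.
  bytesRead = readInternal(fCursor, buffer, 0, bufferSize, true);
  firstRead = false;
} else if (-1 == fCursorAfterLastRead || fCursorAfterLastRead == fCursor
    || b.length >= bufferSize) {
  LOG.debug("Sequential read with read ahead size of {}", bufferSize);
  bytesRead = readInternal(fCursor, buffer, 0, bufferSize, false);
}
```
Without `firstRead = false` somewhere on this path, every read would take the bypass branch, as noted above.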
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/TestAbfsInputStream.java:
##########
@@ -119,14 +119,14 @@ private AbfsInputStream getAbfsInputStream(AbfsClient mockAbfsClient,
String fileName) throws IOException {
AbfsInputStreamContext inputStreamContext = new AbfsInputStreamContext(-1);
// Create AbfsInputStream with the client instance
- AbfsInputStream inputStream = new AbfsInputStream(
+ AbfsInputStream inputStream = Mockito.spy(new AbfsInputStream(
mockAbfsClient,
null,
FORWARD_SLASH + fileName,
- THREE_KB,
+ THREE_KB + ONE_KB, // First read will always bypass readahead
Review Comment:
The comment suggests this change is tied to the first read bypassing
readahead, but increasing the file size by `ONE_KB` doesn't clearly explain
that relationship. The change appears to adjust the test data size rather
than directly exercise the first-read bypass behavior.
```suggestion
ONE_KB, // Set file size to match the read buffer size to ensure the first read bypasses readahead
```
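That said, one possible reading of the original `THREE_KB + ONE_KB` sizing (assuming the stream's read buffer is `ONE_KB`) is that the extra kilobyte feeds the first read, which now bypasses readahead, leaving `THREE_KB` for the readahead assertions:
```java
// Hypothetical size walkthrough; buffer size assumed to be ONE_KB.
int fileSize = THREE_KB + ONE_KB;    // 4 KB of test data
inputStream.read(new byte[ONE_KB]);  // first read bypasses readahead, consumes 1 KB
// THREE_KB remain for the readahead requests the test then verifies
// (e.g. via verifyReadCallCount).
```
If that is the intent, spelling it out in the inline comment would make the relationship clearer.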
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.