snvijaya commented on a change in pull request #2368:
URL: https://github.com/apache/hadoop/pull/2368#discussion_r530768704
##########
File path:
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java
##########
@@ -223,16 +244,19 @@ private int readInternal(final long position, final
byte[] b, final int offset,
// queue read-aheads
int numReadAheads = this.readAheadQueueDepth;
- long nextSize;
long nextOffset = position;
+ // First read to queue needs to be of readBufferSize and later
Review comment:
In current prod behaviour, the read buffer size config can be modified, the
read-ahead buffer size is fixed, and read-aheads are issued by read buffer
size. It functions as shown in the attached picture. This needs fixing.
As for deprecating the read buffer size config and using only the new
read-ahead buffer size config: the read buffer size config has been available
since GA, so deprecating it would not be feasible. (It might also be confusing
for clients who disable readAheads to have to use the read-ahead buffer size
for their reads.)
As for the number of different configs present for read: configs 1 and 2
were already present, and this PR introduces 3 and 4, for a total of 4
configs.
1. fs.azure.read.request.size
2. fs.azure.readaheadqueue.depth
3. fs.azure.read.alwaysReadBufferSize => For Gen1 migrating customers
4. fs.azure.read.readahead.blocksize => Needed fixing long ago, as nothing
validates that 4 MB is the right size for all workloads. It should be
modifiable just as the read buffer size is.
All these changes are being added based on various customer issues and
experiences that we are dealing with. Exposing these options as configs lets
customers test various combinations themselves, instead of us spending time
providing patches for each combination, and that saves dev time we can use to
improve the service. As you can see in the PR, the defaults introduced for
these configs retain the current prod behavior.
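
To make the interaction of the size and depth configs concrete, here is a
minimal sketch of how queued read ranges could be computed. It is not the code
in AbfsInputStream: the class and method names are hypothetical, and it assumes
(going by the truncated diff comment above) that the first queued read uses the
read buffer size and later ones use the read-ahead block size.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of hadoop-azure.
class ReadAheadPlanSketch {

    /**
     * Returns the {offset, length} pairs that would be queued.
     * Assumption: the first read is readBufferSize bytes, every later read is
     * readAheadBlockSize bytes, and at most queueDepth reads are queued.
     */
    static List<long[]> plan(long position, long contentLength,
                             int readBufferSize, int readAheadBlockSize,
                             int queueDepth) {
        List<long[]> ranges = new ArrayList<>();
        long nextOffset = position;
        long nextSize = readBufferSize; // first queued read uses the request size
        for (int i = 0; i < queueDepth && nextOffset < contentLength; i++) {
            // Never read past the end of the file.
            long size = Math.min(nextSize, contentLength - nextOffset);
            ranges.add(new long[] {nextOffset, size});
            nextOffset += size;
            nextSize = readAheadBlockSize; // later reads use the read-ahead block size
        }
        return ranges;
    }

    public static void main(String[] args) {
        final int mb = 1024 * 1024;
        // 20 MB file, 8 MB read buffer, 4 MB read-ahead blocks, queue depth 3.
        for (long[] r : plan(0, 20L * mb, 8 * mb, 4 * mb, 3)) {
            System.out.println("offset=" + r[0] + " length=" + r[1]);
        }
    }
}
```

With those example values the sketch queues one 8 MB read at offset 0 followed
by two 4 MB reads at offsets 8 MB and 12 MB. Setting both sizes to the same
value gives uniform reads.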
