[GitHub] [hadoop] virajjasani commented on pull request #4107: HDFS-16521. DFS API to retrieve slow datanodes

GitBox Sun, 10 Apr 2022 02:49:49 -0700


virajjasani commented on PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#issuecomment-1094233257


   To provide more insight: 
[FanOutOneBlockAsyncDFSOutput](https://github.com/apache/hbase/blob/master/hbase-asyncfs/src/main/java/org/apache/hadoop/hbase/io/asyncfs/FanOutOneBlockAsyncDFSOutput.java)
 in HBase currently has to rely on its own way of marking and excluding slow 
nodes while 1) creating pipelines and 2) handling acks, based on factors like 
the data length of the packet, the processing time relative to the last ack 
timestamp, whether the flush to replicas has finished, etc. If it could use a 
slow-node API from HDFS to exclude nodes appropriately while writing a block, 
much of its own post-ack computation of slow nodes could be _saved_ or 
_improved_, or, based on further experiments, we could find a _better solution_ 
for managing slow node detection logic in both HDFS and HBase. However, in 
order to collect more data points and run more POCs in this area, we should at 
least expect HDFS to provide an API that lets downstream projects efficiently 
use slow-node info for such critical low-latency use cases (like writing WALs).
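
For illustration only, here is a minimal sketch of the kind of client-side, ack-based bookkeeping described above. It is not the actual FanOutOneBlockAsyncDFSOutput logic and does not use any real HBase or HDFS API: the class name `SlowNodeTracker`, the per-KB latency threshold, and the strike counting are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical slow-node tracker. All names and thresholds here are
 * illustrative assumptions, not taken from HBase or HDFS.
 */
public class SlowNodeTracker {
    private final long minPacketBytes;   // packets smaller than this are ignored
    private final double maxMillisPerKb; // ack latency budget per KB of packet data
    private final int strikesToExclude;  // consecutive slow acks before exclusion
    private final Map<String, Integer> strikes = new HashMap<>();

    public SlowNodeTracker(long minPacketBytes, double maxMillisPerKb, int strikesToExclude) {
        this.minPacketBytes = minPacketBytes;
        this.maxMillisPerKb = maxMillisPerKb;
        this.strikesToExclude = strikesToExclude;
    }

    /** Records one ack, using the packet length and the time elapsed since it was sent. */
    public void onAck(String datanode, long packetBytes, long sentAtMillis, long ackAtMillis) {
        if (packetBytes < minPacketBytes) {
            return; // too small for latency per KB to be meaningful
        }
        double millisPerKb = (ackAtMillis - sentAtMillis) / (packetBytes / 1024.0);
        if (millisPerKb > maxMillisPerKb) {
            strikes.merge(datanode, 1, Integer::sum); // one more consecutive slow ack
        } else {
            strikes.remove(datanode); // a fast ack resets the count
        }
    }

    /** Nodes a writer could exclude the next time it builds a pipeline. */
    public Set<String> excludedNodes() {
        Set<String> excluded = new HashSet<>();
        for (Map.Entry<String, Integer> e : strikes.entrySet()) {
            if (e.getValue() >= strikesToExclude) {
                excluded.add(e.getKey());
            }
        }
        return excluded;
    }
}
```

With a slow-node API on the HDFS side, a writer could presumably seed or replace such an exclusion set with what the cluster already knows, instead of recomputing it from its own acks.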
   
   cc @jojochuang @saintstack 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


