Hi

The NameNode logs in my HDFS instance recently started logging warnings of the 
form `Requested data length 145530837 is longer than maximum configured RPC 
length 144217728`.

This ultimately manifested as the NameNode declaring thousands of blocks 
missing and 19 files corrupt.

The situation was resolved by updating `ipc.maximum.data.length` to a value 
greater than the requested data length in the warning above. That is not a 
satisfying resolution, though; I'd like to understand how the issue occurred 
in the first place.
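For reference, the change amounted to raising this property in `core-site.xml` on the NameNode (the value below is purely illustrative — anything above the requested length from the warning sufficed; it is not a recommendation):

```xml
<!-- core-site.xml: cap on the size of a single RPC message.
     145530837 was the requested length from the warning; this
     illustrative value (144 MiB) simply clears it. -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>150994944</value>
</property>
```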

I've run `hdfs fsck -files -blocks -locations` and the largest block is of 
length `1342177728`.

- Is there some per-call overhead for RPC messages? Could a block of length 
`1342177728` result in the warning quoted at the top of this post?
- My understanding is that the only ways a client writing to HDFS can specify 
a block size are passing `-Ddfs.blocksize` on the command line or setting the 
corresponding property on the `Configuration` object when initialising the 
HDFS connection. Is this correct, or are there other routes to creating 
excessively large blocks?
- Other than overly large blocks, are there any other issues that could trigger 
the warning above?
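For concreteness, these are the two routes I mean (file names and the 512 MiB size are hypothetical, just to illustrate the mechanism):

```shell
# 1. Per-command override via the generic options parser:
hdfs dfs -Ddfs.blocksize=536870912 -put local.dat /data/local.dat

# 2. Programmatic equivalent, sketched as Java in comments:
#      Configuration conf = new Configuration();
#      conf.setLong("dfs.blocksize", 536870912L);
#      FileSystem fs = FileSystem.get(conf);
#    Files created through this FileSystem then use the larger block size.
```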

Many thanks

Paul
