[
https://issues.apache.org/jira/browse/HADOOP-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17636570#comment-17636570
]
ASF GitHub Bot commented on HADOOP-18533:
-----------------------------------------
huxinqiu commented on PR #5151:
URL: https://github.com/apache/hadoop/pull/5151#issuecomment-1321736995
> @huxinqiu Thank you very much for your contribution!
>
> We need to discuss something:
>
> 1. It seems that the benefit is to avoid declaring this variable
`ResponseBuffer`, bringing the initialized 1024 byte.
> Then we moved the original internal calculation code directly to the
outside.
>
> > modified code
>
> ```
> int computedSize = connectionContextHeader.getSerializedSize();
> computedSize += CodedOutputStream.computeUInt32SizeNoTag(computedSize);
> int messageSize = message.getSerializedSize();
> computedSize += messageSize;
> computedSize += CodedOutputStream.computeUInt32SizeNoTag(messageSize);
> byte[] dataLengthBuffer = new byte[4];
> dataLengthBuffer[0] = (byte)((computedSize >>> 24) & 0xFF);
> dataLengthBuffer[1] = (byte)((computedSize >>> 16) & 0xFF);
> dataLengthBuffer[2] = (byte)((computedSize >>> 8) & 0xFF);
> dataLengthBuffer[3] = (byte)(computedSize & 0xFF);
> ```
>
> > The original calculation code is like this
connectionContextHeader.writeDelimitedTo(buf)
>
> ```
> int serialized = this.getSerializedSize();
> int bufferSize =
CodedOutputStream.computePreferredBufferSize(CodedOutputStream.computeRawVarint32Size(serialized)
+ serialized);
> CodedOutputStream codedOutput = CodedOutputStream.newInstance(output,
bufferSize);
> codedOutput.writeRawVarint32(serialized);
> this.writeTo(codedOutput);
> codedOutput.flush();
> ```
>
> > ResponseBuffer#setSize
>
> ```
> @Override
> public int size() {
> return count - FRAMING_BYTES;
> }
> void setSize(int size) {
> buf[0] = (byte)((size >>> 24) & 0xFF);
> buf[1] = (byte)((size >>> 16) & 0xFF);
> buf[2] = (byte)((size >>> 8) & 0xFF);
> buf[3] = (byte)((size >>> 0) & 0xFF);
> }
> ```
>
> 2. code duplication
> The following calculation logic appears 3 times
>
> ```
> this.dataLengthBuffer = new byte[4];
> dataLengthBuffer[0] = (byte)((computedSize >>> 24) & 0xFF);
> dataLengthBuffer[1] = (byte)((computedSize >>> 16) & 0xFF);
> dataLengthBuffer[2] = (byte)((computedSize >>> 8) & 0xFF);
> dataLengthBuffer[3] = (byte)(computedSize & 0xFF);
> this.header = header;
> this.rpcRequest = rpcRequest;
> ```
>
> RpcProtobufRequestWithHeader#Constructor SaslRpcClient#sendSaslMessage
Client#writeConnectionContext
@slfan1989
1. Yes, IpcStreams#out is a BufferedOutputStream, which has a byte array
inside it, and protobuf's CodedOutputStream also has a byte array cache inside
to optimize writing, we don't need to aggregate dataLength, header and
rpcRequest into a ResponseBuffer which is actually a byte array.
The only extra performance cost in the RpcRequestSender thread is the
serialization of protobuf, usually the request size is only a few hundred
bytes, and the serialization will only cost tens of microseconds. So I think it
is better to first calculate the request size, and then write the dataLength,
header and rpcRequest to the BufferedOutputStream one by one, so as to avoid
requesting a 1024-byte array for each request.
2.I'll fix these code duplication afterwards
> RPC Client performance improvement
> ----------------------------------
>
> Key: HADOOP-18533
> URL: https://issues.apache.org/jira/browse/HADOOP-18533
> Project: Hadoop Common
> Issue Type: Improvement
> Components: rpc-server
> Reporter: xinqiu.hu
> Priority: Minor
> Labels: pull-request-available
>
> The current implementation copies the rpcRequest and header to a
> ByteArrayOutputStream in order to calculate the total length of the sent
> request, and then writes it to the socket buffer.
> But if the rpc engine is ProtobufRpcEngine2, we can pre-calculate the
> request size, and then send the request directly to the socket buffer,
> reducing a memory copy. And avoid allocating 1024 bytes of ResponseBuffer
> each time a request is sent.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]