Jungtaek Lim created SPARK-51667:
------------------------------------
Summary: [TWS + Python] Disable Nagle's algorithm between Python
worker and State Server
Key: SPARK-51667
URL: https://issues.apache.org/jira/browse/SPARK-51667
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 4.0.0, 4.1.0
Reporter: Jungtaek Lim
While testing TWS + Python, we found cases where the socket
communication for state interaction was delayed by more than 40ms for certain
types of state operations, e.g. ListState.put(), ListState.get(),
ListState.appendList(), etc.
The root cause turned out to be the combination of Nagle's algorithm and
delayed ACK. The sequence is as follows:
# Python worker sends the proto message to JVM, and flushes the socket.
# Additionally, Python worker sends the follow-up data to JVM, and flushes the
socket.
# JVM reads the proto message, and realizes there is follow-up data.
# JVM reads the follow-up data.
# JVM processes the request, and sends the response back to Python worker.
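The write-write-read pattern in the steps above can be sketched as follows. This is only an illustration under stated assumptions: the 4-byte length header stands in for the real proto message, and serve_once, request, recv_exact and run_demo are hypothetical names, not the actual Spark code. It shows the shape of the exchange only; it does not measure latency and may not reproduce the stall on every run.

```python
import socket
import struct
import threading


def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from conn, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before sending all data")
        buf.extend(chunk)
    return bytes(buf)


def serve_once(server: socket.socket, received: list) -> None:
    """Hypothetical stand-in for the JVM side of one request."""
    conn, _ = server.accept()
    with conn:
        # Step 3: read the header (stand-in for the proto message).
        (size,) = struct.unpack("!I", recv_exact(conn, 4))
        # Step 4: read the follow-up data announced by the header.
        received.append(recv_exact(conn, size))
        # Step 5: send the response back.
        conn.sendall(b"OK")


def request(port: int, payload: bytes) -> bytes:
    """Hypothetical stand-in for the Python worker side of one request."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        # Step 1: first small write (the header).
        sock.sendall(struct.pack("!I", len(payload)))
        # Step 2: second small write (the follow-up data). With Nagle's
        # algorithm enabled, this segment can be held back until the
        # first write has been ACKed.
        sock.sendall(payload)
        return recv_exact(sock, 2)


def run_demo(payload: bytes = b"follow-up data"):
    """Run one request/response round trip over localhost."""
    server = socket.create_server(("127.0.0.1", 0))
    port = server.getsockname()[1]
    received = []
    worker = threading.Thread(target=serve_once, args=(server, received))
    worker.start()
    reply = request(port, payload)
    worker.join()
    server.close()
    return reply, received[0]
```

The key point is that steps 1 and 2 are two separate small writes with no read in between, which is the pattern that interacts badly with Nagle's algorithm and delayed ACK.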
Due to delayed ACK, even after step 3, the ACK is not sent back from the JVM
to the Python worker. The JVM side is waiting for some outgoing data or for
multiple ACKs to be sent together, but the JVM is not going to send any data
during that phase.
Due to Nagle's algorithm, the message from step 2 is not sent to the JVM,
since the message from step 1 has not been ACKed yet.
This deadlock is resolved only when the delayed ACK timer expires, which takes
40ms (the minimum duration) on Linux. After the timeout, the ACK is sent back
from the JVM to the Python worker, and Nagle's algorithm finally allows the
message from step 2 to be sent to the JVM.
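As a minimal sketch of what disabling Nagle's algorithm looks like on a Python socket, the standard TCP_NODELAY option can be set as below. The helper names connect_without_nagle and nagle_disabled are hypothetical; which side(s) of the connection the actual change touches is not stated in this issue.

```python
import socket


def connect_without_nagle(host: str, port: int) -> socket.socket:
    """Open a TCP connection and disable Nagle's algorithm on it."""
    sock = socket.create_connection((host, port))
    # TCP_NODELAY=1 makes small writes go out immediately instead of
    # waiting for earlier unacknowledged data to be ACKed.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Return True if TCP_NODELAY is set on the given socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

On the JVM side, the analogous standard call would be java.net.Socket.setTcpNoDelay(true). With the option set, the second small write no longer has to wait for the ACK of the first one.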
See the articles below for a more general explanation:
* [https://engineering.avast.io/40-millisecond-bug/]
** Start reading from the Nagle's algorithm section
* [https://brooker.co.za/blog/2024/05/09/nagle.html]
Nagle's algorithm reduces the number of small packets, which, as the above
article states, can help keep routers from being overloaded. We connect to
"localhost" here, so no router is involved.