lhotari commented on PR #15908: URL: https://github.com/apache/pulsar/pull/15908#issuecomment-4334026899
It turns out that the ZooKeeper TCP keepalive settings contain a major gap.

docs: https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#:~:text=tcpKeepAlive%20%3A%20(Java%20system,NIOServerCnxnFactory%20is%20used.

> tcpKeepAlive : (Java system property: zookeeper.tcpKeepAlive) New in 3.5.4: Setting this to true sets the TCP keepAlive flag on the sockets used by quorum members to perform elections. This will allow for connections between quorum members to remain up when there is network infrastructure that may otherwise break them. Some NATs and firewalls may terminate or lose state for long-running or idle connections. Enabling this option relies on OS level settings to work properly, check your operating system's options regarding TCP keepalive for more information. Defaults to false.
>
> clientTcpKeepAlive : (Java system property: zookeeper.clientTcpKeepAlive) New in 3.6.1: Setting this to true sets the TCP keepAlive flag on the client sockets. Some broken network infrastructure may lose the FIN packet that is sent from closing client. These never closed client sockets cause OS resource leak. Enabling this option terminates these zombie sockets by idle check. Enabling this option relies on OS level settings to work properly, check your operating system's options regarding TCP keepalive for more information. Defaults to false. Please note the distinction between it and tcpKeepAlive. It is applied for the client sockets while tcpKeepAlive is for the sockets used by quorum members. Currently this option is only available when default NIOServerCnxnFactory is used.

The part "Please note the distinction between it and tcpKeepAlive. It is applied for the client sockets while tcpKeepAlive is for the sockets used by quorum members. Currently this option is only available when default NIOServerCnxnFactory is used." implies two things:
- the `clientTcpKeepAlive` setting is only applied on the server side, to the sockets accepted from clients, and it only takes effect with `NIOServerCnxnFactory`
- there is no setting to enable TCP keep-alive on the client side
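
To illustrate what "TCP keep-alive on the client side" would mean at the socket level, here is a minimal sketch using only the standard `java.nio` API. It is not ZooKeeper or Pulsar code: the class name `ClientKeepAliveSketch` and the method `enableKeepAliveAndReadBack` are hypothetical, and the sketch does not show how such an option could be wired into the ZooKeeper client. It only sets `SO_KEEPALIVE` on an unconnected client channel and reads the value back.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;

/** Hypothetical illustration only; not part of ZooKeeper or Pulsar. */
class ClientKeepAliveSketch {

    /**
     * Opens an unconnected client channel, enables SO_KEEPALIVE on it,
     * and returns the value the socket reports back.
     */
    static boolean enableKeepAliveAndReadBack() {
        try (SocketChannel channel = SocketChannel.open()) {
            // The option may be set before connect(); it stays in effect
            // for the connection that is established afterwards.
            channel.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
            return channel.getOption(StandardSocketOptions.SO_KEEPALIVE);
        } catch (IOException e) {
            // Wrapped so that callers do not need a checked-exception clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SO_KEEPALIVE enabled: " + enableKeepAliveAndReadBack());
    }
}
```

As the quoted documentation notes for the server-side options, the flag alone only asks the OS to send probes. The idle time before the first probe and the probe interval come from OS-level settings, for example `net.ipv4.tcp_keepalive_time` on Linux.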
