[
https://issues.apache.org/jira/browse/KAFKA-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490617#comment-17490617
]
RivenSun commented on KAFKA-13576:
----------------------------------
Hi [~rsivaram], [~ijuma], [~guozhang],
could you offer any suggestions?
Thanks.
> Processor.ConnectionQueueSize provides configuration & metrics,
> SelectorMetrics adds connection-register related metrics
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-13576
> URL: https://issues.apache.org/jira/browse/KAFKA-13576
> Project: Kafka
> Issue Type: Improvement
> Components: metrics, network
> Affects Versions: 3.0.0
> Reporter: RivenSun
> Assignee: Luke Chen
> Priority: Major
>
> h1. Problem:
> After all client machines were switched to the company's private BYOIP,
> producers that send messages frequently saw a significant increase in send
> latency. Producers that send messages infrequently often threw exceptions
> because fetching metadata timed out while sending. Everything was normal
> before the switch.
> h1. Root cause (RC):
> 1. The client's BYOIP lacks a DNS-PTR configuration.
> 2. When the port uses the SASL_SSL protocol, the underlying method
> SaslChannelBuilder#buildTransportLayer, reached from
> Processor#configureNewConnections, calls
> socketChannel.socket().getInetAddress().getHostName(), which triggers a
> reverse DNS lookup. If the client IP has no PTR configuration, getHostName()
> becomes time-consuming.
> 3. The steps in the processor's run method are executed serially. If
> configureNewConnections is slow, completed responses are inevitably not
> sent to the client in time, which increases the ack time the producer
> observes when sending messages.
> 4. A slow configureNewConnections also means that the elements in
> Processor.newConnections are not removed in time, which makes the
> Acceptor#assignNewConnection method take longer. assignNewConnection can
> even block in newConnections.put(socketChannel). While it is blocked, the
> Acceptor thread may reject any new TCP connection request.
> h1. Solution:
> 1. Add a DNS-PTR configuration for the client's BYOIP.
> 2. Newer Kafka versions have already fixed this problem:
> https://issues.apache.org/jira/browse/KAFKA-8562
> [https://github.com/apache/kafka/pull/10059]
> 3. Add *connection-register* related metrics to the SelectorMetrics of each
> processor’s selector.
> The method Selector#register(String id, SocketChannel socketChannel) would
> update these connection-register metrics. The metric type is expected to be
> a histogram (newHistogram), similar to the *responseQueueTimeMs* field.
> 4.
> 1) Make the queue size of Processor.newConnections configurable.
> Source code:
> {code:java}
> private[kafka] object Processor {
>   val IdlePercentMetricName = "IdlePercent"
>   val NetworkProcessorMetricTag = "networkProcessor"
>   val ListenerMetricTag = "listener"
>   val ConnectionQueueSize = 20
> }{code}
> The current value is 20 and is hard-coded, perhaps for design reasons, but
> it is still recommended to make it configurable: either a single setting,
> *queued.max.connections*, that applies to the processors of all listener
> ports, or an independent setting for the processors of each listener port,
> *listener.name.\{listenerName}.queued.max.connections*
> 2) Provide a metric for the size of each processor’s newConnections queue:
> {*}ConnectionQueueSize{*}. This metric can be modeled on the
> *ResponseQueueSize* metric maintained in RequestChannel.
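
Root-cause item 2 in the quoted description depends on a JDK detail: InetAddress#getHostName() may perform a reverse DNS lookup, while InetSocketAddress#getHostString() returns the literal address without one. The sketch below only illustrates that difference. The class and method names are hypothetical, and this is not the actual change made for KAFKA-8562.

```java
import java.net.InetSocketAddress;

public class PeerHostExample {

    // Hypothetical helper: returns the peer host without a reverse lookup.
    // For an address created from an IP literal this is the literal itself.
    static String peerHostWithoutLookup(InetSocketAddress remote) {
        return remote.getHostString();
    }

    // Hypothetical helper: the pattern described in the issue. getHostName()
    // may block on a reverse DNS (PTR) lookup when no host name is cached.
    static String peerHostWithLookup(InetSocketAddress remote) {
        return remote.getAddress().getHostName();
    }

    public static void main(String[] args) {
        // An IP literal is parsed locally, so constructing it needs no DNS.
        InetSocketAddress remote = new InetSocketAddress("10.0.0.1", 9092);
        System.out.println(peerHostWithoutLookup(remote)); // prints 10.0.0.1
    }
}
```

peerHostWithLookup is deliberately not called in main, because its duration depends on the resolver and on whether a PTR record exists for the address.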
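
Solution item 4 in the quoted description can be sketched as a bounded queue whose capacity comes from configuration and whose current size is exposed for a gauge. This is a minimal illustration, not Kafka's implementation: the class name, the String connection ids, and the use of a non-blocking offer are assumptions, and queuedMaxConnections stands in for the proposed *queued.max.connections* setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ConnectionQueueSketch {

    // Stand-in for Processor.newConnections; ids replace real SocketChannels.
    private final BlockingQueue<String> newConnections;

    // queuedMaxConnections is a hypothetical config value (hard-coded 20 today).
    ConnectionQueueSketch(int queuedMaxConnections) {
        this.newConnections = new ArrayBlockingQueue<>(queuedMaxConnections);
    }

    // Acceptor side: offer() returns false when the queue is full instead of
    // blocking the way put() would.
    boolean tryAssign(String connectionId) {
        return newConnections.offer(connectionId);
    }

    // Processor side: drain everything queued so far in one pass.
    List<String> configureNewConnections() {
        List<String> drained = new ArrayList<>();
        newConnections.drainTo(drained);
        return drained;
    }

    // Value a ConnectionQueueSize gauge could report.
    int connectionQueueSize() {
        return newConnections.size();
    }

    public static void main(String[] args) {
        ConnectionQueueSketch queue = new ConnectionQueueSketch(2);
        queue.tryAssign("conn-1");
        queue.tryAssign("conn-2");
        System.out.println(queue.tryAssign("conn-3"));     // false: queue is full
        System.out.println(queue.connectionQueueSize());   // 2
        System.out.println(queue.configureNewConnections().size()); // 2
        System.out.println(queue.connectionQueueSize());   // 0
    }
}
```

A gauge that stays close to the configured capacity would indicate that the processor is draining the queue too slowly, which is the situation described in root-cause item 4.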
--
This message was sent by Atlassian Jira
(v8.20.1#820001)