mapshen commented on issue #4686: URL: https://github.com/apache/incubator-pinot/issues/4686#issuecomment-882691896
I believe we encountered the same issue while running Pinot 0.7.1. It tried to reconnect but kept failing, so we had to restart the server to get it reconnected.

@arthuersundar Have you figured out a way to work around this?

Sample error logs:

```
2021/07/19 15:41:12.922 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zookeeper state changed (Disconnected)
2021/07/19 15:43:45.796 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Opening socket connection to server <zookeeper>/10.120.4.65:2181. Will not attempt to authenticate using SASL (unknown error)
2021/07/19 15:43:45.797 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Socket connection established, initiating session, client: /10.120.4.65:55506, server: <zookeeper>/10.120.4.65:2181
2021/07/19 15:43:45.797 WARN [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Unable to reconnect to ZooKeeper service, session 0x10148643fc40107 has expired
2021/07/19 15:43:45.797 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Unable to reconnect to ZooKeeper service, session 0x10148643fc40107 has expired, closing socket connection
2021/07/19 15:43:45.797 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zookeeper state changed (Expired)
2021/07/19 15:43:45.797 ERROR [SegmentBuildTimeLeaseExtender] [pool-5-thread-1] Failed to send lease extension for table4__0__16__20210719T1310Z
2021/07/19 15:43:45.798 INFO [ZooKeeper] [Start a Pinot [SERVER]-EventThread] Initiating client connection, connectString=<zookeeper>:2181 sessionTimeout=30000 watcher=org.apache.helix.manager.zk.ZkClient@32e529d9
2021/07/19 15:43:45.798 INFO [ClientCnxnSocket] [Start a Pinot [SERVER]-EventThread] jute.maxbuffer value is 4194304 Bytes
2021/07/19 15:43:45.798 INFO [ClientCnxn] [Start a Pinot [SERVER]-EventThread] zookeeper.request.timeout value is 0. feature enabled=
2021/07/19 15:43:45.798 INFO [ClientCnxn] [Start a Pinot [SERVER]-EventThread] EventThread shut down for session: 0x10148643fc40107
2021/07/19 15:43:45.801 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Opening socket connection to server <zookeeper>/10.120.4.65:2181. Will not attempt to authenticate using SASL (unknown error)
2021/07/19 15:43:45.801 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Socket connection established, initiating session, client: /10.120.4.65:55508, server: <zookeeper>/10.120.4.65:2181
2021/07/19 15:43:45.802 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Session establishment complete on server <zookeeper>/10.120.4.65:2181, sessionid = 0x10148643fc40108, negotiated timeout = 30000
2021/07/19 15:43:45.802 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zookeeper state changed (SyncConnected)
2021/07/19 15:43:45.802 INFO [FetchSessionHandler] [table3__0__2__20210718T0705Z] [Consumer clientId=consumer-12082, groupId=] Error sending fetch request (sessionId=1769794914, epoch=21) to node 10: org.apache.kafka.common.errors.DisconnectException.
2021/07/19 15:45:02.256 WARN [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Client session timed out, have not heard from server in 76356ms for sessionid 0x10148643fc40108
2021/07/19 15:45:02.256 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Client session timed out, have not heard from server in 76356ms for sessionid 0x10148643fc40108, closing socket connection and attempting reconnect
2021/07/19 15:45:02.257 INFO [FetchSessionHandler] [table0__0__9__20210719T1330Z] [Consumer clientId=consumer-12446, groupId=] Error sending fetch request (sessionId=1597105468, epoch=305) to node 10: org.apache.kafka.common.errors.DisconnectException.
2021/07/19 15:45:02.260 INFO [FetchSessionHandler] [table2__0__7__20210719T1407Z] [Consumer clientId=consumer-12456, groupId=] Error sending fetch request (sessionId=348900988, epoch=3) to node 10: org.apache.kafka.common.errors.DisconnectException.
2021/07/19 15:45:02.357 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zookeeper state changed (Disconnected)
2021/07/19 15:45:02.359 INFO [FetchSessionHandler] [table1_current__0__5__20210719T1358Z] [Consumer clientId=consumer-12450, groupId=] Error sending fetch request (sessionId=1078185856, epoch=2) to node 10: org.apache.kafka.common.errors.DisconnectException.
2021/07/19 15:46:19.437 INFO [AbstractCoordinator] [table1__0__40__20210719T1359Z] [Consumer clientId=consumer-12452, groupId=] Group coordinator <kafka>:9092 (id: 2147483581 rack: null) is unavailable or invalid, will attempt rediscovery
2021/07/19 15:46:19.437 INFO [FetchSessionHandler] [table1__0__40__20210719T1359Z] [Consumer clientId=consumer-12452, groupId=] Error sending fetch request (sessionId=1433443208, epoch=1) to node 10: org.apache.kafka.common.errors.DisconnectException.
2021/07/19 15:46:19.538 INFO [AbstractCoordinator] [table1__0__40__20210719T1359Z] [Consumer clientId=consumer-12452, groupId=] Discovered group coordinator <kafka>:9092 (id: 2147483581 rack: null)
2021/07/19 15:47:38.527 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Opening socket connection to server <zookeeper>/10.120.4.65:2181. Will not attempt to authenticate using SASL (unknown error)
2021/07/19 15:47:38.527 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Socket connection established, initiating session, client: /10.120.4.65:55622, server: <zookeeper>/10.120.4.65:2181
2021/07/19 15:47:38.528 WARN [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Unable to reconnect to ZooKeeper service, session 0x10148643fc40108 has expired
2021/07/19 15:47:38.528 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Unable to reconnect to ZooKeeper service, session 0x10148643fc40108 has expired, closing socket connection
2021/07/19 15:47:38.528 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zookeeper state changed (Expired)
2021/07/19 15:47:38.528 INFO [ZooKeeper] [Start a Pinot [SERVER]-EventThread] Initiating client connection, connectString=<zookeeper>:2181 sessionTimeout=30000 watcher=org.apache.helix.manager.zk.ZkClient@32e529d9
2021/07/19 15:47:38.531 INFO [ClientCnxnSocket] [Start a Pinot [SERVER]-EventThread] jute.maxbuffer value is 4194304 Bytes
2021/07/19 15:47:38.531 INFO [ClientCnxn] [Start a Pinot [SERVER]-EventThread] zookeeper.request.timeout value is 0. feature enabled=
2021/07/19 15:47:38.532 INFO [ClientCnxn] [Start a Pinot [SERVER]-EventThread] EventThread shut down for session: 0x10148643fc40108
2021/07/19 15:47:38.534 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Opening socket connection to server <zookeeper>/10.120.4.65:2181. Will not attempt to authenticate using SASL (unknown error)
2021/07/19 15:47:38.534 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Socket connection established, initiating session, client: /10.120.4.65:55628, server: <zookeeper>/10.120.4.65:2181
2021/07/19 15:47:38.535 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Session establishment complete on server <zookeeper>/10.120.4.65:2181, sessionid = 0x10148643fc40109, negotiated timeout = 30000
2021/07/19 15:47:38.535 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zookeeper state changed (SyncConnected)
2021/07/19 15:49:00.015 WARN [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Client session timed out, have not heard from server in 81480ms for sessionid 0x10148643fc40109
2021/07/19 15:49:00.016 INFO [ClientCnxn] [Start a Pinot [SERVER]-SendThread(<zookeeper>:2181)] Client session timed out, have not heard from server in 81480ms for sessionid 0x10148643fc40109, closing socket connection and attempting reconnect
2021/07/19 15:50:16.493 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zookeeper state changed (Disconnected)
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org