Jakov, I'm not sure about the coordinator / non-coordinator behavior you're describing, but I see the other behavior. It doesn't seem quite right.
Here is a detailed explanation of what I see.

LocatorLoadSnapshot.getServerForConnection increments the LoadHolder's connections, which updates its load. In the receiver's case, getServerForConnection is called by the remote sender to get a server to connect to (as you said, using a ClientConnectionRequest).

Normally, the LoadHolder's load is updated when load is received in LocatorLoadSnapshot.updateMap (also as you said, using a CacheServerLoadMessage). This doesn't happen in the receiver's case.

Locator behavior:

When the receiver connects, it sends its profile to the locator, which causes its LoadHolder to be added to the __recv__group map. It is not added to the null group map. That's the map for normal servers that have no group. Based on the current implementation, it can't be added to this map or it would be used for normal local (to the receiver) client connections.

LocatorLoadSnapshot.addServer location=192.168.1.5:5409; groups=[__recv__group]
LocatorLoadSnapshot.addGroups group=__recv__group; location=192.168.1.5:5409
LocatorLoadSnapshot.addGroups not adding to the null map group=__recv__group; location=192.168.1.5:5409

When the load is received for the receiver, it is ignored. updateMap gets the LoadHolder from the null group map. Since that map was not updated for the receiver, there is no entry for it (holder=null below).

LocatorLoadSnapshot.updateLoad about to update connectionLoadMap location=192.168.1.5:5409
LocatorLoadSnapshot.updateMap location=192.168.1.5:5409; load=0.0; loadPerConnection=0.00125
LocatorLoadSnapshot.updateMap location=192.168.1.5:5409; load=0.0; loadPerConnection=0.00125; holder=null
LocatorLoadSnapshot.updateLoad ignoring load location=192.168.1.5:5409
LocatorLoadSnapshot.updateLoad done update connectionLoadMap location=192.168.1.5:5409

So, load is not updated in this way for a receiver.

When a request for a remote receiver is received, the locator uses the __recv__group load to provide that server.
It also increments the load to that server (in LoadHolder.incConnections). This is how load is updated for a receiver.

LocatorLoadSnapshot.getServerForConnection group=__recv__group
LocatorLoadSnapshot.getServerForConnection group=__recv__group; potentialServers={192.168.1.5:5409@192.168.1.5(ln-1:81083)<v1>:41002=LoadHolder[0.0, 192.168.1.5:5409, loadPollInterval=5000, 0.00125]}
LoadHolder.incConnections location=192.168.1.5:5409; load=0.00125
LocatorLoadSnapshot.getServerForConnection group=__recv__group; usingServer=192.168.1.5:5409

Receiver server behavior:

When a receiver gets a new connection, ServerConnection.processHandShake updates the LoadMonitor. The LoadMonitor explicitly does not update the connection count because isClientOperations=false, so the load is never changed on the server. This is interesting behavior. I'm not sure whether it was left out because it was unnecessary given the locator behavior above.

ServerConnection.processHandShake about to update the LoadMonitor communicationMode=gateway
LoadMonitor.connectionOpened isClientOperations=false; type=GatewayReceiverStatistics
LoadMonitor.connectionOpened did not increment connectionCount=0; type=GatewayReceiverStatistics
ServerConnection.processHandShake done update the LoadMonitor

The LoadMonitor on the server does send the load periodically to the locator, but only because skippedLoadUpdates>forceUpdateFrequency (which is 10 times through the polling loop by default):

PollingThread.run got load type=GatewayReceiverStatistics; load=Load(0.0, 0.00125, 0.0, 1.0)
PollingThread.run forceUpdateFrequency=true
PollingThread.run about to send CacheServerLoadMessage type=GatewayReceiverStatistics; load=Load(0.0, 0.00125, 0.0, 1.0)

So, the load sent by the server is not accurate, and it's ignored in the locator.

I haven't had a chance to look at your PR yet.
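
To make the locator-side behavior above easier to follow, here is a rough, self-contained sketch of it. It is a toy model, not Geode's LocatorLoadSnapshot: the class ReceiverLoadSketch, its Holder type, and the method addReceiver are hypothetical names, and picking the least-loaded receiver is an assumption made only so the example has some selection rule.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the locator-side bookkeeping described above. Every name here is
// hypothetical; this is not Geode's LocatorLoadSnapshot.
class ReceiverLoadSketch {

  // Hypothetical stand-in for a LoadHolder: a load value plus a per-connection increment.
  static final class Holder {
    double load;
    final double loadPerConnection;

    Holder(double load, double loadPerConnection) {
      this.load = load;
      this.loadPerConnection = loadPerConnection;
    }

    void incConnections() {
      load += loadPerConnection;
    }
  }

  // Map for normal servers that have no group.
  private final Map<String, Holder> nullGroupMap = new HashMap<>();
  // Map for servers registered under the receiver group.
  private final Map<String, Holder> receiverGroupMap = new HashMap<>();

  // A receiver is registered only in the receiver group map, never in the null group map.
  public void addReceiver(String location, double loadPerConnection) {
    receiverGroupMap.put(location, new Holder(0.0, loadPerConnection));
  }

  // Load messages are looked up in the null group map, so a receiver's load is ignored.
  // Returns true only if the update was applied.
  public boolean updateLoad(String location, double load) {
    Holder holder = nullGroupMap.get(location);
    if (holder == null) {
      return false;
    }
    holder.load = load;
    return true;
  }

  // Assumption: choose the least-loaded receiver, then count the new connection against it.
  public String getServerForConnection() {
    String best = null;
    for (Map.Entry<String, Holder> entry : receiverGroupMap.entrySet()) {
      if (best == null || entry.getValue().load < receiverGroupMap.get(best).load) {
        best = entry.getKey();
      }
    }
    if (best != null) {
      receiverGroupMap.get(best).incConnections();
    }
    return best;
  }

  public double receiverLoad(String location) {
    return receiverGroupMap.get(location).load;
  }

  public static void main(String[] args) {
    ReceiverLoadSketch locator = new ReceiverLoadSketch();
    locator.addReceiver("192.168.1.5:5409", 0.00125);

    // The connection request is the only thing that raises the receiver's load.
    System.out.println(locator.getServerForConnection());
    System.out.println(locator.receiverLoad("192.168.1.5:5409"));

    // A later load report of 0.0 is dropped, so the locator's value never comes back down.
    System.out.println(locator.updateLoad("192.168.1.5:5409", 0.0));
    System.out.println(locator.receiverLoad("192.168.1.5:5409"));
  }
}
```

In this model the receiver's load only ever grows: each connection request adds loadPerConnection, and nothing the server reports afterwards can reduce it, which matches the holder=null log output above.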

Barry

________________________________
From: Jakov Varenina <jakov.varen...@est.tech>
Sent: Thursday, March 10, 2022 1:21 AM
To: dev@geode.apache.org <dev@geode.apache.org>
Subject: Question related to gateway-receivers connection load balancing

Hi devs,

We have observed some weird behavior related to load balancing of gateway-receiver connections in the Geode cluster. Currently, gateway-receiver connection load is only updated on the coordinator locator, when it provides a server location to the remote gateway-sender in the ClientConnectionRequest{group=__recv_group...}/ClientConnectionResponse message exchange. Other locators never update gateway-receiver connection load, since they do not handle these messages. Additionally, locators (including the coordinator) ignore CacheServerLoadMessage messages that carry the receiver's connection load. This means that locators will not adjust the load when a connection on some receiver is shut down.

Is this expected behavior, or is this a bug?

You can find more information in this PR:
https://github.com/apache/geode/pull/7378#issuecomment-1048513322

and ticket:
https://issues.apache.org/jira/browse/GEODE-10056

Thanks,
Jakov