[ https://issues.apache.org/jira/browse/GEODE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anilkumar Gingade updated GEODE-9664:
-------------------------------------
    Labels:   (was: needsTriage)

> Two different clients with the same durable id will both connect to the
> servers and receive messages
> ----------------------------------------------------------------------------------------------------
>
> Key: GEODE-9664
> URL: https://issues.apache.org/jira/browse/GEODE-9664
> Project: Geode
> Issue Type: Bug
> Components: client queues
> Reporter: Barrett Oglesby
> Priority: Major
>
> There are two cases:
> # The number of queues is the same as the number of servers (e.g. client
> with subscription-redundancy=1 and 2 servers)
> # The number of queues is less than the number of servers (e.g. client with
> subscription-redundancy=0 and 2 servers)
> h2. Case 1
> In this case, the client first attempts to connect to the primary and fails.
> {noformat}
> [warn 2021/10/01 14:37:56.209 PDT server-1 <Client Queue Initialization Thread 1> tid=0x4b] XXX CacheClientNotifier.registerClientInternal about to register clientProxyMembershipID=identity(127.0.0.1(client-a-2:89832:loner):61596:fad3ca3d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a; timeout=300])
> [warn 2021/10/01 14:37:56.209 PDT server-1 <Client Queue Initialization Thread 1> tid=0x4b] XXX CacheClientNotifier.registerClientInternal existing proxy=CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a; timeout=300]); port=61581; primary=true; version=GEODE 1.15.0]
> [warn 2021/10/01 14:37:56.210 PDT server-1 <Client Queue Initialization Thread 1> tid=0x4b] XXX CacheClientNotifier.registerClientInternal existing proxy isPaused=false
> [warn 2021/10/01 14:37:56.210 PDT server-1 <Client Queue Initialization Thread 1> tid=0x4b] The requested durable client has the same identifier ( client-a ) as an existing durable client (
> CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a; timeout=300]); port=61581; primary=true; version=GEODE 1.15.0] ). Duplicate durable clients are not allowed.
> [warn 2021/10/01 14:37:56.210 PDT server-1 <Client Queue Initialization Thread 1> tid=0x4b] CacheClientNotifier: Unsuccessfully registered client with identifier identity(127.0.0.1(client-a-2:89832:loner):61596:fad3ca3d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a; timeout=300]) and response code 64
> {noformat}
> It then attempts to connect to the secondary and succeeds.
> {noformat}
> [warn 2021/10/01 14:37:56.215 PDT server-2 <Client Queue Initialization Thread 1> tid=0x47] XXX CacheClientNotifier.registerClientInternal about to register clientProxyMembershipID=identity(127.0.0.1(client-a-2:89832:loner):61596:fad3ca3d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a; timeout=300])
> [warn 2021/10/01 14:37:56.215 PDT server-2 <Client Queue Initialization Thread 1> tid=0x47] XXX CacheClientNotifier.registerClientInternal existing proxy=CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a; timeout=300]); port=61578; primary=false; version=GEODE 1.15.0]
> [warn 2021/10/01 14:37:56.216 PDT server-2 <Client Queue Initialization Thread 1> tid=0x47] XXX CacheClientNotifier.registerClientInternal existing proxy isPaused=true
> [warn 2021/10/01 14:37:56.217 PDT server-2 <Client Queue Initialization Thread 1> tid=0x47] XXX CacheClientNotifier.registerClientInternal reinitialized existing proxy=CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a; timeout=300]); port=61578; primary=true; version=GEODE 1.15.0]
> {noformat}
> The previous secondary is reinitialized and made into a primary. Both queues
> will dispatch events.
> The CacheClientNotifier.registerClientInternal method invoked when a client
> connects does:
> {noformat}
> if (cacheClientProxy.isPaused()) {
>   ...
>   cacheClientProxy.reinitialize(...);
> } else {
>   unsuccessfulMsg = String.format("The requested durable client has the same identifier ( %s ) as an existing durable client...);
>   logger.warn(unsuccessfulMsg);
> }
> {noformat}
> The CacheClientProxy is paused when the durable client it represents has
> disconnected. Unfortunately, a secondary CacheClientProxy is also paused. So,
> this check is not good enough to prevent a duplicate durable client from
> connecting.
> There are a few things that can also be checked. One of them is:
> {noformat}
> cacheClientProxy.getCommBuffer() == null
> {noformat}
> With that check added, when the client attempts to connect to the secondary,
> it fails just like it does with the primary.
> The client then exits with this exception:
> {noformat}
> geode.cache.NoSubscriptionServersAvailableException: org.apache.geode.cache.NoSubscriptionServersAvailableException: Could not initialize a primary queue on startup. No queue servers available.
> at org.apache.geode.cache.client.internal.QueueManagerImpl.getAllConnections(QueueManagerImpl.java:191)
> at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnQueuesAndReturnPrimaryResult(OpExecutorImpl.java:428)
> at org.apache.geode.cache.client.internal.PoolImpl.executeOnQueuesAndReturnPrimaryResult(PoolImpl.java:875)
> at org.apache.geode.cache.client.internal.RegisterInterestOp.execute(RegisterInterestOp.java:58)
> at org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterest(ServerRegionProxy.java:364)
> at org.apache.geode.internal.cache.LocalRegion.processSingleInterest(LocalRegion.java:3815)
> at org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3911)
> at org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3890)
> at org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3885)
> at org.apache.geode.cache.Region.registerInterestForAllKeys(Region.java:1709)
> Caused by: org.apache.geode.cache.NoSubscriptionServersAvailableException: Could not initialize a primary queue on startup. No queue servers available.
> at org.apache.geode.cache.client.internal.QueueManagerImpl.initializeConnections(QueueManagerImpl.java:575)
> at org.apache.geode.cache.client.internal.QueueManagerImpl.start(QueueManagerImpl.java:293)
> at org.apache.geode.cache.client.internal.PoolImpl.start(PoolImpl.java:359)
> at org.apache.geode.cache.client.internal.PoolImpl.finishCreate(PoolImpl.java:183)
> at org.apache.geode.cache.client.internal.PoolImpl.create(PoolImpl.java:169)
> at org.apache.geode.internal.cache.PoolFactoryImpl.create(PoolFactoryImpl.java:378)
> {noformat}
> h2. Case 2
> In this case, the client first attempts to connect to the primary and fails
> just like case 1.
> It then attempts to connect to a server with no existing queue and succeeds.
> {noformat}
> [warn 2021/10/01 15:02:50.798 PDT server-1 <Client Queue Initialization Thread 1> tid=0x54] XXX CacheClientNotifier.registerClientInternal about to register clientProxyMembershipID=identity(127.0.0.1(client-a-2:91683:loner):62089:24a2e13d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a; timeout=300])
> [warn 2021/10/01 15:02:50.799 PDT server-1 <Client Queue Initialization Thread 1> tid=0x54] XXX CacheClientNotifier.registerClientInternal existing proxy=null
> [warn 2021/10/01 15:02:50.810 PDT server-1 <Client Queue Initialization Thread 1> tid=0x54] XXX CacheClientNotifier.registerClientInternal created proxy=CacheClientProxy[identity(127.0.0.1(client-a-2:91683:loner):62089:24a2e13d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a; timeout=300]); port=62094; primary=true; version=GEODE 1.15.0]
> {noformat}
> One way to address this case would be to prevent the durable client from
> retrying on another server if it can't connect to the primary.
> That would have to be addressed in QueueManagerImpl.initializeConnections.
> That method would have to know that the server refused the connection
> (ServerRefusedConnectionException) and then return out of that method.
> That's a bit more work since that method currently doesn't get any exceptions
> from initializeQueueConnection, which does:
> {noformat}
> } catch (Exception e) {
>   if (logger.isDebugEnabled()) {
>     logger.debug("error creating subscription connection to server {}",
>         connection.getEndpoint(), e);
>   }
> }
> {noformat}
> An exception handler like this would need to be added:
> {noformat}
> } catch (ServerRefusedConnectionException e) {
>   throw e;
> }
> {noformat}
> QueueManagerImpl.initializeConnections would have to handle that exception in
> a few places. I'm not sure exactly what should be done in that method.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
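
The two ideas in the description (the extra getCommBuffer() check for Case 1, and rethrowing the refusal for Case 2) can be modelled in isolation as below. This is only a sketch and not Geode code: Proxy, Server, RefusedException, the method names, and the index-based return value are all hypothetical stand-ins, and it assumes, as the description implies, that a secondary proxy for a still-connected client has a non-null comm buffer.

```java
import java.util.List;

/** Illustrative stand-ins only; none of these types are Geode's real classes. */
class DurableClientSketch {

  /** Hypothetical, minimal view of a server-side proxy for a durable client. */
  static final class Proxy {
    final boolean paused;    // true for a disconnected client and also for a secondary queue
    final Object commBuffer; // assumed null only when no client connection is attached

    Proxy(boolean paused, Object commBuffer) {
      this.paused = paused;
      this.commBuffer = commBuffer;
    }
  }

  /** Check as quoted in the issue: a paused proxy is always reinitialized. */
  static boolean acceptsWithPausedCheckOnly(Proxy existing) {
    return existing == null || existing.paused;
  }

  /** Suggested variant: additionally require that no comm buffer is attached. */
  static boolean acceptsWithCommBufferCheck(Proxy existing) {
    return existing == null || (existing.paused && existing.commBuffer == null);
  }

  /** Stand-in for ServerRefusedConnectionException. */
  static final class RefusedException extends RuntimeException {
    RefusedException(String message) {
      super(message);
    }
  }

  /** Hypothetical server handle; register throws RefusedException for a duplicate id. */
  interface Server {
    void register(String durableId);
  }

  /**
   * Case 2 idea: a refusal is rethrown so the caller stops, while any other
   * failure still falls through to the next server.
   * Returns the index of the server that accepted, or -1 if none did.
   */
  static int connect(List<Server> servers, String durableId) {
    for (int i = 0; i < servers.size(); i++) {
      try {
        servers.get(i).register(durableId);
        return i;
      } catch (RefusedException e) {
        throw e; // do not move on to a server that has no queue for this id
      } catch (RuntimeException e) {
        // unreachable or otherwise failing server: try the next one
      }
    }
    return -1;
  }
}
```

In this model a secondary proxy (paused, comm buffer present) passes the paused-only check but is rejected by the stricter one, and a refusal from the first server stops the loop before the second server, which has no existing proxy, is ever asked.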