Right now every IntegrationTest or DistributedTest that starts a server
and then starts a client is at risk of hitting this race and flickering,
because the CacheServer isn't truly ready when CacheServer.start()
returns. You can see these tests failing intermittently on
https://concourse.apachegeode-ci.info.

For example, see this failure:
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/301
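
For anyone weighing the isBound() idea quoted below, here is a minimal
sketch using only java.net, with no Geode classes. The class name
BoundBeforeAccept and the probe() helper are made up for illustration. It
checks when ServerSocket.isBound() flips to true and whether a client can
complete a TCP connect before accept() has ever been called. It says
nothing about whether Geode's acceptor is ready to handle the client
handshake at that point, so treat it as background, not as a fix.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of Geode.
class BoundBeforeAccept {

  /**
   * Returns three flags:
   * [0] isBound() on a freshly created, unbound ServerSocket
   * [1] isBound() after bind() on an ephemeral port
   * [2] whether a client connect completed although accept() was never called
   */
  static boolean[] probe() throws IOException {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    try (ServerSocket serverSock = new ServerSocket()) { // no-arg constructor: unbound
      boolean boundBeforeBind = serverSock.isBound();

      // Port 0 asks the OS for any free port, like cacheServer.setPort(0).
      serverSock.bind(new InetSocketAddress(loopback, 0));
      boolean boundAfterBind = serverSock.isBound();

      boolean connectedBeforeAccept;
      try (Socket client = new Socket()) {
        // The OS queues the connection in the listen backlog; nobody calls accept().
        client.connect(new InetSocketAddress(loopback, serverSock.getLocalPort()), 5000);
        connectedBeforeAccept = client.isConnected();
      }
      return new boolean[] {boundBeforeBind, boundAfterBind, connectedBeforeAccept};
    }
  }

  public static void main(String[] args) throws IOException {
    boolean[] flags = probe();
    System.out.println("bound before bind():      " + flags[0]);
    System.out.println("bound after bind():       " + flags[1]);
    System.out.println("connected before accept(): " + flags[2]);
  }
}
```

If the third flag is true on your platform, a bound socket is already
reachable at the TCP level before accept() runs, which is the property an
isBound() check in isRunning() would be relying on.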

On Thu, May 3, 2018 at 11:03 AM, Kirk Lund <kl...@apache.org> wrote:

> Anil and I were hoping that adding line 4 below would remove the race, but
> it doesn't. CacheServer.isRunning() returns true as soon as AcceptorImpl is
> non-null which is still before ServerSocket.accept() has been invoked. The
> race still exists with this...
>
> 1: CacheServer cacheServer = getCache().addCacheServer();
> 2: cacheServer.setPort(0);
> 3: cacheServer.start();
> *4: await().atMost(1, MINUTES).until(() -> cacheServer.isRunning());*
> 5: return cacheServer.getPort();
>
> I think we would have to change the implementation of
> CacheServer.isRunning() to check ServerSocket.isBound():
>
> CacheServer.isRunning:
>   public boolean isRunning() {
>     return this.acceptor != null && this.acceptor.isRunning();
>   }
>
> AcceptorImpl.isRunning:
>   public boolean isRunning() {
>     return !this.shutdownStarted;
>   }
>
> Would need to change to something like this:
>   public boolean isRunning() {
>     return !this.shutdownStarted *&& serverSock.isBound();*
>   }
>
> Any opinions or alternatives? If I add "*&& serverSock.isBound();*" am I
> going to break isRunning?
>
> On Thu, May 3, 2018 at 9:33 AM, Kirk Lund <kl...@apache.org> wrote:
>
>> I have a test which starts a server and then starts a client. But the
>> client intermittently fails with NoSubscriptionServersAvailableException
>> (see full stack below).
>>
>> Seems like there must be something asynchronous in the startup of a
>> CacheServer that I need to wait for. Any ideas what I need to test for to
>> avoid NoSubscriptionServersAvailableException?
>>
>> org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest > cacheServerPersistWithIOExceptionShouldShutdown FAILED
>>     org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest$$Lambda$23/1222369873.run in VM 1 running on Host 0b1780a0efc9 with 4 VMs
>>         at org.apache.geode.test.dunit.VM.invoke(VM.java:436)
>>         at org.apache.geode.test.dunit.VM.invoke(VM.java:405)
>>         at org.apache.geode.test.dunit.VM.invoke(VM.java:348)
>>         at org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest.cacheServerPersistWithIOExceptionShouldShutdown(InitializedDiskRegionWithIoExceptionRegressionTest.java:113)
>>
>> Caused by:
>> org.apache.geode.cache.NoSubscriptionServersAvailableException: org.apache.geode.cache.NoSubscriptionServersAvailableException: Could not initialize a primary queue on startup. No queue servers available.
>>     at org.apache.geode.cache.client.internal.QueueManagerImpl.getAllConnections(QueueManagerImpl.java:187)
>>     at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnQueuesAndReturnPrimaryResult(OpExecutorImpl.java:539)
>>     at org.apache.geode.cache.client.internal.PoolImpl.executeOnQueuesAndReturnPrimaryResult(PoolImpl.java:850)
>>     at org.apache.geode.cache.client.internal.RegisterInterestOp.execute(RegisterInterestOp.java:58)
>>     at org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterest(ServerRegionProxy.java:356)
>>     at org.apache.geode.internal.cache.LocalRegion.processSingleInterest(LocalRegion.java:3749)
>>     at org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3840)
>>     at org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3638)
>>     at org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3633)
>>     at org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3628)
>>     at org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest.createClientCache(InitializedDiskRegionWithIoExceptionRegressionTest.java:172)
>>     at org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest.lambda$cacheServerPersistWithIOExceptionShouldShutdown$2c6907a2$1(InitializedDiskRegionWithIoExceptionRegressionTest.java:113)
>>
>> Caused by:
>> org.apache.geode.cache.NoSubscriptionServersAvailableException: Could not initialize a primary queue on startup. No queue servers available.
>>     at org.apache.geode.cache.client.internal.QueueManagerImpl.initializeConnections(QueueManagerImpl.java:585)
>>     at org.apache.geode.cache.client.internal.QueueManagerImpl.start(QueueManagerImpl.java:296)
>>     at org.apache.geode.cache.client.internal.PoolImpl.start(PoolImpl.java:347)
>>     at org.apache.geode.cache.client.internal.PoolImpl.finishCreate(PoolImpl.java:172)
>>     at org.apache.geode.cache.client.internal.PoolImpl.create(PoolImpl.java:158)
>>     at org.apache.geode.internal.cache.PoolFactoryImpl.create(PoolFactoryImpl.java:346)
>>     at org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest.createClientCache(InitializedDiskRegionWithIoExceptionRegressionTest.java:164)
>>     ... 1 more
>>
>>
>
