That smells like a bug to me. We don't have to be in accept() though -
we just need the ServerSocket to be open with a sufficient backlog
to queue the connection requests. Adding a check for whether the socket
is open in isRunning() might be good enough.
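
To illustrate the backlog point with plain java.net (no Geode classes), here is a minimal sketch in which a client connects to a bound ServerSocket that never calls accept(). The class name BacklogDemo, the method name, and the backlog of 50 are assumptions made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical demo class; not part of Geode. */
class BacklogDemo {

  /** Returns true if a client can connect to a bound ServerSocket that never calls accept(). */
  static boolean connectsBeforeAccept() {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    // Port 0 asks the OS for an ephemeral port; the backlog of 50 is an arbitrary choice.
    try (ServerSocket serverSock = new ServerSocket(0, 50, loopback);
         Socket client = new Socket()) {
      // No accept() call anywhere: the OS queues the pending connection in the backlog.
      client.connect(new InetSocketAddress(loopback, serverSock.getLocalPort()), 5_000);
      return serverSock.isBound() && client.isConnected();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("connected before accept(): " + connectsBeforeAccept());
  }
}
```

The connection only completes at the TCP level here; nothing reads from it until some thread eventually calls accept().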
On 5/3/18 11:26 AM, Kirk Lund wrote:
The method CacheServer.isRunning() is actually invoked from within the
AcceptorImpl.accept():
while (isRunning()) {
  ...
  try {
    socket = serverSock.accept();
So we can't modify isRunning() to check serverSock.isBound(). We would have
to introduce a new User API on CacheServer called isBound() or something
like that:
org.apache.geode.cache.server.CacheServer
/**
 * Returns true if this cache server is bound to its port and accepting
 * connections.
 */
boolean isBound();
Any objections or suggestions?
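
As a rough sketch of how a separate isBound() could sit next to isRunning(), the following uses a made-up MiniAcceptor class. It is not Geode's AcceptorImpl, and every name in it is an assumption for illustration. One detail worth noting from the java.net API: ServerSocket.isBound() keeps returning true after close(), so the sketch also checks isClosed().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical stand-in for an acceptor; not Geode's AcceptorImpl. */
class MiniAcceptor implements AutoCloseable {
  private volatile ServerSocket serverSock;
  private volatile boolean shutdownStarted;

  /** Binds the socket first, then starts the accept loop on a daemon thread. */
  void start() {
    try {
      serverSock = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    Thread acceptLoop = new Thread(this::accept, "mini-acceptor");
    acceptLoop.setDaemon(true);
    acceptLoop.start();
  }

  /** Lifecycle flag only, so the accept loop can keep using it as its loop condition. */
  boolean isRunning() {
    return !shutdownStarted;
  }

  /** Separate readiness check: true once the socket is bound and until it is closed. */
  boolean isBound() {
    ServerSocket sock = serverSock;
    // isBound() stays true after close(), so isClosed() has to be checked as well.
    return sock != null && sock.isBound() && !sock.isClosed();
  }

  int getPort() {
    return serverSock.getLocalPort();
  }

  private void accept() {
    while (isRunning()) {
      try (Socket socket = serverSock.accept()) {
        // A real acceptor would hand the socket to a handler; this sketch just closes it.
      } catch (IOException e) {
        // accept() throws once close() shuts the socket; the loop condition then ends the loop.
      }
    }
  }

  @Override
  public void close() {
    shutdownStarted = true;
    try {
      if (serverSock != null) {
        serverSock.close();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A caller would wait on isBound() before handing out getPort(), while the accept loop keeps its existing isRunning() condition unchanged.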
On Thu, May 3, 2018 at 11:03 AM, Kirk Lund <kl...@apache.org> wrote:
Anil and I were hoping that adding line 4 below would remove the race, but
it doesn't. CacheServer.isRunning() returns true as soon as AcceptorImpl is
non-null which is still before ServerSocket.accept() has been invoked. The
race still exists with this...
1: CacheServer cacheServer = getCache().addCacheServer();
2: cacheServer.setPort(0);
3: cacheServer.start();
4: await().atMost(1, MINUTES).until(() -> cacheServer.isRunning());
5: return cacheServer.getPort();
I think we would have to change the implementation of
CacheServer.isRunning() to check ServerSocket.isBound():
CacheServer.isRunning:
public boolean isRunning() {
  return this.acceptor != null && this.acceptor.isRunning();
}
AcceptorImpl.isRunning:
public boolean isRunning() {
  return !this.shutdownStarted;
}
Would need to change to something like this:
public boolean isRunning() {
  return !this.shutdownStarted && serverSock.isBound();
}
Any opinions or alternatives? If I add "&& serverSock.isBound()" am I
going to break isRunning()?
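
One possible alternative on the test side, sketched under assumptions: instead of waiting on isRunning(), poll until a TCP connect to the server port succeeds. PortProbe and awaitConnectable are hypothetical names, not Geode or DUnit APIs, and the timeouts are arbitrary. The probe opens and immediately drops a real connection, which a real cache server would see as a client that connects and goes away, so this may not be appropriate everywhere.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical test helper; not part of Geode or its test framework. */
class PortProbe {

  /** Polls until a TCP connect to host:port succeeds, or until the timeout elapses. */
  static boolean awaitConnectable(InetAddress host, int port, long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    do {
      try (Socket probe = new Socket()) {
        probe.connect(new InetSocketAddress(host, port), 500);
        return true; // the port is listening; the probe socket is closed on exit
      } catch (IOException e) {
        // Not listening yet (or connection refused): fall through and retry.
      }
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    } while (System.nanoTime() - deadline < 0);
    return false;
  }

  /** Self-contained demonstration against a throwaway ServerSocket on the loopback address. */
  static boolean demo() {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    try (ServerSocket serverSock = new ServerSocket(0, 50, loopback)) {
      return awaitConnectable(loopback, serverSock.getLocalPort(), 2_000);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

In the five-line snippet above, such a call would take the place of line 4, using the port returned by cacheServer.getPort().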
On Thu, May 3, 2018 at 9:33 AM, Kirk Lund <kl...@apache.org> wrote:
I have a test which starts a server and then starts a client. But the
client intermittently fails with NoSubscriptionServersAvailableException
(see full stack below).
Seems like there must be something asynchronous in the startup of a
CacheServer that I need to wait for. Any ideas what I need to test for to
avoid NoSubscriptionServersAvailableException?
org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest
cacheServerPersistWithIOExceptionShouldShutdown FAILED
org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest$$Lambda$23/1222369873.run in VM 1 running on Host 0b1780a0efc9 with 4 VMs
    at org.apache.geode.test.dunit.VM.invoke(VM.java:436)
    at org.apache.geode.test.dunit.VM.invoke(VM.java:405)
    at org.apache.geode.test.dunit.VM.invoke(VM.java:348)
    at org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest.cacheServerPersistWithIOExceptionShouldShutdown(InitializedDiskRegionWithIoExceptionRegressionTest.java:113)
Caused by:
org.apache.geode.cache.NoSubscriptionServersAvailableException: org.apache.geode.cache.NoSubscriptionServersAvailableException: Could not initialize a primary queue on startup. No queue servers available.
    at org.apache.geode.cache.client.internal.QueueManagerImpl.getAllConnections(QueueManagerImpl.java:187)
    at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnQueuesAndReturnPrimaryResult(OpExecutorImpl.java:539)
    at org.apache.geode.cache.client.internal.PoolImpl.executeOnQueuesAndReturnPrimaryResult(PoolImpl.java:850)
    at org.apache.geode.cache.client.internal.RegisterInterestOp.execute(RegisterInterestOp.java:58)
    at org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterest(ServerRegionProxy.java:356)
    at org.apache.geode.internal.cache.LocalRegion.processSingleInterest(LocalRegion.java:3749)
    at org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3840)
    at org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3638)
    at org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3633)
    at org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3628)
    at org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest.createClientCache(InitializedDiskRegionWithIoExceptionRegressionTest.java:172)
    at org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest.lambda$cacheServerPersistWithIOExceptionShouldShutdown$2c6907a2$1(InitializedDiskRegionWithIoExceptionRegressionTest.java:113)
Caused by:
org.apache.geode.cache.NoSubscriptionServersAvailableException: Could not initialize a primary queue on startup. No queue servers available.
    at org.apache.geode.cache.client.internal.QueueManagerImpl.initializeConnections(QueueManagerImpl.java:585)
    at org.apache.geode.cache.client.internal.QueueManagerImpl.start(QueueManagerImpl.java:296)
    at org.apache.geode.cache.client.internal.PoolImpl.start(PoolImpl.java:347)
    at org.apache.geode.cache.client.internal.PoolImpl.finishCreate(PoolImpl.java:172)
    at org.apache.geode.cache.client.internal.PoolImpl.create(PoolImpl.java:158)
    at org.apache.geode.internal.cache.PoolFactoryImpl.create(PoolFactoryImpl.java:346)
    at org.apache.geode.internal.cache.InitializedDiskRegionWithIoExceptionRegressionTest.createClientCache(InitializedDiskRegionWithIoExceptionRegressionTest.java:164)
    ... 1 more