[ https://issues.apache.org/jira/browse/GEODE-8442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293782#comment-17293782 ]
ASF GitHub Bot commented on GEODE-8442: --------------------------------------- pdxcodemonkey commented on pull request #713: URL: https://github.com/apache/geode-native/pull/713#issuecomment-788999841 The test output is a little tricky to see clearly, but the issue in this PR is not `RegisterKeysTest.RegisterAllWithConsistencyDisabled`. If you grep for it, you see the following: ``` Start 55: RegisterKeysTest.RegisterAllWithConsistencyDisabled 18/78 Test #55: RegisterKeysTest.RegisterAllWithConsistencyDisabled ..................................***Failed 152.70 sec Note: Google Test filter = RegisterKeysTest.RegisterAllWithConsistencyDisabled [ RUN ] RegisterKeysTest.RegisterAllWithConsistencyDisabled [2021-03-01 12:46:10.389389] [0x00002d58] [info] Gfsh::execute: start locator --log-level=INFO --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/locator/0 --name=RegisterKeysTest_RegisterAllWithConsistencyDisabled_locator_0 --bind-address=localhost --port=50311 --max-heap=256m --J=-Dgemfire.jmx-manager-port=50310 --http-service-port=0 --J=-Dgemfire.jmx-manager-start=true [2021-03-01 12:46:15.701819] [0x00002d58] [trace] Gfsh::execute: (1) Executing - start locator --log-level=INFO --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/locator/0 --name=RegisterKeysTest_RegisterAllWithConsistencyDisabled_locator_0 --bind-address=localhost --port=50311 --max-heap=256m --J=-Dgemfire.jmx-manager-port=50310 --http-service-port=0 --J=-Dgemfire.jmx-manager-start=true [2021-03-01 12:46:33.904986] [0x00002d58] [trace] Gfsh::execute: Locator in C:\Users\build\build\cppcache\integration\test\RegisterKeysTest\RegisterAllWithConsistencyDisabled\locator\0 on 127.0.0.1[50311] as RegisterKeysTest_RegisterAllWithConsistencyDisabled_locator_0 is currently online. [2021-03-01 12:46:33.904986] [0x00002d58] [trace] Gfsh::execute: Log File: C:\Users\build\build\cppcache\integration\test\RegisterKeysTest\RegisterAllWithConsistencyDisabled\locator\0\RegisterKeysTest_RegisterAllWithConsistencyDisabled_locator_0.log [2021-03-01 12:46:44.793326] [0x00002d58] [info] Gfsh::execute: start server --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/server/0 --name=RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0 --bind-address=localhost --server-port=0 --max-heap=1g --locators=localhost[50311] [2021-03-01 12:46:55.904964] [0x00002d58] [trace] Gfsh::execute: (2) Executing - start server --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/server/0 --name=RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0 --bind-address=localhost --server-port=0 --max-heap=1g --locators=localhost[50311] [2021-03-01 12:47:10.389461] [0x00002d58] [trace] Gfsh::execute: Server in C:\Users\build\build\cppcache\integration\test\RegisterKeysTest\RegisterAllWithConsistencyDisabled\server\0 on build-7876be09-a95c-4725-b2d3-4cb99aabd716.c.apachegeode-ci.internal[50533] as RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0 is currently online. [2021-03-01 12:47:10.389461] [0x00002d58] [trace] Gfsh::execute: Log File: C:\Users\build\build\cppcache\integration\test\RegisterKeysTest\RegisterAllWithConsistencyDisabled\server\0\RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0.log [2021-03-01 12:47:24.217552] [0x000028ec] [trace] Gfsh::execute: RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0 | OK | Region "/region" created o.. [2021-03-01 12:48:28.451760] [0x00001b0c] [info] Gfsh::execute: stop server --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/server/0 [2021-03-01 12:48:28.451760] [0x000013a4] [info] Gfsh::execute: stop locator --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/locator/0 [2021-03-01 12:48:39.288256] [0x00001b0c] [trace] Gfsh::execute: (2) Executing - stop server --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/server/0 [2021-03-01 12:48:39.320492] [0x000013a4] [trace] Gfsh::execute: (2) Executing - stop locator --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/locator/0 [ FAILED ] RegisterKeysTest.RegisterAllWithConsistencyDisabled (152546 ms) [ FAILED ] RegisterKeysTest.RegisterAllWithConsistencyDisabled Start 55: RegisterKeysTest.RegisterAllWithConsistencyDisabled Test #55: RegisterKeysTest.RegisterAllWithConsistencyDisabled .................................. Passed 96.14 sec ``` So, that test re-ran and passed the second time. The problem is that the run is timing out because one test is stuck. I have a rudimentary Python script that matches the "Start X:" output lines with "Test #X:" result lines, that shows which ones are unmatched, i.e. never completed. Here's the output from that: ``` Passed: AuthInitializeTest.putGetWithBasicAuth (#1) AuthInitializeTest.putWithBadUsername (#2) AuthInitializeTest.putWithBadPassword (#3) AuthInitializeTest.badCredentialsWithSubscriptionEnabled (#4) CacheWriterTest.WriteInRegionFromCacheWriterTriggeredByTransaction (#6) CacheXmlTest.loadCacheXml (#7) CacheXmlTest.loadCacheXmlWithBadSchema (#8) CacheXmlTest.loadCacheWithUnnamedPool (#9) CacheXmlTest.testApplicationPartitionResolver (#10) CacheXmlTest.testSharedLibPartitionResolver (#11) CleanIdleConnectionsTest.cleanIdleConnectionsAfterOpsPaused (#12) ClientTransactionXATest.interTxand2PCTx (#13) CommitConflictExceptionTest.putPartitionTx (#14) CqPlusAuthInitializeTest.putInALoopWhileSubscribedAndAuthenticated (#15) CqTest.testCqCreateUpdateDestroy (#16) DataSerializableTest.isSerializableAndDeserializable (#17) DataSerializableTest.ClassAsKey (#18) ChunkHandlerThreadTest.isDisabledUsesSameThread (#19) ChunkHandlerThreadTest.isEnabledUsesDifferentThread (#20) ExceptionTranslationTest.testLowMemoryException (#23) ExceptionTranslationTest.testQueryLowMemoryException (#24) ExpirationTest.verifyExpiration (#25) FunctionExecutionTest.UnknownFunctionOnServer (#26) FunctionExecutionTest.UnknownFunctionOnRegion (#27) FunctionExecutionTest.UnknownFunctionAsyncOnServer (#28) FunctionExecutionTest.UnknownFunctionAsyncOnRegion (#29) FunctionExecutionTest.FunctionReturnsObjectWhichCantBeDeserializedOnServer (#30) FunctionExecutionTest.OnServersWithReplicatedRegionsInPool (#31) FunctionExecutionTest.OnServersOneServerGoesDown (#32) LRUEvictionTest.heapLruLimit (#33) LRUEvictionTest.lruEntriesLimit (#34) LRUEvictionTest.verifyLruOrderGet (#35) LRUEvictionTest.verifyLruOrderInvalidate (#36) LRUEvictionTest.verifyLruOrderDestroy (#37) LocatorRequestsTest.verifyNoInterlock (#38) PartitionRegionOpsTest.getPartitionedRegionWithRedundancyServerGoesDownSingleHop (#39) PartitionRegionOpsTest.putPartitionedRegionWithRedundancyServerGoesDownSingleHop (#40) PartitionRegionOpsTest.getPartitionedRegionWithRedundancyServerGoesDownNoSingleHop (#41) PartitionRegionOpsTest.putPartitionedRegionWithRedundancyServerGoesDownNoSingleHop (#42) PdxInstanceFactoryTest.testConcurrentCreateCalls (#43) PdxInstanceTest.testPdxInstance (#44) PdxInstanceTest.testCreateJsonInstance (#45) PdxJsonTypeTest.testGfshQueryJsonInstances (#46) PdxJsonTypeTest.testCreateTwoJsonInstances (#47) PdxJsonTypeTest.testTwoConsecutiveGets (#48) PdxSerializerTest.canSerializeNonPdxSerializableType (#49) RegionGetAllTest.getAllFromPartitionedRegion (#50) RegionPutAllTest.putAllToPartitionedRegion (#51) RegionPutGetAllTest.variousPdxTypes (#52) RegionPutGetAllTest.nullValue (#53) RegisterKeysTest.RegisterAllWithCachingRegion (#54) RegisterKeysTest.RegisterAllWithConsistencyDisabled (#55) RegisterKeysTest.RegisterAnyWithCachingRegion (#56) RegisterKeysTest.RegisterAllWithProxyRegion (#57) RegisterKeysTest.RegisterAnyWithProxyRegion (#58) ServerDisconnect.WithRegionDisconnectedListener (#59) SslOneWayTest.PutGetWithValidSslConfiguration (#60) SslOneWayTest.PutWithUntrustedTruststore (#61) SslOneWayTest.PutWithCorruptTruststore (#62) SslOneWayTest.PutWithMissingTruststore (#63) SslTwoWayTest.PutGetWithValidSslConfiguration (#64) SslTwoWayTest.PutWithInvalidKeystorePassword (#65) SslTwoWayTest.PutWithUntrustedKeystore (#66) SslTwoWayTest.PutWithCorruptKeystore (#67) SslTwoWayTest.PutWithUntrustedTruststore (#68) SslTwoWayTest.PutWithCorruptTruststore (#69) SslTwoWayTest.PutWithMissingTruststore (#70) StructTest.queryResultForRange (#71) TransactionCleaningTest.txWithStoppedServer (#72) TransactionsTest.ExceptionWhenRollingBackTx (#73) TransactionsTest.IlegalStateExceptionNoTx (#74) TransactionsTest.ExceptionConflictOnPrepare (#75) TransactionsTest.ExceptionConflictOnCommit (#76) WanDeserializationTest.testEventsAreDeserializedCorrectly (#77) Failed: RegisterKeysTest.RegisterAllWithConsistencyDisabled (#55) Timed Out: None Not Run: BasicIPv6Test.queryResultForRange (#5) ExampleTest.putAndGetWith2Servers (#21) ExampleTest.putGetAndUpdateWith1Server (#22) Stuck: ContainsKeyOnServerExceptionTest.handleException (#78) Tests Passed: 74/76 (97%) Tests Failed: 1/76 (1%) Tests Timed Out: 0/76 (0%) Tests Stuck: 1/76 (1%) Tests Not Run: 3/79 (4%) ``` So it appears the new test for this PR, `ContainsKeyOnServerExceptionTest.handleException`, hangs on Windows. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Exception in server not identified correctly in client > ------------------------------------------------------ > > Key: GEODE-8442 > URL: https://issues.apache.org/jira/browse/GEODE-8442 > Project: Geode > Issue Type: Bug > Components: native client > Reporter: Alberto Bustamante Reyes > Assignee: Mario Kevo > Priority: Major > Labels: needs-review, pull-request-available > > Native client is not identifying correctly an exception returned by the > server. > This is a log from an exception properly identified (I have introduced "????" > where I think I have to hide sensitive information): > {code} > [error 2020/07/22 09:17:10.831337 UTC ????] > CacheTransactionManager::failover: An exception > (org.apache.geode.cache.TransactionDataNodeHasDepartedException: Could not > find transaction host for TXId: > ????(default_GeodeDS:1:loner):3:Native_gdbdedfebe1:default_GeodeDS:6022 > at > org.apache.geode.internal.cache.tier.sockets.command.TXFailoverCommand.cmdExecute(TXFailoverCommand.java:126) > at > org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:177) > at > org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:848) > at > org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:72) > at > org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1212) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:676) > at > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119) > at java.base/java.lang.Thread.run(Thread.java:834) > ) happened at remote server. > data-access: write_element:[TransactionDataNodeHasDepartedException] > ????.get(????)::apache::geode::client::TransactionDataNodeHasDepartedException:org.apache.geode.cache.TransactionDataNodeHasDepartedException: > Could not find transaction host for TXId: > ????(default_GeodeDS:1:loner):3:Native_gdbdedfebe1:default_GeodeDS:6022 > {code} > But in this case, although the exception received is also a > TransactionDataNodeHasDepartedException, it is "translated" to an Unknown > exception: > {code} > [error 2020/07/22 09:17:10.979506 UTC ????] Region::containsKeyOnServer:: An > exception (org.apache.geode.cache.TransactionDataNodeHasDepartedException: > Transaction data node 192.168.240.13(dms-server-1:1)<v1>:41000 has departed. > To proceed, rollback this transaction and begin a new one., caused by > org.apache.geode.internal.cache.ForceReattemptException: PartitionResponse > got remote CacheClosedException > at > org.apache.geode.internal.cache.tx.PartitionedTXRegionStub.containsKey(PartitionedTXRegionStub.java:247) > at > org.apache.geode.internal.cache.TXStateStub.containsKey(TXStateStub.java:431) > at > org.apache.geode.internal.cache.TXStateProxyImpl.containsKey(TXStateProxyImpl.java:530) > at > org.apache.geode.internal.cache.PartitionedRegion.containsKey(PartitionedRegion.java:6402) > at > org.apache.geode.internal.cache.tier.sockets.command.ContainsKey66.cmdExecute(ContainsKey66.java:138) > at > org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:177) > at > org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:848) > at > org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:72) > at > org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1212) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at > org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:676) > at > org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119) > at java.base/java.lang.Thread.run(Thread.java:834) > Caused by: org.apache.geode.internal.cache.ForceReattemptException: > PartitionResponse got remote CacheClosedException > at > org.apache.geode.internal.cache.partitioned.PartitionMessage$PartitionResponse.waitForCacheException(PartitionMessage.java:837) > at > org.apache.geode.internal.cache.partitioned.ContainsKeyValueMessage$ContainsKeyValueResponse.waitForContainsResult(ContainsKeyValueMessage.java:295) > at > org.apache.geode.internal.cache.PartitionedRegion.containsKeyRemotely(PartitionedRegion.java:6558) > at > org.apache.geode.internal.cache.tx.PartitionedTXRegionStub.containsKey(PartitionedTXRegionStub.java:231) > ... 13 more > Caused by: org.apache.geode.cache.CacheClosedException: Remote cache is > closed: 192.168.240.13(dms-server-1:1)<v1>:41000 > at > org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1630) > at > org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1619) > at > org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:307) > at > org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) > at > org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:440) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:442) > at > org.apache.geode.distributed.internal.ClusterOperationExecutors.doPartitionRegionThread(ClusterOperationExecutors.java:422) > ... 2 more > ) happened at remote server. > [info 2020/07/22 09:17:10.979514 UTC ????] error code: 12 > [fine 2020/07/22 09:17:10.979869 UTC ????] sendSyncRequest:: retry = 0 > [fine 2020/07/22 09:17:10.979874 UTC ????] sendRequestConnWithRetry:: > maxSendRetries = 1 > data-access: write_element:[Exception] > SmSessions.containsKeyOnServer(????):apache::geode::client::UnknownException:Replay > is unsupported: > org.apache.geode.cache.TransactionDataNodeHasDepartedException: Transaction > data node 192.168.240.13(????)<v1>:41000 has departed. To proceed, rollback > this transaction and begin a new one, caused by > org.apache.geode.internal.cache.ForceReattemptException: PartitionResponse > got remote CacheClosedException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)