[ 
https://issues.apache.org/jira/browse/GEODE-8442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293782#comment-17293782
 ] 

ASF GitHub Bot commented on GEODE-8442:
---------------------------------------

pdxcodemonkey commented on pull request #713:
URL: https://github.com/apache/geode-native/pull/713#issuecomment-788999841


   The test output is a little tricky to see clearly, but the issue in this PR 
is not `RegisterKeysTest.RegisterAllWithConsistencyDisabled`.  If you grep for 
it, you see the following:
   
   ```
         Start 55: RegisterKeysTest.RegisterAllWithConsistencyDisabled
   18/78 Test #55: RegisterKeysTest.RegisterAllWithConsistencyDisabled 
..................................***Failed  152.70 sec
   Note: Google Test filter = 
RegisterKeysTest.RegisterAllWithConsistencyDisabled
   [ RUN      ] RegisterKeysTest.RegisterAllWithConsistencyDisabled
   [2021-03-01 12:46:10.389389] [0x00002d58] [info]    Gfsh::execute: start 
locator --log-level=INFO 
--dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/locator/0 
--name=RegisterKeysTest_RegisterAllWithConsistencyDisabled_locator_0 
--bind-address=localhost --port=50311 --max-heap=256m 
--J=-Dgemfire.jmx-manager-port=50310 --http-service-port=0 
--J=-Dgemfire.jmx-manager-start=true
   [2021-03-01 12:46:15.701819] [0x00002d58] [trace]   Gfsh::execute: (1) 
Executing - start locator --log-level=INFO 
--dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/locator/0 
--name=RegisterKeysTest_RegisterAllWithConsistencyDisabled_locator_0 
--bind-address=localhost --port=50311 --max-heap=256m 
--J=-Dgemfire.jmx-manager-port=50310 --http-service-port=0 
--J=-Dgemfire.jmx-manager-start=true
   [2021-03-01 12:46:33.904986] [0x00002d58] [trace]   Gfsh::execute: Locator 
in 
C:\Users\build\build\cppcache\integration\test\RegisterKeysTest\RegisterAllWithConsistencyDisabled\locator\0
 on 127.0.0.1[50311] as 
RegisterKeysTest_RegisterAllWithConsistencyDisabled_locator_0 is currently 
online.
   [2021-03-01 12:46:33.904986] [0x00002d58] [trace]   Gfsh::execute: Log File: 
C:\Users\build\build\cppcache\integration\test\RegisterKeysTest\RegisterAllWithConsistencyDisabled\locator\0\RegisterKeysTest_RegisterAllWithConsistencyDisabled_locator_0.log
   [2021-03-01 12:46:44.793326] [0x00002d58] [info]    Gfsh::execute: start 
server --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/server/0 
--name=RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0 
--bind-address=localhost --server-port=0 --max-heap=1g 
--locators=localhost[50311]
   [2021-03-01 12:46:55.904964] [0x00002d58] [trace]   Gfsh::execute: (2) 
Executing - start server 
--dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/server/0 
--name=RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0 
--bind-address=localhost --server-port=0 --max-heap=1g 
--locators=localhost[50311]
   [2021-03-01 12:47:10.389461] [0x00002d58] [trace]   Gfsh::execute: Server in 
C:\Users\build\build\cppcache\integration\test\RegisterKeysTest\RegisterAllWithConsistencyDisabled\server\0
 on build-7876be09-a95c-4725-b2d3-4cb99aabd716.c.apachegeode-ci.internal[50533] 
as RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0 is currently 
online.
   [2021-03-01 12:47:10.389461] [0x00002d58] [trace]   Gfsh::execute: Log File: 
C:\Users\build\build\cppcache\integration\test\RegisterKeysTest\RegisterAllWithConsistencyDisabled\server\0\RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0.log
   [2021-03-01 12:47:24.217552] [0x000028ec] [trace]   Gfsh::execute: 
RegisterKeysTest_RegisterAllWithConsistencyDisabled_server_0 | OK     | Region 
"/region" created o..
   [2021-03-01 12:48:28.451760] [0x00001b0c] [info]    Gfsh::execute: stop 
server --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/server/0
   [2021-03-01 12:48:28.451760] [0x000013a4] [info]    Gfsh::execute: stop 
locator --dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/locator/0
   [2021-03-01 12:48:39.288256] [0x00001b0c] [trace]   Gfsh::execute: (2) 
Executing - stop server 
--dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/server/0
   [2021-03-01 12:48:39.320492] [0x000013a4] [trace]   Gfsh::execute: (2) 
Executing - stop locator 
--dir=RegisterKeysTest/RegisterAllWithConsistencyDisabled/locator/0
   [  FAILED  ] RegisterKeysTest.RegisterAllWithConsistencyDisabled (152546 ms)
   [  FAILED  ] RegisterKeysTest.RegisterAllWithConsistencyDisabled
         Start 55: RegisterKeysTest.RegisterAllWithConsistencyDisabled
       Test #55: RegisterKeysTest.RegisterAllWithConsistencyDisabled 
..................................   Passed   96.14 sec
   ```
   
   So, that test re-ran and passed the second time.  The problem is that the 
run is timing out because one test is stuck.  I have a rudimentary Python 
script that matches the "Start X:" output lines with "Test #X:" result lines, 
that shows which ones are unmatched, i.e. never completed.  Here's the output 
from that:
   
   ```
   Passed:
       AuthInitializeTest.putGetWithBasicAuth (#1)
       AuthInitializeTest.putWithBadUsername (#2)
       AuthInitializeTest.putWithBadPassword (#3)
       AuthInitializeTest.badCredentialsWithSubscriptionEnabled (#4)
       CacheWriterTest.WriteInRegionFromCacheWriterTriggeredByTransaction (#6)
       CacheXmlTest.loadCacheXml (#7)
       CacheXmlTest.loadCacheXmlWithBadSchema (#8)
       CacheXmlTest.loadCacheWithUnnamedPool (#9)
       CacheXmlTest.testApplicationPartitionResolver (#10)
       CacheXmlTest.testSharedLibPartitionResolver (#11)
       CleanIdleConnectionsTest.cleanIdleConnectionsAfterOpsPaused (#12)
       ClientTransactionXATest.interTxand2PCTx (#13)
       CommitConflictExceptionTest.putPartitionTx (#14)
       CqPlusAuthInitializeTest.putInALoopWhileSubscribedAndAuthenticated (#15)
       CqTest.testCqCreateUpdateDestroy (#16)
       DataSerializableTest.isSerializableAndDeserializable (#17)
       DataSerializableTest.ClassAsKey (#18)
       ChunkHandlerThreadTest.isDisabledUsesSameThread (#19)
       ChunkHandlerThreadTest.isEnabledUsesDifferentThread (#20)
       ExceptionTranslationTest.testLowMemoryException (#23)
       ExceptionTranslationTest.testQueryLowMemoryException (#24)
       ExpirationTest.verifyExpiration (#25)
       FunctionExecutionTest.UnknownFunctionOnServer (#26)
       FunctionExecutionTest.UnknownFunctionOnRegion (#27)
       FunctionExecutionTest.UnknownFunctionAsyncOnServer (#28)
       FunctionExecutionTest.UnknownFunctionAsyncOnRegion (#29)
       
FunctionExecutionTest.FunctionReturnsObjectWhichCantBeDeserializedOnServer (#30)
       FunctionExecutionTest.OnServersWithReplicatedRegionsInPool (#31)
       FunctionExecutionTest.OnServersOneServerGoesDown (#32)
       LRUEvictionTest.heapLruLimit (#33)
       LRUEvictionTest.lruEntriesLimit (#34)
       LRUEvictionTest.verifyLruOrderGet (#35)
       LRUEvictionTest.verifyLruOrderInvalidate (#36)
       LRUEvictionTest.verifyLruOrderDestroy (#37)
       LocatorRequestsTest.verifyNoInterlock (#38)
       
PartitionRegionOpsTest.getPartitionedRegionWithRedundancyServerGoesDownSingleHop
 (#39)
       
PartitionRegionOpsTest.putPartitionedRegionWithRedundancyServerGoesDownSingleHop
 (#40)
       
PartitionRegionOpsTest.getPartitionedRegionWithRedundancyServerGoesDownNoSingleHop
 (#41)
       
PartitionRegionOpsTest.putPartitionedRegionWithRedundancyServerGoesDownNoSingleHop
 (#42)
       PdxInstanceFactoryTest.testConcurrentCreateCalls (#43)
       PdxInstanceTest.testPdxInstance (#44)
       PdxInstanceTest.testCreateJsonInstance (#45)
       PdxJsonTypeTest.testGfshQueryJsonInstances (#46)
       PdxJsonTypeTest.testCreateTwoJsonInstances (#47)
       PdxJsonTypeTest.testTwoConsecutiveGets (#48)
       PdxSerializerTest.canSerializeNonPdxSerializableType (#49)
       RegionGetAllTest.getAllFromPartitionedRegion (#50)
       RegionPutAllTest.putAllToPartitionedRegion (#51)
       RegionPutGetAllTest.variousPdxTypes (#52)
       RegionPutGetAllTest.nullValue (#53)
       RegisterKeysTest.RegisterAllWithCachingRegion (#54)
       RegisterKeysTest.RegisterAllWithConsistencyDisabled (#55)
       RegisterKeysTest.RegisterAnyWithCachingRegion (#56)
       RegisterKeysTest.RegisterAllWithProxyRegion (#57)
       RegisterKeysTest.RegisterAnyWithProxyRegion (#58)
       ServerDisconnect.WithRegionDisconnectedListener (#59)
       SslOneWayTest.PutGetWithValidSslConfiguration (#60)
       SslOneWayTest.PutWithUntrustedTruststore (#61)
       SslOneWayTest.PutWithCorruptTruststore (#62)
       SslOneWayTest.PutWithMissingTruststore (#63)
       SslTwoWayTest.PutGetWithValidSslConfiguration (#64)
       SslTwoWayTest.PutWithInvalidKeystorePassword (#65)
       SslTwoWayTest.PutWithUntrustedKeystore (#66)
       SslTwoWayTest.PutWithCorruptKeystore (#67)
       SslTwoWayTest.PutWithUntrustedTruststore (#68)
       SslTwoWayTest.PutWithCorruptTruststore (#69)
       SslTwoWayTest.PutWithMissingTruststore (#70)
       StructTest.queryResultForRange (#71)
       TransactionCleaningTest.txWithStoppedServer (#72)
       TransactionsTest.ExceptionWhenRollingBackTx (#73)
       TransactionsTest.IlegalStateExceptionNoTx (#74)
       TransactionsTest.ExceptionConflictOnPrepare (#75)
       TransactionsTest.ExceptionConflictOnCommit (#76)
       WanDeserializationTest.testEventsAreDeserializedCorrectly (#77)
   Failed:
       RegisterKeysTest.RegisterAllWithConsistencyDisabled (#55)
   Timed Out:
       None
   Not Run:
       BasicIPv6Test.queryResultForRange (#5)
       ExampleTest.putAndGetWith2Servers (#21)
       ExampleTest.putGetAndUpdateWith1Server (#22)
   Stuck:
       ContainsKeyOnServerExceptionTest.handleException (#78)
   Tests Passed:
       74/76 (97%)
   Tests Failed:
       1/76 (1%)
   Tests Timed Out:
       0/76 (0%)
   Tests Stuck:
       1/76 (1%)
   Tests Not Run:
       3/79 (4%)
   ```
   So it appears the new test for this PR, 
`ContainsKeyOnServerExceptionTest.handleException`, hangs on Windows. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Exception in server not identified correctly in client
> ------------------------------------------------------
>
>                 Key: GEODE-8442
>                 URL: https://issues.apache.org/jira/browse/GEODE-8442
>             Project: Geode
>          Issue Type: Bug
>          Components: native client
>            Reporter: Alberto Bustamante Reyes
>            Assignee: Mario Kevo
>            Priority: Major
>              Labels: needs-review, pull-request-available
>
> Native client is not identifying correctly an exception returned by the 
> server.
> This is a log from an exception properly identified (I have introduced "????" 
> where I think I have to hide sensitive information):
> {code}
> [error 2020/07/22 09:17:10.831337 UTC ????] 
> CacheTransactionManager::failover: An exception 
> (org.apache.geode.cache.TransactionDataNodeHasDepartedException: Could not 
> find transaction host for TXId: 
> ????(default_GeodeDS:1:loner):3:Native_gdbdedfebe1:default_GeodeDS:6022
>       at 
> org.apache.geode.internal.cache.tier.sockets.command.TXFailoverCommand.cmdExecute(TXFailoverCommand.java:126)
>       at 
> org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:177)
>       at 
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:848)
>       at 
> org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:72)
>       at 
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1212)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>       at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:676)
>       at 
> org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
>       at java.base/java.lang.Thread.run(Thread.java:834)
> ) happened at remote server.
> data-access: write_element:[TransactionDataNodeHasDepartedException] 
> ????.get(????)::apache::geode::client::TransactionDataNodeHasDepartedException:org.apache.geode.cache.TransactionDataNodeHasDepartedException:
>  Could not find transaction host for TXId: 
> ????(default_GeodeDS:1:loner):3:Native_gdbdedfebe1:default_GeodeDS:6022
> {code}
> But in this case, although the exception received is also a 
> TransactionDataNodeHasDepartedException, it is "translated" to an Unknown 
> exception:
> {code}
> [error 2020/07/22 09:17:10.979506 UTC ????] Region::containsKeyOnServer:: An 
> exception (org.apache.geode.cache.TransactionDataNodeHasDepartedException: 
> Transaction data node 192.168.240.13(dms-server-1:1)<v1>:41000 has departed. 
> To proceed, rollback this transaction and begin a new one., caused by 
> org.apache.geode.internal.cache.ForceReattemptException: PartitionResponse 
> got remote CacheClosedException
>       at 
> org.apache.geode.internal.cache.tx.PartitionedTXRegionStub.containsKey(PartitionedTXRegionStub.java:247)
>       at 
> org.apache.geode.internal.cache.TXStateStub.containsKey(TXStateStub.java:431)
>       at 
> org.apache.geode.internal.cache.TXStateProxyImpl.containsKey(TXStateProxyImpl.java:530)
>       at 
> org.apache.geode.internal.cache.PartitionedRegion.containsKey(PartitionedRegion.java:6402)
>       at 
> org.apache.geode.internal.cache.tier.sockets.command.ContainsKey66.cmdExecute(ContainsKey66.java:138)
>       at 
> org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:177)
>       at 
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:848)
>       at 
> org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:72)
>       at 
> org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1212)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>       at 
> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:676)
>       at 
> org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
>       at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.geode.internal.cache.ForceReattemptException: 
> PartitionResponse got remote CacheClosedException
>       at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage$PartitionResponse.waitForCacheException(PartitionMessage.java:837)
>       at 
> org.apache.geode.internal.cache.partitioned.ContainsKeyValueMessage$ContainsKeyValueResponse.waitForContainsResult(ContainsKeyValueMessage.java:295)
>       at 
> org.apache.geode.internal.cache.PartitionedRegion.containsKeyRemotely(PartitionedRegion.java:6558)
>       at 
> org.apache.geode.internal.cache.tx.PartitionedTXRegionStub.containsKey(PartitionedTXRegionStub.java:231)
>       ... 13 more
> Caused by: org.apache.geode.cache.CacheClosedException: Remote cache is 
> closed: 192.168.240.13(dms-server-1:1)<v1>:41000
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1630)
>       at 
> org.apache.geode.internal.cache.GemFireCacheImpl.getCacheClosedException(GemFireCacheImpl.java:1619)
>       at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:307)
>       at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
>       at 
> org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:440)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>       at 
> org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:442)
>       at 
> org.apache.geode.distributed.internal.ClusterOperationExecutors.doPartitionRegionThread(ClusterOperationExecutors.java:422)
>       ... 2 more
> ) happened at remote server.
> [info 2020/07/22 09:17:10.979514 UTC ????] error code: 12
> [fine 2020/07/22 09:17:10.979869 UTC ????] sendSyncRequest:: retry = 0
> [fine 2020/07/22 09:17:10.979874 UTC ????] sendRequestConnWithRetry:: 
> maxSendRetries = 1 
> data-access: write_element:[Exception] 
> SmSessions.containsKeyOnServer(????):apache::geode::client::UnknownException:Replay
>  is unsupported: 
> org.apache.geode.cache.TransactionDataNodeHasDepartedException: Transaction 
> data node 192.168.240.13(????)<v1>:41000 has departed. To proceed, rollback 
> this transaction and begin a new one, caused by 
> org.apache.geode.internal.cache.ForceReattemptException: PartitionResponse 
> got remote CacheClosedException
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to