[ 
https://issues.apache.org/jira/browse/GEODE-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans resolved GEODE-9310.
--------------------------------
    Resolution: Invalid

With the removal of function execution in GEODE-9168, this issue is no longer 
valid, as the code described has been removed.

> Geode redis should send error message when encounter 
> DistributedSystemDisconnectedException, ForcedDisconnectException or 
> FunctionInvocationTargetException
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-9310
>                 URL: https://issues.apache.org/jira/browse/GEODE-9310
>             Project: Geode
>          Issue Type: Bug
>          Components: redis
>            Reporter: Eric Shu
>            Priority: Major
>
> Currently when redis function execution encounter these exception, it will 
> just close the client connection. 
> {code:java}
> else if (cause instanceof FunctionInvocationTargetException
>         || cause instanceof DistributedSystemDisconnectedException
>         || cause instanceof ForcedDisconnectException) {
>       // This indicates a member departed or got disconnected
>       logger.warn(
>           "Closing client connection because one of the servers doing this 
> operation departed.");
>       channelInactive(ctx);
>       response = null;
>     }
> {code}
> This is incorrect, as this could lead to client performing retries of the 
> same redis command.
> The correct way is to send an error response that command may or may not have 
> been executed on servers, user need to verify and retry if necessary.
> Here is a test result showing that current behavior of client retrying:
> The node executing the ZREM command:
> (vm1 gets the connection from JedisCluster client)
> [vm1] [warn 2021/05/25 15:31:26.732 PDT server-1 <GeodeRedisServer-Command-1> 
> tid=0xfb] Executing Redis command: ZREM key member1-212 member1-211 
> member1-214 member1-213 member1-210 member1-209 member1-208 member1-205 
> member1-204 
> (primary is on vm3 and it is execute the function)
> [vm3] [warn 2021/05/25 15:31:26.692 PDT server-3 <Function Execution 
> Processor3> tid=0xf3] bucket region is 
> BucketRegion[path='/__PR/_B__REDIS__DATA_123;serial=399;primary=true] key key
> [vm3] java.lang.Exception
> [vm3]         at 
> org.apache.geode.internal.cache.BucketRegion.virtualPut(BucketRegion.java:533)
> [vm3]         at 
> org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5586)
> [vm3]         at 
> org.apache.geode.internal.cache.PartitionedRegionDataStore.putLocally(PartitionedRegionDataStore.java:1213)
> [vm3]         at 
> org.apache.geode.internal.cache.PartitionedRegion.putInBucket(PartitionedRegion.java:3024)
> [vm3]         at 
> org.apache.geode.internal.cache.PartitionedRegion.virtualPut(PartitionedRegion.java:2236)
> [vm3]         at 
> org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5586)
> [vm3]         at 
> org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:157)
> [vm3]         at 
> org.apache.geode.internal.cache.LocalRegion.basicPut(LocalRegion.java:5044)
> [vm3]         at 
> org.apache.geode.internal.cache.LocalRegion.validatedPut(LocalRegion.java:1645)
> [vm3]         at 
> org.apache.geode.internal.cache.LocalRegion.put(LocalRegion.java:1632)
> [vm3]         at 
> org.apache.geode.internal.cache.AbstractRegion.put(AbstractRegion.java:445)
> [vm3]         at 
> org.apache.geode.redis.internal.data.AbstractRedisData.storeChanges(AbstractRedisData.java:218)
> [vm3]         at 
> org.apache.geode.redis.internal.data.RedisSortedSet.zrem(RedisSortedSet.java:239)
> [vm3]         at 
> org.apache.geode.redis.internal.data.RedisSortedSetCommandsFunctionExecutor.lambda$zrem$2(RedisSortedSetCommandsFunctionExecutor.java:50)
> [vm3]         at 
> org.apache.geode.redis.internal.executor.SynchronizedStripedExecutor.execute(SynchronizedStripedExecutor.java:47)
> [vm3]         at 
> org.apache.geode.redis.internal.data.RedisDataCommandsFunctionExecutor.stripedExecute(RedisDataCommandsFunctionExecutor.java:43)
> [vm3]         at 
> org.apache.geode.redis.internal.data.RedisSortedSetCommandsFunctionExecutor.zrem(RedisSortedSetCommandsFunctionExecutor.java:49)
> [vm3]         at 
> org.apache.geode.redis.internal.executor.CommandFunction.compute(CommandFunction.java:289)
> [vm3]         at 
> org.apache.geode.redis.internal.executor.SingleResultRedisFunction.lambda$execute$0(SingleResultRedisFunction.java:47)
> [vm3]         at 
> org.apache.geode.internal.cache.PartitionedRegion.computeWithPrimaryLocked(PartitionedRegion.java:629)
> [vm3]         at 
> org.apache.geode.redis.internal.executor.SingleResultRedisFunction.execute(SingleResultRedisFunction.java:51)
> [vm3]         at 
> org.apache.geode.internal.cache.PartitionedRegionDataStore.executeOnDataStore(PartitionedRegionDataStore.java:2988)
> [vm3]         at 
> org.apache.geode.internal.cache.partitioned.PartitionedRegionFunctionStreamingMessage.operateOnPartitionedRegion(PartitionedRegionFunctionStreamingMessage.java:101)
> [vm3]         at 
> org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:333)
> [vm3]         at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
> [vm3]         at 
> org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441)
> The operation has been distributed to the bucket with redundant copy.
> [vm1] [warn 2021/05/25 15:31:26.694 PDT server-1 <P2P message reader for 
> 192.168.0.14(server-3:79774)<v3>:41003 shared ordered sender uid=19 local 
> port=54410 remote port=51127> tid=0xf8] bucket region is 
> BucketRegion[path='/__PR/_B__REDIS__DATA_123;serial=399;primary=false] key key
> [vm1] java.lang.Exception
> [vm1]         at 
> org.apache.geode.internal.cache.BucketRegion.virtualPut(BucketRegion.java:533)
> [vm1]         at 
> org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:171)
> [vm1]         at 
> org.apache.geode.internal.cache.LocalRegion.basicUpdate(LocalRegion.java:5579)
> [vm1]         at 
> org.apache.geode.internal.cache.AbstractUpdateOperation.doPutOrCreate(AbstractUpdateOperation.java:156)
> [vm1]         at 
> org.apache.geode.internal.cache.AbstractUpdateOperation$AbstractUpdateMessage.basicOperateOnRegion(AbstractUpdateOperation.java:307)
> [vm1]         at 
> org.apache.geode.internal.cache.AbstractUpdateOperation$AbstractUpdateMessage.operateOnRegion(AbstractUpdateOperation.java:278)
> [vm1]         at 
> org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1213)
> [vm1]         at 
> org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1115)
> [vm1]         at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
> [vm1]         at 
> org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:432)
> [vm1]         at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.scheduleIncomingMessage(ClusterDistributionManager.java:2066)
> [vm1]         at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.handleIncomingDMsg(ClusterDistributionManager.java:1831)
> [vm1]         at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.dispatchMessage(GMSMembership.java:847)
> [vm1]         at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.handleOrDeferMessage(GMSMembership.java:804)
> [vm1]         at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.processMessage(GMSMembership.java:1098)
> [vm1]         at 
> org.apache.geode.distributed.internal.DistributionImpl$MyDCReceiver.messageReceived(DistributionImpl.java:828)
> [vm1]         at 
> org.apache.geode.distributed.internal.direct.DirectChannel.receive(DirectChannel.java:622)
> [vm1]         at 
> org.apache.geode.internal.tcp.TCPConduit.messageReceived(TCPConduit.java:682)
> [vm1]         at 
> org.apache.geode.internal.tcp.Connection.dispatchMessage(Connection.java:3271)
> [vm1]         at 
> org.apache.geode.internal.tcp.Connection.readMessage(Connection.java:3022)
> [vm1]         at 
> org.apache.geode.internal.tcp.Connection.processInputBuffer(Connection.java:2809)
> [vm1]         at 
> org.apache.geode.internal.tcp.Connection.readMessages(Connection.java:1647)
> [vm1]         at 
> org.apache.geode.internal.tcp.Connection.run(Connection.java:1487)
> [vm1]         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [vm1]         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [vm1]         at java.lang.Thread.run(Thread.java:748)
> [vm1] [warn 2021/05/25 15:31:26.714 PDT server-1 <P2P message reader for 
> 192.168.0.14(server-3:79774)<v3>:41003 shared ordered sender uid=19 local 
> port=54410 remote port=51127> tid=0xf8] membersRemoveAll invoked
> [vm1] java.lang.Exception
> [vm1]         at 
> org.apache.geode.redis.internal.data.RedisSortedSet.membersRemoveAll(RedisSortedSet.java:175)
> [vm1]         at 
> org.apache.geode.redis.internal.data.RedisSortedSet.applyDelta(RedisSortedSet.java:89)
> [vm1]         at 
> org.apache.geode.redis.internal.data.AbstractRedisData.fromDelta(AbstractRedisData.java:193)
> [vm1]         at 
> org.apache.geode.internal.cache.EntryEventImpl.processDeltaBytes(EntryEventImpl.java:1841)
> [vm1]         at 
> org.apache.geode.internal.cache.EntryEventImpl.setNewValueInRegion(EntryEventImpl.java:1696)
> [vm1]         at 
> org.apache.geode.internal.cache.EntryEventImpl.putExistingEntry(EntryEventImpl.java:1643)
> [vm1]         at 
> org.apache.geode.internal.cache.map.RegionMapPut.updateEntry(RegionMapPut.java:485)
> (vm3 is shutdown)
> [vm3] [info 2021/05/25 15:31:26.704 PDT server-3 <RMI TCP 
> Connection(1)-192.168.0.14> tid=0x14] GemFireCache[id = 2015401929; isClosing 
> = true; isShutDownAll = false; created = Tue May 25 15:31:24 PDT 2021; server 
> = false; copyOnRead = false; lockLease = 120; lockTimeout = 60]: Now closing.
> [vm3] [info 2021/05/25 15:31:26.706 PDT server-3 <RMI TCP 
> Connection(1)-192.168.0.14> tid=0x14] GeodeRedisServer shutting down
> vm1 reports command failed (even though it succeeded and was distributed to 
> replicas)
> [vm1] [warn 2021/05/25 15:31:26.728 PDT server-1 <GeodeRedisServer-Command-1> 
> tid=0xfb] Execution of Redis command ZREM key member1-212 member1-211 
> member1-214 member1-213 member1-210 member1-209 member1-208 member1-205 
> member1-204 
> failed: org.apache.geode.cache.execute.FunctionException: 
> org.apache.geode.cache.execute.FunctionInvocationTargetException: The cache 
> is closed.
> [vm1] [warn 2021/05/25 15:31:26.729 PDT server-1 <GeodeRedisServer-Command-1> 
> tid=0xfb] Closing client connection because one of the servers doing this 
> operation departed.
> This leads to JedisCluster client to retry with the same command:
> [vm1] [warn 2021/05/25 15:31:26.732 PDT server-1 <GeodeRedisServer-Command-1> 
> tid=0xfb] Executing Redis command: ZREM key member1-212 member1-211 
> member1-214 member1-213 member1-210 member1-209 member1-208 member1-205 
> member1-204 
> [vm1] [warn 2021/05/25 15:31:26.729 PDT server-1 <GeodeRedisServer-Command-1> 
> tid=0xfb] Closing client connection because one of the servers doing this 
> operation departed.
> [vm1] [warn 2021/05/25 15:31:26.732 PDT server-1 <GeodeRedisServer-Command-1> 
> tid=0xfb] Executing Redis command: ZREM key member1-212 member1-211 
> member1-214 member1-213 member1-210 member1-209 member1-208 member1-205 
> member1-204 
> ...
> [vm1] [warn 2021/05/25 15:31:26.737 PDT server-1 <GeodeRedisServer-Command-1> 
> tid=0xfb] Executing Redis command: ZREM key member1-212 member1-211 
> member1-214 member1-213 member1-210 member1-209 member1-208 member1-205 
> member1-204 
> ...
> [vm1] [warn 2021/05/25 15:31:26.739 PDT server-1 <GeodeRedisServer-Command-1> 
> tid=0xfb] Execution of Redis command ZREM key member1-212 member1-211 
> member1-214 member1-213 member1-210 member1-209 member1-208 member1-205 
> member1-204 failed: org.apache.geode.cache.execute.FunctionException: 
> org.apache.geode.cache.execute.FunctionInvocationTargetException: Remote 
> cache is closed: 192.168.0.14(server-3:79774)<v3>:41003
> ...
> Currently the retries aborted due to the following issue by JedisCluster 
> client (which is tracked by redis issue of not back off and retry on 
> different node: https://github.com/redis/jedis/issues/2347)
> java.util.concurrent.ExecutionException: 
> redis.clients.jedis.exceptions.JedisClusterMaxAttemptsException: No more 
> cluster attempts left.
>       at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>       at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>       at 
> org.apache.geode.redis.internal.executor.sortedset.ZRemDUnitTest.zRemCanRemoveMembersFromSortedSetDuringPrimaryIsCrashed(ZRemDUnitTest.java:177)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at 
> org.apache.geode.test.junit.rules.serializable.SerializableExternalResource$1.evaluate(SerializableExternalResource.java:38)
>       at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>       at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>       at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at 
> org.apache.geode.test.dunit.rules.ClusterStartupRule$1.evaluate(ClusterStartupRule.java:138)
>       at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>       at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>       at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>       at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:54)
>       at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>       at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: redis.clients.jedis.exceptions.JedisClusterMaxAttemptsException: 
> No more cluster attempts left.
>       at 
> redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:84)
>       at 
> redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
>       at 
> redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
>       at 
> redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
>       at 
> redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
>       at 
> redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:122)
>       at 
> redis.clients.jedis.JedisClusterCommand.run(JedisClusterCommand.java:25)
>       at redis.clients.jedis.JedisCluster.zrem(JedisCluster.java:984)
>       at 
> org.apache.geode.redis.internal.executor.sortedset.ZRemDUnitTest.doZRem(ZRemDUnitTest.java:157)
>       at 
> org.apache.geode.redis.internal.executor.sortedset.ZRemDUnitTest.lambda$zRemCanRemoveMembersFromSortedSetDuringPrimaryIsCrashed$0(ZRemDUnitTest.java:174)
>       at 
> org.apache.geode.test.junit.rules.ExecutorServiceRule.lambda$submit$2(ExecutorServiceRule.java:232)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Once this is solved by JedisCluter client, the retry may cause wrong number 
> being reported by the retried command.
> It is possible that this behavior may impact on APPEND and INCRBY command -- 
> which would produce wrong results of the retried command (subject to 
> additional testing).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to