Thanks,

The subscription-redundancy is set to "1" and the region is used in 2 nodes (there are more nodes, which are not related to it).
Yes, there is an exception, which I have yet to understand (this exception causes the CQ to be closed on this node, and an operation message to be sent to the other node telling it to close the CQ as well):

caught exception while running: java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:51)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
    at org.apache.geode.internal.cache.tier.sockets.Message.flushBuffer(Message.java:651)
    at org.apache.geode.internal.cache.tier.sockets.Message.sendBytes(Message.java:632)
    at org.apache.geode.internal.cache.tier.sockets.ChunkedMessage.sendChunk(ChunkedMessage.java:314)
    at org.apache.geode.internal.cache.tier.sockets.ChunkedMessage.sendChunk(ChunkedMessage.java:322)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommand.writeQueryResponseChunk(BaseCommand.java:756)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommandQuery.processQueryUsingParams(BaseCommandQuery.java:225)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommandQuery.processQuery(BaseCommandQuery.java:70)
    at org.apache.geode.internal.cache.tier.sockets.command.ExecuteCQ61.cmdExecute(ExecuteCQ61.java:179)
    at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:147)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMsg(ServerConnection.java:783)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doOneMessage(ServerConnection.java:913)
    at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1143)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$1$1.run(AcceptorImpl.java:546)
    at java.lang.Thread.run(Thread.java:745)

Could it be that the client disconnected from the node right after sending this message? (The client itself continues to run normally...)

The scenario is that after all nodes are initialized, I stop one server out of 2. Sometimes, about 1 time out of 5, the CQ stops notifying the client after this stop. Most of the time the CQ continues to run fine.

I am certain this is related to some timing issue, some registration which fails, something also related to the filter profiles which are held in the region...

Thanks
Roi

-----Original Message-----
From: Anilkumar Gingade [mailto:aging...@pivotal.io]
Sent: Wednesday, August 16, 2017 1:41 AM
To: dev@geode.apache.org
Subject: Re: continuous query internal mechanism questions

In Geode, high availability for subscription events is achieved by having redundant event queues (HAQueues) on multiple servers; this is configured using the redundancy level on the client connection. Based on the redundancy level, the client registers CQs on multiple servers. During the subscription (CQ) registration, it elects/assigns one of the servers to host the primary HAQueue. The client keeps monitoring the redundancy level during node join or failure, in order to keep the redundancy level satisfied.

You can find more about HAQueues at
https://cwiki.apache.org/confluence/display/GEODE/HA+Client+Event+Queues

I assume you have a 2-node cluster. What is your subscription redundancy level?

>> For some reason, sometimes there is a failure to complete the first registration

Is there any log message or stack trace reporting the reason for the failure? If it's a dev environment, you can run the client/server with debug/fine level logging to see additional info.

Are you trying to stop your server while registering the CQs? Can you give more detail about your test scenario...

-Anil.

On Tue, Aug 15, 2017 at 11:25 AM, Jason Huynh <jasonhu...@apache.org> wrote:

> I am not quite sure how native client registers cqs.
> From my understanding:
> with the java api, I believe there is only one message (ExecuteCQ
> message) that is executed on the server side and then replicated to
> the other nodes through the profile (OperationMessage).
>
> It seems the extra ExecuteCQ message failing and then closing the cq
> might be putting the system in a weird state...
>
> On Tue, Aug 15, 2017 at 7:56 AM Roi Apelker <roi.apel...@amdocs.com>
> wrote:
>
> > Hi,
> >
> > I have been examining the continuous query registration mechanism
> > for quite some time. This is related to an issue that I have, where
> > sometimes a node crashes (1 node out of 2), and the other one does
> > not send CQ events. The CQ is registered on a partitioned region
> > which resides on these 2 nodes.
> >
> > I noticed the following behavior, and I wonder if anyone can comment
> > on it, whether it is justified or not, and what the reason is:
> >
> > 1. When the software using the client (native client) registers for
> > the CQ, a CQ command (ExecuteCQ61) is received on both servers.
> > -- is this normal behaviour? Does the client actually send this
> > command to both servers?
> >
> > 2. When this command is received by a server, and the CQ is
> > registered, another registration message is sent to the other node
> > via an OperationMessage (REGISTER_CQ).
> > -- it seems that normally the server can handle this situation, as
> > the second registration identifies the previous one and does not
> > affect it. But the question is, why do we need this 2nd
> > registration, if there is a command sent to each server?
> >
> > 3. For some reason, sometimes there is a failure to complete the
> > first registration (executed by ExecuteCQ61), and this failure
> > causes the CQ to be closed, which is accompanied by a close
> > request to the other node.
> > -- I assume that by now, since 2 registrations and one closure have
> > occurred on node 2, the CQ is still active and the client receives
> > notifications.
> >
> > 4. Sometimes, 1 out of 5, once node 1 crashes, I get a cleanup
> > operation, caused by the crash (via MemberCrashedEvent), and this
> > also closes the existing CQ, and in this case the CQ in node 2 does
> > not operate anymore and the client receives no notifications.
> > -- the fact is that 4 out of 5 times, I do not get this cleanup by
> > MemberCrashedEvent (maybe due to some other error), and the CQ
> > notifications are received normally.
> >
> > Can anyone clear things up for me? Any comment on any of the
> > statements above will be greatly appreciated.
> >
> > Thanks,
> >
> > Roi
> >
> > -----Original Message-----
> > From: Roi Apelker
> > Sent: Wednesday, August 09, 2017 3:21 PM
> > To: dev@geode.apache.org
> > Subject: RE: continuous query internal mechanism
> >
> > Thank you
> >
> > -----Original Message-----
> > From: Anilkumar Gingade [mailto:aging...@pivotal.io]
> > Sent: Tuesday, August 08, 2017 9:55 PM
> > To: dev@geode.apache.org
> > Subject: Re: continuous query internal mechanism
> >
> > By registered events, I meant the events generated for interest
> > registration, "region.registerInterest(*)". And CqEvents are for the
> > CQs registered.
> >
> > -Anil.
> >
> > On Tue, Aug 8, 2017 at 12:27 AM, Roi Apelker
> > <roi.apel...@amdocs.com> wrote:
> >
> > > Thanks
> > >
> > > What is the difference between registered events and CQ events?
> > >
> > > -----Original Message-----
> > > From: Anilkumar Gingade [mailto:aging...@pivotal.io]
> > > Sent: Monday, August 07, 2017 10:12 PM
> > > To: dev@geode.apache.org
> > > Subject: Re: continuous query internal mechanism
> > >
> > > CQ processing on the server side is the same for all clients
> > > (Java, C++)...
> > >
> > > The subscription events are sent to the client as
> > > ClientUpdateMessage, which holds information about registered
> > > events and CQ events. The client processes this and
> > > updates/invokes the client-side cache/listeners with the
> > > respective event.
> > > Look into ClientUpdateMessageImpl and CacheClientUpdater (for
> > > client-side processing).
> > >
> > > -Anil.
> > >
> > > On Mon, Aug 7, 2017 at 11:01 AM, Roi Apelker
> > > <roi.apel...@amdocs.com> wrote:
> > >
> > > > Thanks,
> > > >
> > > > By the way, is there any difference in the behaviour of the
> > > > server if the client that registered the CQ is a native (C++)
> > > > client?
> > > >
> > > > I have been going over the classes and code for some time and
> > > > can't seem to find the actual location where a CQ
> > > > update/notification is sent...
> > > >
> > > > It's like the CqEventImpl class is never even generated in this
> > > > scenario.
> > > >
> > > > If anyone can help here I would be most grateful :-)
> > > >
> > > > Thanks
> > > >
> > > > Roi
> > > >
> > > > -----Original Message-----
> > > > From: Anilkumar Gingade [mailto:aging...@pivotal.io]
> > > > Sent: Monday, August 07, 2017 8:23 PM
> > > > To: dev@geode.apache.org
> > > > Subject: Re: continuous query internal mechanism
> > > >
> > > > You can find those in CqServiceImpl.process*()...
> > > >
> > > > -Anil.
> > > >
> > > > On Mon, Aug 7, 2017 at 9:14 AM, Roi Apelker
> > > > <roi.apel...@amdocs.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I am trying to look into the code of the continuous query
> > > > > mechanism, where the GEODE server sends the notification back
> > > > > to the client.
> > > > >
> > > > > Can anyone point me to the central classes of continuous
> > > > > query, especially to the one that is responsible for the
> > > > > calculation of the new data and packing it as a message back
> > > > > to the client?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Roi
> > > > >
> > > > > This message and the information contained herein is
> > > > > proprietary and confidential and subject to the Amdocs policy
> > > > > statement, you may review at
> > > > > https://www.amdocs.com/about/email-disclaimer
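
For readers following the thread, Anil's description of subscription redundancy (the client registers its CQ on several servers, one of which hosts the primary HAQueue, and re-checks the redundancy level when a node joins or fails) can be pictured with a small toy model. This is only a sketch and not Geode code: the class SubscriptionModel and all of its methods are hypothetical names invented here, and the rule "promote the first secondary when the primary crashes" is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of client-side subscription redundancy. Hypothetical, not Geode code. */
class SubscriptionModel {
    private final int redundancy;                       // plays the role of subscription-redundancy
    private final Set<String> liveServers = new LinkedHashSet<>();
    private final List<String> secondaries = new ArrayList<>();
    private String primary;                             // server hosting the primary queue

    SubscriptionModel(int redundancy, List<String> servers) {
        this.redundancy = redundancy;
        liveServers.addAll(servers);
        rebalance();
    }

    /** Servers on which the client would register its CQ: primary first, then secondaries. */
    List<String> cqHosts() {
        List<String> hosts = new ArrayList<>();
        if (primary != null) hosts.add(primary);
        hosts.addAll(secondaries);
        return hosts;
    }

    String primary() { return primary; }

    boolean redundancySatisfied() {
        return primary != null && secondaries.size() == redundancy;
    }

    void serverCrashed(String server) {
        liveServers.remove(server);
        secondaries.remove(server);
        if (server.equals(primary)) primary = null;
        rebalance();
    }

    void serverJoined(String server) {
        liveServers.add(server);
        rebalance();
    }

    /** Assumed policy: promote a secondary if the primary is gone, then refill secondaries. */
    private void rebalance() {
        if (primary == null && !secondaries.isEmpty()) primary = secondaries.remove(0);
        for (String s : liveServers) {
            if (primary == null) { primary = s; continue; }
            if (s.equals(primary) || secondaries.contains(s)) continue;
            if (secondaries.size() < redundancy) secondaries.add(s);
        }
    }

    public static void main(String[] args) {
        SubscriptionModel m = new SubscriptionModel(1, Arrays.asList("server1", "server2"));
        System.out.println(m.cqHosts());                 // prints [server1, server2]
        m.serverCrashed("server1");
        System.out.println(m.primary() + " " + m.redundancySatisfied()); // prints server2 false
    }
}
```

In this model, a redundancy level of 1 on a 2-node cluster means the CQ is registered on both servers, which would be consistent with ExecuteCQ61 arriving at both nodes as described in item 1 of Roi's message. After one server is stopped, the surviving server becomes primary and the redundancy level can no longer be satisfied until another server joins.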