RE: Question regarding node failure scenario

Roi Apelker Thu, 27 Jul 2017 09:14:34 -0700

Hi, and thanks/Arigato,

1. At first, it did seem that adding the parameter subscription-redundancy="1" 
to the client cache xml was doing the trick.
(However, I did not understand the meaning of "1" as the value: should it be 
the number of servers, meaning that with 2 servers the value should be 2?)


2. However, after a few successful trials where the client did reconnect to N2 
and the report was OK, I had a few unsuccessful trials where the client did 
not reconnect at all to N1 or N2, with the following exception:

[warning 2017/07/27 15:49:09.844 IDT eaasrt-pwinfo2 <ServerConnection on port 11715 Thread 7> tid=0xd6] Exception on server while executing function : amdocs.imdg.functions.GetPWInfo@470a9030
org.apache.geode.cache.execute.FunctionException: org.apache.geode.cache.execute.FunctionInvocationTargetException: memberDeparted event for < 192.168.2.100(eaasrt-pwinfo1:21117)<v7>:1032 > crashed =  false
        at org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:112)
        at org.apache.geode.internal.cache.tier.sockets.command.ExecuteRegionFunction66.cmdExecute(ExecuteRegionFunction66.java:245)
        at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:147)
        at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMsg(ServerConnection.java:783)
        at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doOneMessage(ServerConnection.java:913)
        at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1143)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$1$1.run(AcceptorImpl.java:546)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.geode.cache.execute.FunctionInvocationTargetException: memberDeparted event for < 192.168.2.100(eaasrt-pwinfo1:21117)<v7>:1032 > crashed =  false
        at org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.memberDeparted(PRFunctionStreamingResultCollector.java:394)
        at org.apache.geode.distributed.internal.ReplyProcessor21.processActiveMembers(ReplyProcessor21.java:590)
        at org.apache.geode.distributed.internal.ReplyProcessor21.preWait(ReplyProcessor21.java:577)
        at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:803)
        at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:789)
        at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:879)
        at org.apache.geode.internal.cache.execute.FunctionStreamingResultCollector.waitForCacheOrFunctionException(FunctionStreamingResultCollector.java:473)
        at org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:95)
        ... 9 more


3. Per Akihiro's questions:
Nodes N1 and N2 do replicate their data based on the same Replicated region in 
the same distributed system.
A Geode client updates the data.

Thanks

Roi



-----Original Message-----
From: Michael Stolz [mailto:mst...@pivotal.io] 
Sent: Monday, July 24, 2017 9:26 PM
To: dev@geode.apache.org
Subject: Re: Question regarding node failure scenario

Without subscription-redundancy you are running the risk that some of the data 
isn't being pushed to you in the event of a server failure.
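
As a rough illustration (not taken from Roi's actual configuration), a client 
cache.xml pool with subscription redundancy enabled might look like the 
fragment below. The pool name and the locator host and port are placeholders. 
The value is the number of backup (secondary) servers that keep a copy of the 
client's subscription queue in addition to the primary, so "1" is enough to 
cover the loss of one server in a two-server setup; "-1" means use all 
available servers, and "0" (the default) means no redundancy.

```xml
<!-- Sketch only: this goes inside the <client-cache> element of the client
     cache.xml. The name, host and port are placeholder values, not values
     from this thread. -->
<pool name="clientPool"
      subscription-enabled="true"
      subscription-redundancy="1">
  <locator host="locator-host" port="10334"/>
</pool>
```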


--
Mike Stolz
Principal Engineer, GemFire Product Manager
Mobile: +1-631-835-4771

On Sun, Jul 23, 2017 at 10:06 PM, Akihiro Kitada <akit...@pivotal.io> wrote:

> Hello Roi,
>
> I want to confirm your actual configuration.
>
> >- I have a replicated Node, say N1 and its replicated N2 (N2 gets
> activated when N1 is down) and they are configured to send updates via 
> continuous query to my process which then reports on these updates.
>
> Do nodes N1 and N2 replicate each other's data based on the same Replicated 
> region configuration in the same distributed system? If not, how do N1 
> and N2 replicate data? Could you attach the actual cache configuration 
> (such as cache.xml) for N1 and N2?
>
> Who updates the data: some specific Geode client application or a Geode 
> peer (cache server internally)?
>
> Thanks.
>
>
>
>
> --
> Akihiro Kitada  |  Staff Customer Engineer |  +81 80 3716 3736 
> Support.Pivotal.io <http://support.pivotal.io/>  |  Mon-Fri  9:00am to 
> 5:30pm JST  |  1-877-477-2269
>
>
> 2017-07-23 22:04 GMT+09:00 Roi Apelker <roi.apel...@amdocs.com>:
>
> > Hi,  (Bear with me I am a bit new here :))
> >
> > I have the following scenario, I wonder if anyone can comment on it 
> > - is it a known issue, maybe it was solved already in later version, 
> > etc. (I am using version 1.0.0)
> > Or maybe you can point me to somewhere in the code.
> > I have posted this question once before (thanks Dan S. for relating 
> > to it), however I was away for 3 weeks therefore posting again.
> >
> > - I have a replicated Node, say N1 and its replicated N2 (N2 gets 
> > activated when N1 is down) and they are configured to send updates 
> > via continuous query to my process which then reports on these updates.
> >
> > - N1 is working all the time and serves as a server, and data is 
> > written to it continuously from external clients. If 1000 events are 
> > written to N1, the final report I am referring to will indicate 1000.
> >
> > - When N1 gets killed for any reason, the client connects to N2 
> > which continues to send the continuous query results.
> >
> > - But sometimes, the report is inaccurate, e.g. after running 1000 events,
> > my report says 950, as if some events of the continuous query do not reach
> > the client (the actual data does arrive at its destination, just the 
> > report is wrong)
> >
> > The fact is, that the report is not accurate, and I only assume that 
> > something is wrong in the update mechanism, somewhere in the area of 
> > continuous query, or somewhere in the replication between the nodes.
> >
> > Right now, the subscription-redundancy parameter is not configured, but
> > subscription-enabled="true" is set.
> >
> > What is the significance of not configuring subscription-redundancy?
> > Does it mean that client disconnection may cause continuous query events 
> > to be discarded?
> > And if so, is it "may be discarded" or "will be discarded", meaning, 
> > will the result be always the same or not?
> >
> >
> > Thanks,
> >
> > Roi
> > This message and the information contained herein is proprietary and 
> > confidential and subject to the Amdocs policy statement,
> >
> > you may review at https://www.amdocs.com/about/email-disclaimer < 
> > https://www.amdocs.com/about/email-disclaimer>
> >
> >
>
