Hello Roi,

> 1. At first, it did seem that adding the parameter subscription-redundancy="1" to the client cache xml was doing the trick.
> (I did not understand however what is the meaning of "1" as the value - should it be the number of servers here - meaning if there are 2 servers the value should be 2?)

It means the number of redundant subscription connections, not the number of servers. With subscription-redundancy="1", your client establishes its primary subscription connection to one server and one secondary (redundant) subscription connection to another server. So if you have 2 servers, subscription-redundancy="1" is fine.
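For reference, here is a minimal programmatic sketch of the same client configuration using ClientCacheFactory; it should be equivalent to setting subscription-enabled="true" and subscription-redundancy="1" on the pool in the client cache xml. The locator host and port below are placeholders for your environment.

import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;

public class SubscriptionRedundancyExample {
  public static void main(String[] args) {
    // "locator-host" and 10334 are placeholders -- point them at your locator.
    ClientCache cache = new ClientCacheFactory()
        .addPoolLocator("locator-host", 10334)
        .setPoolSubscriptionEnabled(true)    // same effect as subscription-enabled="true"
        .setPoolSubscriptionRedundancy(1)    // same effect as subscription-redundancy="1"
        .create();
    try {
      // ... obtain regions and register interest / continuous queries here ...
    } finally {
      cache.close();
    }
  }
}

With 2 servers, the client keeps its primary subscription queue on one server and a redundant copy on the other, so queued subscription events should survive the failure of either server.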
> 2. However, after a few successful trials where the client did reconnect to N2 and the report was OK, I had a few unsuccessful trials,
> where the client did not reconnect at all to N1 or N2... :
> org.apache.geode.cache.execute.FunctionException: org.apache.geode.cache.execute.FunctionInvocationTargetException:
> memberDeparted event for < 192.168.2.100(eaasrt-pwinfo1:21117)<v7>:1032 > crashed = false

I'm not sure why your application caught this exception, but what it caught in this case is a FunctionInvocationTargetException. According to the following API docs, you may re-execute your Function in your application if you catch this exception:
http://geode.apache.org/releases/latest/javadoc/org/apache/geode/cache/execute/FunctionInvocationTargetException.html
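In case it helps, here is a minimal client-side sketch of that re-execution pattern (a sketch only, not something specific to your application). The function id and region in the usage comment are placeholders; the "GetPWInfo" name is only borrowed from your log and may not match your actual registration.

import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.Execution;
import org.apache.geode.cache.execute.FunctionException;
import org.apache.geode.cache.execute.FunctionInvocationTargetException;
import org.apache.geode.cache.execute.FunctionService;

public class RetryingFunctionCall {

  // Re-executes the function when the target member departs mid-execution.
  // Assumes maxAttempts >= 1.
  public static Object executeWithRetry(Region<?, ?> region, String functionId, int maxAttempts) {
    FunctionException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        Execution execution = FunctionService.onRegion(region);
        return execution.execute(functionId).getResult();
      } catch (FunctionException e) {
        if (e instanceof FunctionInvocationTargetException
            || e.getCause() instanceof FunctionInvocationTargetException) {
          last = e;      // member departed -- try again on the remaining members
          continue;
        }
        throw e;         // any other failure is rethrown unchanged
      }
    }
    throw last;
  }

  // Example usage (names are placeholders): executeWithRetry(pwInfoRegion, "GetPWInfo", 3);
}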
> 3. Per Akihiro's questions:
> node N1 and N2 do replicate each data based on the same Replicated region in the same distributed system
> A Geode client updates the data.

Thank you for the information. Yes, that matches my understanding.

--
Akihiro Kitada | Staff Customer Engineer | +81 80 3716 3736
Support.Pivotal.io <http://support.pivotal.io/> | Mon-Fri 9:00am to 5:30pm JST | 1-877-477-2269

2017-07-28 1:13 GMT+09:00 Roi Apelker <roi.apel...@amdocs.com>:

> Hi, and thanks/Arigato,
>
> 1. At first, it did seem that adding the parameter
> subscription-redundancy="1" to the client cache xml was doing the trick.
> (I did not understand however what is the meaning of "1" as the value -
> should it be the number of servers here - meaning if there are 2 servers
> the value should be 2?)
>
> 2. However, after a few successful trials where the client did reconnect
> to N2 and the report was OK, I had a few unsuccessful trials, where the
> client did not reconnect at all to N1 or N2...
>
> with the following exception:
>
> [warning 2017/07/27 15:49:09.844 IDT eaasrt-pwinfo2 <ServerConnection on
> port 11715 Thread 7> tid=0xd6] Exception on server while executing function
> : amdocs.imdg.functions.GetPWInfo@470a9030
> org.apache.geode.cache.execute.FunctionException: org.apache.geode.cache.execute.FunctionInvocationTargetException:
> memberDeparted event for < 192.168.2.100(eaasrt-pwinfo1:21117)<v7>:1032 > crashed = false
>         at org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:112)
>         at org.apache.geode.internal.cache.tier.sockets.command.ExecuteRegionFunction66.cmdExecute(ExecuteRegionFunction66.java:245)
>         at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:147)
>         at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMsg(ServerConnection.java:783)
>         at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doOneMessage(ServerConnection.java:913)
>         at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1143)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$1$1.run(AcceptorImpl.java:546)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.geode.cache.execute.FunctionInvocationTargetException:
> memberDeparted event for < 192.168.2.100(eaasrt-pwinfo1:21117)<v7>:1032 > crashed = false
>         at org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.memberDeparted(PRFunctionStreamingResultCollector.java:394)
>         at org.apache.geode.distributed.internal.ReplyProcessor21.processActiveMembers(ReplyProcessor21.java:590)
>         at org.apache.geode.distributed.internal.ReplyProcessor21.preWait(ReplyProcessor21.java:577)
>         at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:803)
>         at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:789)
>         at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:879)
>         at org.apache.geode.internal.cache.execute.FunctionStreamingResultCollector.waitForCacheOrFunctionException(FunctionStreamingResultCollector.java:473)
>         at org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:95)
>         ... 9 more
>
> 3. Per Akihiro's questions:
> node N1 and N2 do replicate each data based on the same Replicated region
> in the same distributed system
> A Geode client updates the data.
>
> Thanks
>
> Roi
>
> -----Original Message-----
> From: Michael Stolz [mailto:mst...@pivotal.io]
> Sent: Monday, July 24, 2017 9:26 PM
> To: dev@geode.apache.org
> Subject: Re: Question regarding node failure scenario
>
> Without subscription-redundancy you are running the risk that some of the
> data isn't being pushed to you in the event of a server failure.
>
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager
> Mobile: +1-631-835-4771
>
> On Sun, Jul 23, 2017 at 10:06 PM, Akihiro Kitada <akit...@pivotal.io>
> wrote:
>
> > Hello Roi,
> >
> > I want to confirm your actual configuration.
> >
> > > - I have a replicated Node, say N1 and its replicated N2 (N2 gets
> > activated when N1 is down) and they are configured to send updates via
> > continuous query to my process which then reports on these updates.
> >
> > Do node N1 and N2 replicate each data based on the same Replicated
> > region configuration in the same distributed system? If not, how do N1
> > and N2 replicate data? Could you attach the actual cache configuration
> > (such as cache.xml) for N1 and N2?
> >
> > Who updates the data, some specific Geode client application or a Geode
> > peer (cache server internally)?
> >
> > Thanks.
> >
> > --
> > Akihiro Kitada | Staff Customer Engineer | +81 80 3716 3736
> > Support.Pivotal.io <http://support.pivotal.io/> | Mon-Fri 9:00am to
> > 5:30pm JST | 1-877-477-2269
> >
> > 2017-07-23 22:04 GMT+09:00 Roi Apelker <roi.apel...@amdocs.com>:
> >
> > > Hi, (Bear with me I am a bit new here :))
> > >
> > > I have the following scenario, I wonder if anyone can comment on it
> > > - is it a known issue, maybe it was solved already in a later version,
> > > etc. (I am using version 1.0.0)
> > > Or maybe you can point me to somewhere in the code.
> > > I have posted this question once before (thanks Dan S. for relating
> > > to it), however I was away for 3 weeks therefore posting again.
> > >
> > > - I have a replicated Node, say N1 and its replicated N2 (N2 gets
> > > activated when N1 is down) and they are configured to send updates
> > > via continuous query to my process which then reports on these updates.
> > >
> > > - N1 is working all the time and serves as a server, and data is
> > > written to it continuously from external clients. If 1000 events are
> > > written to N1, the final report I am referring to will indicate 1000.
> > >
> > > - When N1 gets killed for any reason, the client connects to N2
> > > which continues to send the continuous query results.
> > >
> > > - But sometimes, the report is inaccurate, e.g. after running 1000
> > > events, my report says 950, as if some events of the continuous query
> > > do not reach the client (the actual data does arrive at its destination,
> > > just the report is wrong)
> > >
> > > The fact is, that the report is not accurate, and I only assume that
> > > something is wrong in the update mechanism, somewhere in the area of
> > > continuous query, or somewhere in the replication between the nodes.
> > >
> > > Right now, the parameter subscription-redundancy is not configured,
> > > but subscription-enabled="true".
> > >
> > > What is the significance of not configuring subscription-redundancy?
> > > Does it mean that client disconnection may cause continuous query
> > > events to be discarded?
> > > And if so, is it "may be discarded" or "will be discarded", meaning,
> > > will the result be always the same or not?
> > >
> > > Thanks,
> > >
> > > Roi