Re: [DISCUSS] Redundancy Gfsh Commands

2020-04-02 Thread Jacob Barrett


> On Apr 1, 2020, at 10:46 AM, Donal Evans  wrote:
> 
> There's a subtlety with the second no-op case though, since you could have
> a situation where you call the command with no arguments (include all
> regions) and don't find any partitioned regions, which would be fine

I think this case is not an error: "all" simply means every region the command
could apply to, even when there are none.

> or you could have a situation where you explicitly include some regions and
> none of them are found, in which case I'm not sure that returning success
> would be correct. Would it be reasonable to return error in the case that
> all explicitly included regions aren't found?

I believe this should be the behavior. If any explicitly listed region does
not exist, the command should return an error.
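
The rule discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: RegionSelection, Outcome, check and missing are hypothetical names, not part of gfsh or the RFC, and the set of existing partitioned regions is passed in rather than looked up from a cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not Geode API: decides the outcome of the region
// lookup step from the regions requested and the regions that exist.
final class RegionSelection {
    enum Outcome { SUCCESS, ERROR }

    // An empty includeRegions list stands for "no arguments", i.e. all
    // partitioned regions. Finding none is then a valid no-op.
    static Outcome check(List<String> includeRegions, Set<String> existingRegions) {
        if (includeRegions.isEmpty()) {
            return Outcome.SUCCESS;
        }
        // Any explicitly listed region that does not exist is an error.
        return missing(includeRegions, existingRegions).isEmpty()
            ? Outcome.SUCCESS
            : Outcome.ERROR;
    }

    // Returns the explicitly listed regions that were not found.
    static List<String> missing(List<String> includeRegions, Set<String> existingRegions) {
        List<String> notFound = new ArrayList<>();
        for (String name : includeRegions) {
            if (!existingRegions.contains(name)) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        Set<String> existing = Set.of("orders");
        System.out.println(check(List.of(), Set.of()));
        System.out.println(check(List.of("orders"), existing));
        System.out.println(check(List.of("orders", "customers"), existing));
    }
}
```

Returning the list of missing names separately would let the command report exactly which regions were not found.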

-Jake



Re: [DISCUSS] Redundancy Gfsh Commands

2020-04-02 Thread Aaron Lindsey
> Would it be reasonable to return error in the case that
> all explicitly included regions aren't found?

Yes, this sounds reasonable. Thanks for pointing out that subtlety and for 
updating the RFC.

From the RFC:
> The command will return error status if:

I assume this means ERROR or FAILURE (non-success) status. It seems a little 
confusing that there are both ERROR and FAILURE statuses. Maybe you could add 
another section, “the command will return failure status if:” to make it clear 
when it will be returning an error vs a failure. I think this will be important 
for users who are triggering the redundancy restore operation programmatically 
to understand the difference.




In-progress RFCs?

2020-04-02 Thread Anthony Baker
I was reviewing the list of RFCs still under discussion and noticed that the
following may need to be moved to a different status:

Classloader Isolation [1] - Udo
Logging to standard out [2] - Jake
Replace singleton PoolManager with ClientCache scoped service [3] - Dan
Certificate based authorization [4] - Mario


Could the authors check the status and either extend the discussion date or 
move to the correct status?  Thanks!

Anthony


[1] https://cwiki.apache.org/confluence/display/GEODE/ClassLoader+Isolation
[2] https://cwiki.apache.org/confluence/display/GEODE/Logging+to+Standard+Out
[3] https://cwiki.apache.org/confluence/display/GEODE/Replace+singleton+PoolManager+with+ClientCache+scoped+service
[4] https://cwiki.apache.org/confluence/display/GEODE/Certificate+Based+Authorization



Re: RFC - Gateway sender to deliver transaction events atomically to receivers

2020-04-02 Thread Alberto Gomez
Hi,

Yesterday was the end date for comments for this RFC.

I tried to answer the questions that were sent and also address the concerns 
about the proposal.

The main concern was the reordering of events that could happen in the gateway
sender when it groups events of the same transaction into the same batch. My
conclusion was that such reordering, if it happens, is not incorrect: it would
only affect events that are very close in time, and events can already be
reordered between the time they are generated and the time they reach the
sender's queue.

There was also a concern that adding a new field to each EntryEvent would
increase the size of the over-the-wire format for everyone. The new attribute
in EntryEvent is no longer needed; the new version of the proposal only adds
isLastTransactionEvent to the GatewaySenderEvent class.
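
To make the batching and reordering discussion concrete, here is a minimal sketch of how a sender might use such a flag to avoid splitting a transaction across batches. It is an assumed reading of the proposal, not the RFC's implementation: Event is a hypothetical stand-in for GatewaySenderEvent, and txId, nextBatch and batchSize are illustrative names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Sketch only: Event is a hypothetical stand-in for GatewaySenderEvent and
// nextBatch is an assumed reading of the proposal, not Geode code.
final class TransactionBatcher {

    static final class Event {
        final String key;
        final String txId; // null when the event is not part of a transaction
        final boolean isLastTransactionEvent;

        Event(String key, String txId, boolean isLastTransactionEvent) {
            this.key = key;
            this.txId = txId;
            this.isLastTransactionEvent = isLastTransactionEvent;
        }
    }

    // Removes up to batchSize events from the head of the queue, then keeps
    // pulling only events of transactions that are still incomplete.
    static List<Event> nextBatch(List<Event> queue, int batchSize) {
        List<Event> batch = new ArrayList<>();
        Set<String> openTransactions = new HashSet<>();
        Iterator<Event> it = queue.iterator();
        while (it.hasNext()) {
            Event event = it.next();
            boolean full = batch.size() >= batchSize;
            if (full && openTransactions.isEmpty()) {
                break; // batch is full and every transaction in it is complete
            }
            boolean completesOpenTx =
                event.txId != null && openTransactions.contains(event.txId);
            if (full && !completesOpenTx) {
                continue; // unrelated event stays queued for the next batch
            }
            it.remove();
            batch.add(event);
            if (event.txId != null) {
                if (event.isLastTransactionEvent) {
                    openTransactions.remove(event.txId);
                } else {
                    openTransactions.add(event.txId);
                }
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        List<Event> queue = new ArrayList<>();
        queue.add(new Event("a", "tx1", false));
        queue.add(new Event("b", null, false));
        queue.add(new Event("d", null, false));
        queue.add(new Event("c", "tx1", true));
        for (Event event : nextBatch(queue, 2)) {
            System.out.println(event.key);
        }
    }
}
```

In the usage above, event "c" is delivered ahead of "d" even though "d" was queued first. That is the kind of small reordering the concern refers to.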

Udo also raised some other, more general concerns; I do not know whether they
have been resolved.

I would appreciate some more feedback so that I can either go ahead with the
pull request, if the feedback is positive, or keep the discussion alive.

Thanks in advance,

Alberto G.



From: Barry Oglesby 
Sent: Thursday, March 26, 2020 7:34 PM
To: dev@geode.apache.org 
Subject: Re: RFC - Gateway sender to deliver transaction events atomically to 
receivers

I added some comments to the proposal. There are a few concerns, but I like the
idea in general.

Dan said: I remember someone trying to accomplish this same thing on top of
geode with TransactionListener that dumped into a separate region or something
like that.

I think both Charlie and I have implemented this idea a few times.

Here is the basic idea:

The data region defines a TransactionListener with an afterCommit that:

- creates a UnitOfWork object
- creates an Event for each CacheEvent in the TransactionEvent event that
contains:
  - regionName
  - operation
  - key
  - value
  - potentially other things like EventID, VersionTag, TXId
- puts the UnitOfWork into a transaction region that has a gateway sender
attached to it. It also has a CacheWriter attached to it.

On the remote site, the CacheWriter attached to the transaction region:

- begins a transaction
- iterates the UnitOfWork's Events and executes each one
- commits the transaction

There are definitely some caveats to this:

- There is a race condition between the commit in the data region and the
TransactionListener afterCommit invocation doing the put into the
transaction region. If the server crashes after the put into the data
region but before the afterCommit callback, there will be data loss. In
that case, the transaction in question will not have been stored in the
transaction region and not be sent to the remote site.
- Ideally, the data and transaction regions should be colocated, but that
is tricky.
- What happens if a transaction fails in the remote site?
- The transaction region has to be cleared periodically.
- Knowing when to process the transaction in the CacheWriter is a bit
tricky. It only needs to happen for transactions that originated remotely.
Adding distributed system id to the UnitOfWork is one way to address this.
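
The receiving side of the idea above might look roughly like the sketch below. It deliberately avoids the Geode API: plain maps stand in for regions, staging changes on copies stands in for begin/commit, and UnitOfWorkSketch, Event, Operation and apply are hypothetical names chosen for illustration. The distributed system id check follows the last caveat.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: in-memory maps stand in for regions, and staging on copies
// stands in for a real transaction. None of these types are Geode API.
final class UnitOfWorkSketch {
    enum Operation { PUT, DESTROY }

    static final class Event {
        final String regionName;
        final Operation operation;
        final String key;
        final String value; // ignored for DESTROY

        Event(String regionName, Operation operation, String key, String value) {
            this.regionName = regionName;
            this.operation = operation;
            this.key = key;
            this.value = value;
        }
    }

    static final class UnitOfWork {
        final int originDistributedSystemId;
        final List<Event> events = new ArrayList<>();

        UnitOfWork(int originDistributedSystemId) {
            this.originDistributedSystemId = originDistributedSystemId;
        }
    }

    // Applies all events of the unit or none of them. Returns false when the
    // unit originated on this site and therefore must not be replayed.
    static boolean apply(UnitOfWork unit, int localDistributedSystemId,
                         Map<String, Map<String, String>> regions) {
        if (unit.originDistributedSystemId == localDistributedSystemId) {
            return false;
        }
        // "Begin": stage every change on a copy of the affected region.
        Map<String, Map<String, String>> staged = new HashMap<>();
        for (Event event : unit.events) {
            Map<String, String> live = regions.get(event.regionName);
            if (live == null) {
                throw new IllegalStateException("Unknown region: " + event.regionName);
            }
            Map<String, String> copy =
                staged.computeIfAbsent(event.regionName, name -> new HashMap<>(live));
            if (event.operation == Operation.PUT) {
                copy.put(event.key, event.value);
            } else {
                copy.remove(event.key);
            }
        }
        // "Commit": swap in the staged copies only after every event succeeded.
        regions.putAll(staged);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> regions = new HashMap<>();
        regions.put("orders", new HashMap<>());
        UnitOfWork unit = new UnitOfWork(1);
        unit.events.add(new Event("orders", Operation.PUT, "k1", "v1"));
        System.out.println(apply(unit, 2, regions) + " " + regions.get("orders"));
    }
}
```

This sketch leaves open the question raised above of what should happen when the transaction fails on the remote site; here the exception simply propagates and nothing is applied.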

Thanks,
Barry Oglesby



On Thu, Mar 26, 2020 at 7:34 AM Jacob Barrett  wrote:

> Great idea. I called out some similar areas of concern and spitballed
> some solutions to get the conversations flowing.
>
> -Jake
>
>
> > On Mar 25, 2020, at 8:04 AM, Alberto Gomez wrote:
> >
> > Hi,
> >
> > Could you please review the RFC for "Gateway sender to deliver
> > transaction events atomically to receivers"?
> >
> > https://cwiki.apache.org/confluence/display/GEODE/Gw+sender+to+deliver+transaction+events+atomically+to+receivers
> >
> > Deadline for comments is Wednesday, April 1st, 2020,
> >
> > Thanks,
> >
> > Alberto G.
>
>


Re: [DISCUSS] Redundancy Gfsh Commands

2020-04-02 Thread Donal Evans
Re-sending this from the correct email address. I think the original got
eaten.


> From the RFC:
> > The command will return error status if:
> I assume this means ERROR or FAILURE (non-success) status. It seems a
> little confusing that there are both ERROR and FAILURE statuses. Maybe you
> could add another section, “the command will return failure status if:” to
> make it clear when it will be returning an error vs a failure. I think this
> will be important for users who are triggering the redundancy restore
> operation programmatically to understand the difference.


This is a little confusing, unfortunately. There are two separate statuses
involved. The Status of a RestoreRedundancyResults can be SUCCESS, FAILURE or
ERROR, and describes the outcome of a restore redundancy operation on one
member. A gfsh command, on the other hand, can only return a SUCCESS or ERROR
result status, which represents the outcome of the command across all members.
The two are separate things, although the command's result status will be
partially determined from the Status values of the RestoreRedundancyResults
objects returned by function execution on each member.

The differentiation between FAILURE and ERROR status for
RestoreRedundancyResults is relevant, since there should be different
behaviour for "we were unable to create a redundant copy for all buckets in
a region," which may be due to not enough members hosting that region or
some other relatively benign problem, and "we encountered an exception
while attempting to restore redundancy and were unable to complete the
operation," which could be indicative of a more serious issue.
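
As a rough illustration of the two levels of status, the sketch below combines per-member statuses into a single command status. The combining rule is an assumption made for the example (any member status other than SUCCESS makes the command return ERROR); the thread only says the command status is partially determined by the member statuses. MemberStatus, CommandStatus and combine are hypothetical names, not the RestoreRedundancyResults API.

```java
import java.util.List;

// Sketch only: the enums mirror the status values named in this thread, but
// the combining rule is an assumption, not the behaviour defined by the RFC.
final class CommandStatusSketch {
    enum MemberStatus { SUCCESS, FAILURE, ERROR }

    enum CommandStatus { SUCCESS, ERROR }

    // Assumed rule: the command succeeds only if every member succeeded.
    static CommandStatus combine(List<MemberStatus> memberStatuses) {
        for (MemberStatus status : memberStatuses) {
            if (status != MemberStatus.SUCCESS) {
                return CommandStatus.ERROR;
            }
        }
        return CommandStatus.SUCCESS;
    }

    public static void main(String[] args) {
        System.out.println(combine(List.of(MemberStatus.SUCCESS, MemberStatus.SUCCESS)));
        System.out.println(combine(List.of(MemberStatus.SUCCESS, MemberStatus.FAILURE)));
    }
}
```

Under this assumption a caller sees only SUCCESS or ERROR from gfsh, and would need the per-member results to tell a benign FAILURE from an exception-driven ERROR.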

I hope this clarifies things somewhat.



Re: [DISCUSS] Redundancy Gfsh Commands

2020-04-02 Thread Aaron Lindsey
Yes, thanks for clarifying.




Passed: apache/geode-native#2364 (moleske-patch-2 - df791c5)

2020-04-02 Thread Travis CI
Build Update for apache/geode-native
-

Build: #2364
Status: Passed

Duration: 1 hr, 17 mins, and 25 secs
Commit: df791c5 (moleske-patch-2)
Author: M. Oleske
Message: Fix broken Geode Image Link

View the changeset: https://github.com/apache/geode-native/commit/df791c5bfd08

View the full build log and details: 
https://travis-ci.org/github/apache/geode-native/builds/670248054?utm_medium=notification&utm_source=email
