Re: [PROPOSAL]: configure pdx command mandatory flags

2018-03-08 Thread Juan José Ramos
Hello Dan,

Thanks, you're right... instead of making the parameters mandatory, the
command should be changed to generate valid XML to persist in the cluster
configuration service. I'll update GEODE-4794 and work on the fix, thanks
again.
Best regards.


On Wed, Mar 7, 2018 at 7:44 PM, Dan Smith  wrote:

> I think the configure pdx command also lets you configure persistence or
> read-serialized for pdx, right? So some people might not be using either of
> those auto-serializable-classes options.
>
> -Dan
>
> On Wed, Mar 7, 2018 at 4:13 AM, Ju@N  wrote:
>
> > Hello all,
> >
> > While working on a fix for GEODE-4771
> >  I've come across a
> > non-reported bug: the configure pdx command fails when no parameters are
> > specified, even though we state in the User Guide
> >  > command-pages/configure.html>
> > that no parameters are mandatory to execute the command. The source code
> > doesn't enforce any of the parameters and, as such, the resulting XmlEntity
> > ends up being empty, so a NullPointerException is thrown when the command
> > tries to persist the changes to the cluster configuration service:
> >
> > [error 2018/03/07 11:07:48.242 GMT locator1  > Connection(2)-127.0.0.1> tid=0x55] error updating cluster configuration for group cluster
> > java.lang.NullPointerException
> > at java.io.StringReader.<init>(StringReader.java:50)
> > at org.apache.geode.management.internal.configuration.utils.XmlUtils.createNode(XmlUtils.java:242)
> > at org.apache.geode.management.internal.configuration.utils.XmlUtils.addNewNode(XmlUtils.java:133)
> > at org.apache.geode.distributed.internal.ClusterConfigurationService.addXmlEntity(ClusterConfigurationService.java:204)
> > at org.apache.geode.management.internal.cli.commands.ConfigurePDXCommand.lambda$configurePDX$0(ConfigurePDXCommand.java:131)
> > at org.apache.geode.management.internal.cli.commands.GfshCommand.persistClusterConfiguration(GfshCommand.java:72)
> > at org.apache.geode.management.internal.cli.commands.ConfigurePDXCommand.configurePDX(ConfigurePDXCommand.java:130)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:498)
> > at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:216)
> > at org.apache.geode.management.internal.cli.remote.CommandExecutor.invokeCommand(CommandExecutor.java:97)
> > at org.apache.geode.management.internal.cli.remote.CommandExecutor.execute(CommandExecutor.java:45)
> > at org.apache.geode.management.internal.cli.remote.CommandExecutor.execute(CommandExecutor.java:39)
> > at org.apache.geode.management.internal.cli.remote.OnlineCommandProcessor.executeCommand(OnlineCommandProcessor.java:133)
> > at org.apache.geode.management.internal.beans.MemberMBeanBridge.processCommand(MemberMBeanBridge.java:1579)
> > at org.apache.geode.management.internal.beans.MemberMBean.processCommand(MemberMBean.java:412)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:498)
> > at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
> > at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:498)
> > at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
> > at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:193)
> > at com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:175)
> > at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:117)
> > at com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:54)
> > at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
> > at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
> > at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
> > at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
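
The NullPointerException in the trace comes from handing a null XML definition to `new StringReader(...)`. As a rough illustration of the fix direction Juan describes (never persisting an empty entity), here is a hypothetical sketch; the `XmlEntity` stand-in and the `shouldPersist` helper are invented names for illustration, not Geode's actual internals:

```java
// Hypothetical sketch of a guard before persisting to the cluster
// configuration service. XmlEntity here is a minimal stand-in, not the
// real org.apache.geode class.
public class XmlEntityGuard {
    static class XmlEntity {
        private final String xml;
        XmlEntity(String xml) { this.xml = xml; }
        String getXmlDefinition() { return xml; }
    }

    // Persist only when the command actually produced XML; a null
    // definition is what feeds new StringReader(null) and triggers the NPE.
    static boolean shouldPersist(XmlEntity entity) {
        return entity != null
                && entity.getXmlDefinition() != null
                && !entity.getXmlDefinition().trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(shouldPersist(new XmlEntity(null)));  // false
        System.out.println(shouldPersist(new XmlEntity("<pdx read-serialized=\"true\"/>")));  // true
    }
}
```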

Geode unit tests completed in 'develop/DistributedTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/185



Geode unit tests completed in 'develop/FlakyTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/FlakyTest/builds/292



[PROPOSAL]: concurrent bucket moves during rebalance

2018-03-08 Thread Nick Reich
Team,

The time required to rebalance a Geode cluster has often been noted by
users as an area for improvement. Currently, buckets are moved one at a
time, and we propose that a system that moves buckets in parallel could
greatly improve the performance of this feature.

Previously, parallelization was implemented for adding redundant copies of
buckets to restore redundancy. However, moving buckets is a more
complicated matter and requires a different approach, because members could
potentially be gaining and giving away buckets at the same time. While
giving away a bucket, a member still holds all of the data for that bucket
until the receiving member has fully received it and it can safely be
removed from the original owner. This means that unless the member has the
memory overhead to store all of the buckets it will receive in addition to
all the buckets it started with, moving buckets in parallel could cause the
member to run out of memory.

For this reason, we propose a system that performs (potentially) several
rounds of concurrent bucket moves:
1) Calculate a set of moves that improves balance and meets the requirement
that no member both receives and gives away a bucket (so no member carries
the memory overhead of both a new bucket and an existing bucket it is
ultimately giving away).
2) Conduct all calculated bucket moves in parallel. Parameters to throttle
this process (to prevent taking too many cluster resources and impacting
performance) should be added, such as only allowing each member to either
receive or send a maximum number of buckets concurrently.
3) If the cluster is not yet balanced, perform additional iterations of
calculating and conducting bucket moves, until balance is achieved or a
configurable maximum number of iterations is reached.
Note: in both the existing and proposed systems, regions are rebalanced one
at a time.

Please let us know if you have feedback on this approach or additional
ideas that should be considered.
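
The steps above can be sketched roughly as follows. This is a toy model for illustration only (the member names and the `planRound` helper are invented, and real bucket sizes, redundancy zones, and load metrics are ignored); it demonstrates step 1's constraint that no member acts as both a sender and a receiver within a single round:

```java
import java.util.*;

// Hypothetical sketch of one planning round: greedily pick bucket moves
// from over-loaded to under-loaded members, never letting a member be both
// a sender and a receiver in the same round.
public class MovePlanner {
    record Move(String from, String to) {}

    static List<Move> planRound(Map<String, Integer> bucketCounts) {
        double avg = bucketCounts.values().stream().mapToInt(i -> i).average().orElse(0);
        Set<String> senders = new HashSet<>();
        Set<String> receivers = new HashSet<>();
        List<Move> moves = new ArrayList<>();
        Map<String, Integer> counts = new HashMap<>(bucketCounts);
        boolean progress = true;
        while (progress) {
            progress = false;
            String from = null, to = null;
            for (var e : counts.entrySet()) {
                // A sender must be above average and must not already be a receiver;
                // a receiver must be below average and must not already be a sender.
                if (e.getValue() > Math.ceil(avg) && !receivers.contains(e.getKey())) from = e.getKey();
                if (e.getValue() < Math.floor(avg) && !senders.contains(e.getKey())) to = e.getKey();
            }
            if (from != null && to != null && !from.equals(to)) {
                moves.add(new Move(from, to));
                counts.merge(from, -1, Integer::sum);
                counts.merge(to, 1, Integer::sum);
                senders.add(from);
                receivers.add(to);
                progress = true;
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        // One member heavily loaded, two light members: all moves flow outward.
        System.out.println(planRound(Map.of("m1", 6, "m2", 2, "m3", 1)));
    }
}
```

A real planner would weigh bucket byte sizes and redundancy placement rather than raw counts, but the no-sender-is-also-a-receiver invariant above is the property that bounds per-member memory overhead during a round.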


Re: [PROPOSAL]: concurrent bucket moves during rebalance

2018-03-08 Thread Michael Stolz
We should be very careful about how much resource we dedicate to
rebalancing.

One of our competitors rebalances *much* faster than we do, but in doing so
they consume all available resources.

At one bank, that caused significant loss of incoming market data arriving
on a multicast feed, which had a severe adverse effect on the pricing and
risk management functions for a period of time. That bank removed the
competitor's product, and for several years, until the chief architect left
and a new one was named, no distributed caching products were allowed. When
they DID go back to using distributed caching, it pre-dated Geode, so they
used GemFire, largely because GemFire does not consume all available
resources while rebalancing.

I do think we need to improve our rebalancing such that it iterates until
it achieves balance, but not in a way that will consume all available
resources.

--
Mike Stolz


On Thu, Mar 8, 2018 at 2:25 PM, Nick Reich  wrote:



Re: [PROPOSAL]: concurrent bucket moves during rebalance

2018-03-08 Thread Nick Reich
Mike,

I think having a good default value for the maximum number of parallel
operations will play a role in not consuming too many resources. Perhaps
defaulting to a single parallel action (or another small number based on
testing) per member at a time, and allowing users who want better
performance to increase that number, would be a good start. That should
yield performance improvements without placing an increased burden on any
specific member. Especially when bootstrapping new members, rebalance speed
may be more valuable than usual, so making this configurable per rebalance
action would be preferred.

One clarification from my original proposal: regions can already be
rebalanced in parallel, depending on the value of resource.manager.threads
(which defaults to 1, so no parallelization of regions in the default
case).
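
The per-member limit discussed here could take the shape of a small permit pool. This is a hypothetical sketch (MemberThrottle is an invented name, not an existing Geode class): a bucket move must obtain a permit on a member before running, and the permit count is the knob a user could raise for faster rebalancing:

```java
import java.util.concurrent.Semaphore;

// Hypothetical per-member throttle: a bucket move acquires a permit on the
// member before starting and releases it when done. With 1 permit (a
// conservative default), a member handles at most one move at a time.
public class MemberThrottle {
    private final Semaphore permits;

    MemberThrottle(int maxConcurrentMoves) {
        this.permits = new Semaphore(maxConcurrentMoves);
    }

    boolean tryStartMove() { return permits.tryAcquire(); }

    void finishMove() { permits.release(); }

    public static void main(String[] args) {
        MemberThrottle member = new MemberThrottle(1);
        System.out.println(member.tryStartMove()); // true: first move admitted
        System.out.println(member.tryStartMove()); // false: throttled
        member.finishMove();
        System.out.println(member.tryStartMove()); // true: permit freed
    }
}
```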

On Thu, Mar 8, 2018 at 11:46 AM, Michael Stolz  wrote:



Geode unit tests completed in 'develop/AcceptanceTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/AcceptanceTest/builds/370



Geode unit tests completed in 'develop/FlakyTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/FlakyTest/builds/294



[Spring CI] Spring Data GemFire > Nightly-ApacheGeode > #850 was SUCCESSFUL (with 2378 tests)

2018-03-08 Thread Spring CI

---
Spring Data GemFire > Nightly-ApacheGeode > #850 was successful.
---
Scheduled
2380 tests in total.

https://build.spring.io/browse/SGF-NAG-850/





--
This message is automatically generated by Atlassian Bamboo

Geode unit tests completed in 'develop/AcceptanceTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/AcceptanceTest/builds/371



Geode unit tests completed in 'develop/AcceptanceTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/AcceptanceTest/builds/372



Geode unit tests completed in 'develop/AcceptanceTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/AcceptanceTest/builds/374



Geode unit tests completed in 'develop/FlakyTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/FlakyTest/builds/295



Geode unit tests completed in 'develop/DistributedTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/187



Re: FunctionService Proposal

2018-03-08 Thread Galen O'Sullivan
+1 for making the function service not static, and splitting servers from
clients.

Also +1 for Dan's suggestion.

On Wed, Mar 7, 2018 at 2:51 PM, Patrick Rhomberg 
wrote:

> I did not know that!  And then, yes, onRegion is much better.
>
> On Wed, Mar 7, 2018 at 2:43 PM, Dan Smith  wrote:
>
> > > If we're not opposed to descriptive verbosity, I might prefer
> > "onServersHostingRegion" more than "onRegion".
> >
> > onRegion does not really mean "on the servers hosting region XXX". It
> > only executes on a subset of the servers, potentially with retries, until
> > it has covered the entire dataset once. So I think onRegion is more
> > appropriate.
> >
> > -Dan
> >
> > On Wed, Mar 7, 2018 at 2:38 PM, Patrick Rhomberg 
> > wrote:
> >
> > > +1 for iteration towards better single responsibility design and more
> > > easily-digestible classes.
> > >
> > > Regarding method names, I think that there would be some good utility
> > > in having "onGroup" methods as well.
> > > If we're not opposed to descriptive verbosity, I might prefer
> > > "onServersHostingRegion" more than "onRegion".
> > >
> > > On Wed, Mar 7, 2018 at 1:45 PM, Dan Smith  wrote:
> > >
> > > > Hi Udo,
> > > >
> > > > +1 for making the function service not static and splitting it into
> > > > client and server FunctionService objects!
> > > >
> > > > We do have Cache and ClientCache right now. So I would recommend this
> > > > API rather than putting two methods on Cache. Cache is already the
> > > > server-side API.
> > > >
> > > > Cache {
> > > >   ServerFunctionService getFunctionService()
> > > > }
> > > >
> > > > ClientCache {
> > > >   ClientFunctionService getFunctionService()
> > > > }
> > > >
> > > > If at some point we split the client-side API into a separate jar,
> > > > the API shouldn't need to change. If you don't like
> > > > ClientFunctionService, maybe o.a.g.cache.client.execute.FunctionService?
> > > > We would never want two different versions of
> > > > org.apache.geode.function.FunctionService. People wouldn't be able to
> > > > test a client and server in the same JVM.
> > > >
> > > > Also, you might want to check out this (somewhat stalled) proposal -
> > > > https://cwiki.apache.org/confluence/display/GEODE/Function+Service+Usability+Improvements
> > > > We had buy-in on this on the dev list but have not found cycles to
> > > > actually do it yet. But maybe now is the time?
> > > >
> > > > -Dan
> > > >
> > > > On Wed, Mar 7, 2018 at 11:18 AM, Udo Kohlmeyer 
> wrote:
> > > >
> > > > > Hi there Apache Dev's,
> > > > >
> > > > > Please look at the proposal to improve the FunctionService and
> remove
> > > the
> > > > > static invocation of it from within the Cache.
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/GEODE/Function+Service+Refactor+-+Removal+of+static-ness+and+splitting+of+client+and+server-side+FunctionService
> > > > >
> > > > > --Udo
> > > > >
> > > > >
> > > >
> > >
> >
>
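
To make the shape of Dan's suggestion concrete, here is a hypothetical sketch of the split; every type and method name below is illustrative only, and none of them exist in Geode under these names:

```java
// Hypothetical sketch of a non-static, client/server-split FunctionService.
public class FunctionServiceSketch {
    interface Execution {
        void execute(Runnable fn);
    }

    // Server-side entry point: only reachable from a (server) Cache.
    interface ServerFunctionService {
        Execution onMember(String memberName);
    }

    // Client-side entry point: only reachable from a ClientCache, so a
    // client cannot accidentally target server-only executions.
    interface ClientFunctionService {
        Execution onRegion(String regionName);
    }

    public static void main(String[] args) {
        // A trivial in-JVM implementation that just runs the function.
        ClientFunctionService client = region -> fn -> fn.run();
        client.onRegion("orders").execute(() -> System.out.println("executed")); // prints "executed"
    }
}
```

Because the two entry points are distinct types rather than one static class, a test could hold both a client and a server function service in the same JVM, which is exactly the concern Dan raises about a shared org.apache.geode.function.FunctionService.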


Geode unit tests completed in 'develop/IntegrationTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/IntegrationTest/builds/264



Geode unit tests completed in 'develop/AcceptanceTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/AcceptanceTest/builds/375



Geode unit tests completed in 'develop/IntegrationTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/IntegrationTest/builds/265



Geode unit tests completed in 'develop/DistributedTest' with non-zero exit code

2018-03-08 Thread apachegeodeci
Pipeline results can be found at:

Concourse: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/188



[Spring CI] Spring Data GemFire > Nightly-ApacheGeode > #840 was SUCCESSFUL (with 2331 tests)

2018-03-08 Thread Spring CI

---
Spring Data GemFire > Nightly-ApacheGeode > #840 was successful (rerun once).
---
This build was rerun by John Blum.
2333 tests in total.

https://build.spring.io/browse/SGF-NAG-840/





--
This message is automatically generated by Atlassian Bamboo