RE: Monitor the neighbour JVM using neighbour's member-timeout

2017-11-03 Thread Aravind Musigumpula
Thanks, Bruce, for the suggestions. I will move the new variables from 
InternalDistributedMember to NetView and make the changes needed for backward 
compatibility.

I now understand that there is another way a member can be removed from the 
view: if a member sends a message and waits for ack-wait-threshold without 
getting a response from the target, the sender will do a final check and remove 
the target from the view if there is still no response. 
But I don't understand how deprecating member-timeout, ack-wait-threshold and 
ack-severe-alert-threshold in favour of a single setting will solve the problem. 
The main problem is that we want one member to survive in the view for a longer 
time than the others.

If we merge the settings into one and pass that setting to the monitoring 
member (say A), then A will use the timeout of the target member (say B, which 
we want to survive in the view for a longer time) for health monitoring, and 
the ack-wait-threshold to wait for the response to any message before doing the 
final check.
But what if some other member (say C), which is monitoring another member (say 
D), holds smaller values for member-timeout and ack-wait-threshold? If member C 
sends a message to B, C uses the smaller ack-wait-threshold (the one belonging 
to member D) while waiting for a response, and then does the final check on the 
basis of the smaller member-timeout. So member B can still be kicked out of the 
view in a short amount of time.

I think this can be solved simply by using the member-timeout of the suspected 
member in the final check, where we establish a TCP connection. We would not 
need to combine those three settings either. We can set the member-timeout of a 
particular member to a higher value; the member that monitors it keeps using 
its own member-timeout as it does now, but during the final check it uses the 
suspected member's member-timeout (which is the greater value). The final check 
is the common step in both the no-heartbeat scenario and the 
no-response-to-a-message scenario.
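
To make the idea concrete, here is a minimal sketch in Java of how the final 
check could pick its timeout. The class FinalCheckTimeout, the method 
forSuspect and the map of known timeouts are hypothetical names used only for 
illustration; they are not existing Geode code. Falling back to the local 
member-timeout when the suspect's value is unknown is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not Geode code): picks the timeout for the final check. */
final class FinalCheckTimeout {
  private FinalCheckTimeout() {}

  /**
   * Returns the suspected member's own member-timeout when it is known,
   * otherwise the local member-timeout of the member doing the check.
   */
  static long forSuspect(String suspectId, Map<String, Long> knownTimeoutsMillis,
      long localTimeoutMillis) {
    Long suspectTimeout = knownTimeoutsMillis.get(suspectId);
    return suspectTimeout != null ? suspectTimeout : localTimeoutMillis;
  }

  public static void main(String[] args) {
    // Illustrative values only: B should survive longer than the others.
    Map<String, Long> timeouts = new HashMap<>();
    timeouts.put("B", 30_000L);
    timeouts.put("D", 5_000L);

    long local = 5_000L; // member-timeout of the member doing the final check
    System.out.println(forSuspect("B", timeouts, local)); // B's own, larger value
    System.out.println(forSuspect("X", timeouts, local)); // unknown member: local value
  }
}
```

In this sketch the monitoring member still uses its own member-timeout for 
heartbeats; only the final check consults the suspect's value.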

Are there any concerns about this new proposal?


Thanks,
Aravind Musigumpula 

-----Original Message-----
From: Bruce Schuchardt [mailto:bschucha...@pivotal.io]
Sent: Thursday, September 07, 2017 10:42 PM
To: dev@geode.apache.org
Subject: Re: Monitor the neighbour JVM using neighbour's member-timeout

I think this might be an acceptable change though I doubt many people would 
find it useful.

It's already possible to set different member-timeouts on each node of the 
distributed system but the meaning of the setting is the inverse of what's 
proposed here, so having the current setting be different in each node is 
pretty useless.

I think the initiation of suspect processing ought to be addressed if we make 
this change.  The ack-wait-threshold and ack-severe-alert-threshold aren't 
based on the member-timeout but ought to be.  This would make it possible to 
initiate suspect processing with different timing for different nodes.  It 
would still leave the question of slow backup operations hanging:  If you're 
waiting for one node that's blocked waiting for a response from another node 
(say a node holding a backup bucket) you are going to initiate suspect 
processing on the node you're waiting on and not those other (backup) nodes.

Rolling upgrade will also be a problem since old members aren't going to cough 
up their member-timeout settings.  What should be used as a membership timeout 
for the old members during an upgrade?

If we proceed with this idea I'd prefer that we deprecate member-timeout, 
ack-wait-threshold and ack-severe-alert-threshold and have new settings with 
the "ack" settings being multiples of the new membership timeout setting.
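
A minimal sketch of what such derived settings could look like, in Java. The 
class DerivedAckSettings, its field names and the multiplier values are 
hypothetical and chosen only for illustration; no such settings exist in Geode 
today.

```java
/** Hypothetical holder (not Geode code): ack thresholds as multiples of one timeout. */
final class DerivedAckSettings {
  final long membershipTimeoutMillis;
  final long ackWaitMillis;
  final long ackSevereAlertMillis;

  DerivedAckSettings(long membershipTimeoutMillis, int ackWaitMultiple,
      int ackSevereAlertMultiple) {
    if (membershipTimeoutMillis <= 0 || ackWaitMultiple <= 0
        || ackSevereAlertMultiple < 0) {
      throw new IllegalArgumentException("timeout and multiples must be positive");
    }
    this.membershipTimeoutMillis = membershipTimeoutMillis;
    // multiplyExact throws ArithmeticException on overflow instead of wrapping.
    this.ackWaitMillis =
        Math.multiplyExact(membershipTimeoutMillis, (long) ackWaitMultiple);
    this.ackSevereAlertMillis =
        Math.multiplyExact(membershipTimeoutMillis, (long) ackSevereAlertMultiple);
  }

  public static void main(String[] args) {
    // Illustrative values: a slow node gets a larger base timeout, and its
    // ack thresholds scale with it.
    DerivedAckSettings normal = new DerivedAckSettings(5_000L, 3, 4);
    DerivedAckSettings slow = new DerivedAckSettings(30_000L, 3, 4);
    System.out.println(normal.ackWaitMillis + " " + normal.ackSevereAlertMillis);
    System.out.println(slow.ackWaitMillis + " " + slow.ackSevereAlertMillis);
  }
}
```

Because both thresholds are computed from the one base value, raising the 
membership timeout for a node also delays the start of suspect processing for 
that node.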

Concerning the PR, it isn't acceptable in its current form. 
InternalDistributedMember identifiers are often transmitted in messages and 
increasing their size affects performance.  Any new member attributes need to 
be added to NetView instead of InternalDistributedMember.


On 8/22/17 12:35 AM, Aravind Musigumpula wrote:
> Hi Team,
>
> We have a requirement to configure a different member-timeout for different 
> members, as we need some members to survive in the view for a longer time 
> than the other members before being kicked out of the view in case they 
> aren't responding.
>
>
> 1.   With the current monitoring system it is not possible to 
> determine when a member will be kicked out of the view if we configure 
> different member-timeouts for some members.
>
> 2.   This is because, if a member is not responding to heartbeat requests, 
> the member that is monitoring the non-responding member will initiate a 
> check-member request.
>
> 3.   In this check-member request the monitoring member pings the 
> non-responding member and waits for the monitoring member's member-timeout 
> for a response.
>
> 4.   If there is still no response, it will send a final suspect 
> request to the coordinator, where the coordinator does the final check, 
> waiting for the coordinator's member-timeout.
>
> 5.   If coordinator did not g

':geode-client-protocol' could not be found in project ':geode-assembly'

2017-11-03 Thread Kirk Lund
Anyone know why the build is failing in geode-assembly?

* What went wrong:
A problem occurred evaluating project ':geode-assembly'.
> Project with path ':geode-client-protocol' could not be found in project
':geode-assembly'.


Re: ':geode-client-protocol' could not be found in project ':geode-assembly'

2017-11-03 Thread Galen O'Sullivan
It may be that we missed one of the places where a module gets linked in.
Where did you see this error?

On Fri, Nov 3, 2017 at 9:20 AM, Kirk Lund  wrote:

> Anyone know why the build is failing in geode-assembly?
>
> * What went wrong:
> A problem occurred evaluating project ':geode-assembly'.
> > Project with path ':geode-client-protocol' could not be found in project
> ':geode-assembly'.
>


Re: ':geode-client-protocol' could not be found in project ':geode-assembly'

2017-11-03 Thread Kirk Lund
Precheckin last night failed to compile.

On Fri, Nov 3, 2017 at 9:22 AM, Galen O'Sullivan 
wrote:

> It may be that we missed one of the places where a module gets linked in.
> Where did you see this error?


Re: ':geode-client-protocol' could not be found in project ':geode-assembly'

2017-11-03 Thread Galen O'Sullivan
Can you show me the particular error? Does it persist? As far as I can
remember, we haven't changed that module this week.

On Fri, Nov 3, 2017 at 9:24 AM, Kirk Lund  wrote:

> Precheckin last night failed to compile.


Re: ':geode-client-protocol' could not be found in project ':geode-assembly'

2017-11-03 Thread Anthony Baker
What is the difference between these two submodules:

geode-protobuf
geode-client-protocol

?

Thanks,
Anthony

> On Nov 3, 2017, at 9:22 AM, Galen O'Sullivan  wrote:
> 
> It may be that we missed one of the places where a module gets linked in.
> Where did you see this error?



[DISCUSS] Maximum duration that a class may contain a @Flaky

2017-11-03 Thread Patrick Rhomberg
Hello, all!

  I was considering doing some git archeology centered around identifying
how long any given test class containing a @Flaky has had that
annotation.  Ultimately, I think it would be good to add a test that would
fail when any one test has been flaky for too long.  I feel like many of
our flaky tests have fallen by the wayside, and this could provide the
impetus to resolve these issues in a timely fashion.
  This leads naturally to the question: How long should a test be allowed
to remain marked Flaky?  Certainly, flaky tests are most often of the
non-deterministic, hard-to-reproduce variety, so some leeway is deserved.
Two weeks?  One month?
  Thoughts?
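
  A rough sketch of the age check itself, in Java, assuming the date on which
the annotation was added has already been recovered from git history by some
other means. The class FlakyAgeCheck, its method names and the sample dates
and limits are all hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper: decides whether a @Flaky annotation has outlived its limit. */
final class FlakyAgeCheck {
  private FlakyAgeCheck() {}

  /** Whole days between the day the annotation was added and today. */
  static long daysFlaky(LocalDate annotatedOn, LocalDate today) {
    return ChronoUnit.DAYS.between(annotatedOn, today);
  }

  /** True when the annotation has been present for more than maxDays days. */
  static boolean isOverdue(LocalDate annotatedOn, LocalDate today, long maxDays) {
    return daysFlaky(annotatedOn, today) > maxDays;
  }

  public static void main(String[] args) {
    // Sample dates only; a real check would take annotatedOn from git history.
    LocalDate today = LocalDate.of(2017, 11, 3);
    System.out.println(isOverdue(LocalDate.of(2017, 9, 1), today, 30));
    System.out.println(isOverdue(LocalDate.of(2017, 10, 27), today, 14));
  }
}
```

  A test built on this could collect every overdue class and fail with the
list, with maxDays set to whatever limit we agree on.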

Imagination is Change.
~Patrick Rhomberg


Re: ':geode-client-protocol' could not be found in project ':geode-assembly'

2017-11-03 Thread Udo Kohlmeyer
Whilst working on the protobuf protocol, we found that there was a
common set of "client protocol" code that was not protobuf-specific.
Splitting out these common pieces of code now allows for a cleaner, more
targeted protobuf-specific (serialization) implementation.
It also provides a platform on which the community can implement their own
client serialization mechanism (e.g. MessagePack, Avro) with minimal
effort and code invasiveness.

geode-protobuf is the protobuf-specific implementation of the client
protocol.

--Udo


On Fri, Nov 3, 2017 at 10:20 AM, Anthony Baker  wrote:

> What is the difference between these two submodules:
>
> geode-protobuf
> geode-client-protocol
>
> ?
>
> Thanks,
> Anthony


-- 
Kindest Regards
-
*Udo Kohlmeyer* | *Pivotal*
ukohlme...@pivotal.io

www.pivotal.io


Re: [DISCUSS] Maximum duration that a class may contain a @Flaky

2017-11-03 Thread Nabarun Nag
I think the majority of the flaky test tags were added in one shot, in a
single commit, so the timer would expire on all of those tests at once.
Also, we have stopped marking things flaky: if something fails in CI, we
immediately try to fix it. If there is an element of flakiness in the test,
the test is modified right away. And we are also slowly cleaning up the
existing flaky tests.

Regards
Naba






On Fri, Nov 3, 2017 at 10:33 AM Patrick Rhomberg 
wrote:

> Hello, all!
>
>   I was considering doing some git archeology centered around identifying
> how long any given test class containing a @Flaky has had that
> annotation.  Ultimately, I think it would be good to add a test that would
> fail when any one test has been flaky for too long.  I feel like many of
> our flaky tests have fallen by the wayside, and this could provide the
> impetus to resolve these issues in a timely fashion.
>   This leads naturally to the question: How long should a test be allowed
> to remain marked Flaky?  Certainly, flaky tests are most often of the
> non-deterministic, hard-to-reproduce variety, so some leeway is deserved.
> Two weeks?  One month?
>   Thoughts?
>
> Imagination is Change.
> ~Patrick Rhomberg
>


Build failed in Jenkins: Geode-nightly #1002

2017-11-03 Thread Apache Jenkins Server
See 


Changes:

[github] GEODE-3778: mark tests flaky (#1004)

[github] GEODE-3936: remove ThreadUtil (#998)

[jdeppe] Add files necessary for Concourse CI infrastructure. (#1006)

[jdeppe] Fix up branches. GEODE-3942 (#1008)

[kohlmu-pivotal] GEODE-3637: Moved client queue initialization into the

[kohlmu-pivotal] GEODE-3637: Amended AcceptorImpl.java to use a Connection pool 
that

[gosullivan] GEODE-3895: Add Handshake/Message version byte (#1001)

[metatype] Fix version number in email subject.

[gosullivan] GEODE-3895: fixup: Add exceptions to excludedClasses.txt

[github] GEODE-3941: Pulse issues when SecurityManager is enabled (#1007)

--
[...truncated 143.03 KB...]
:geode-core:distributedTest
:geode-core:integrationTest
:geode-cq:assemble
:geode-cq:compileTestJavaNote: Some input files use or override a deprecated 
API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

:geode-cq:processTestResources
:geode-cq:testClasses
:geode-cq:checkMissedTests
:geode-cq:spotlessJavaCheck
:geode-cq:spotlessCheck
:geode-cq:test
:geode-cq:check
:geode-cq:build
:geode-cq:distributedTest
:geode-cq:integrationTest
:geode-json:assemble
:geode-json:compileTestJava NO-SOURCE
:geode-json:processTestResources
:geode-json:testClasses
:geode-json:checkMissedTests NO-SOURCE
:geode-json:spotlessJavaCheck
:geode-json:spotlessCheck
:geode-json:test NO-SOURCE
:geode-json:check
:geode-json:build
:geode-json:distributedTest NO-SOURCE
:geode-json:integrationTest NO-SOURCE
:geode-junit:javadoc
:geode-junit:javadocJar
:geode-junit:sourcesJar
:geode-junit:signArchives SKIPPED
:geode-junit:assemble
:geode-junit:compileTestJavaNote: 

 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: 

 uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

:geode-junit:processTestResources
:geode-junit:testClasses
:geode-junit:checkMissedTests
:geode-junit:spotlessJavaCheck
:geode-junit:spotlessCheck
:geode-junit:test
:geode-junit:check
:geode-junit:build
:geode-junit:distributedTest
:geode-junit:integrationTest
:geode-lucene:assemble
:geode-lucene:compileTestJavaNote: Some input files use or override a 
deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

:geode-lucene:processTestResources
:geode-lucene:testClasses
:geode-lucene:checkMissedTests
:geode-lucene:spotlessJavaCheck
:geode-lucene:spotlessCheck
:geode-lucene:test
:geode-lucene:check
:geode-lucene:build
:geode-lucene:distributedTest
:geode-lucene:integrationTest
:geode-old-client-support:assemble
:geode-old-client-support:compileTestJavaNote: 

 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

:geode-old-client-support:processTestResources NO-SOURCE
:geode-old-client-support:testClasses
:geode-old-client-support:checkMissedTests
:geode-old-client-support:spotlessJavaCheck
:geode-old-client-support:spotlessCheck
:geode-old-client-support:test
:geode-old-client-support:check
:geode-old-client-support:build
:geode-old-client-support:distributedTest
:geode-old-client-support:integrationTest
:geode-old-versions:distributedTest NO-SOURCE
:geode-old-versions:integrationTest NO-SOURCE
:geode-protobuf:assemble
:geode-protobuf:extractIncludeTestProto
:geode-protobuf:extractTestProto UP-TO-DATE
:geode-protobuf:generateTestProto NO-SOURCE
:geode-protobuf:compileTestJavaNote: Some input files use or override a 
deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

:geode-protobuf:processTestResources
:geode-protobuf:testClasses
:geode-protobuf:checkMissedTests
:geode-protobuf:spotlessJavaCheck
:geode-protobuf:spotlessCheck
:geode-protobuf:test
:geode-protobuf:check
:geode-protobuf:build
:geode-protobuf:distributedTest
:geode-protobuf:integrationTest
:geode-pulse:assemble
:geode-pulse:compileTestJavaNote: Some input files use or override a deprecated 
API.
Note: Recompile with -Xlint:deprecation for details.
Note: 

 uses unchecked or unsafe operations.
Note: Recompil

[Spring CI] Spring Data GemFire > Nightly-ApacheGeode > #729 was SUCCESSFUL (with 2187 tests)

2017-11-03 Thread Spring CI

---
Spring Data GemFire > Nightly-ApacheGeode > #729 was successful.
---
Scheduled
2189 tests in total.

https://build.spring.io/browse/SGF-NAG-729/





--
This message is automatically generated by Atlassian Bamboo

Build failed in Jenkins: Geode-nightly-flaky #166

2017-11-03 Thread Apache Jenkins Server
See 


Changes:

[github] GEODE-3778: mark tests flaky (#1004)

[github] GEODE-3936: remove ThreadUtil (#998)

[jdeppe] Add files necessary for Concourse CI infrastructure. (#1006)

[jdeppe] Fix up branches. GEODE-3942 (#1008)

[kohlmu-pivotal] GEODE-3637: Moved client queue initialization into the

[kohlmu-pivotal] GEODE-3637: Amended AcceptorImpl.java to use a Connection pool 
that

[gosullivan] GEODE-3895: Add Handshake/Message version byte (#1001)

[metatype] Fix version number in email subject.

[gosullivan] GEODE-3895: fixup: Add exceptions to excludedClasses.txt

[github] GEODE-3941: Pulse issues when SecurityManager is enabled (#1007)

[github] GEODE-3947: add the necessary dependency in geode-dependency.jar 
(#1010)

[github] GEODE-3870: clean up region entry classes (#989)

[dbarnes] User Guide: fixed typo in the ‘configuring’ section

--
[...truncated 111.46 KB...]
Download 
https://repo1.maven.org/maven2/com/fasterxml/jackson/module/jackson-module-scala_2.10/2.8.6/jackson-module-scala_2.10-2.8.6.jar
Download 
https://repo1.maven.org/maven2/io/springfox/springfox-swagger2/2.6.1/springfox-swagger2-2.6.1.jar
Download 
https://repo1.maven.org/maven2/io/springfox/springfox-swagger-ui/2.6.1/springfox-swagger-ui-2.6.1.jar
Download 
https://repo1.maven.org/maven2/org/springframework/hateoas/spring-hateoas/0.23.0.RELEASE/spring-hateoas-0.23.0.RELEASE.jar
Download 
https://repo1.maven.org/maven2/com/fasterxml/jackson/module/jackson-module-paranamer/2.8.6/jackson-module-paranamer-2.8.6.jar
Download 
https://repo1.maven.org/maven2/io/swagger/swagger-annotations/1.5.10/swagger-annotations-1.5.10.jar
Download 
https://repo1.maven.org/maven2/io/swagger/swagger-models/1.5.10/swagger-models-1.5.10.jar
Download 
https://repo1.maven.org/maven2/io/springfox/springfox-spi/2.6.1/springfox-spi-2.6.1.jar
Download 
https://repo1.maven.org/maven2/io/springfox/springfox-schema/2.6.1/springfox-schema-2.6.1.jar
Download 
https://repo1.maven.org/maven2/io/springfox/springfox-swagger-common/2.6.1/springfox-swagger-common-2.6.1.jar
Download 
https://repo1.maven.org/maven2/io/springfox/springfox-spring-web/2.6.1/springfox-spring-web-2.6.1.jar
Download 
https://repo1.maven.org/maven2/org/springframework/plugin/spring-plugin-core/1.2.0.RELEASE/spring-plugin-core-1.2.0.RELEASE.jar
Download 
https://repo1.maven.org/maven2/org/springframework/plugin/spring-plugin-metadata/1.2.0.RELEASE/spring-plugin-metadata-1.2.0.RELEASE.jar
Download 
https://repo1.maven.org/maven2/org/mapstruct/mapstruct/1.0.0.Final/mapstruct-1.0.0.Final.jar
Download 
https://repo1.maven.org/maven2/com/thoughtworks/paranamer/paranamer/2.8/paranamer-2.8.jar
Download 
https://repo1.maven.org/maven2/io/springfox/springfox-core/2.6.1/springfox-core-2.6.1.jar
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
:geode-web-api:processResources
:geode-web-api:classes
:geode-assembly:docs
:geode-assembly:gfshDepsJar
:geode-client-protocol:javadoc
:geode-client-protocol:javadocJar
:geode-client-protocol:sourcesJar
:geode-client-protocol:signArchives SKIPPED
:geode-common:javadocJar
:geode-common:sourcesJar
:geode-common:signArchives SKIPPED
:geode-core:javadocJar
:geode-core:raJar
:geode-core:jcaJar
:geode-core:sourcesJar
:geode-core:signArchives SKIPPED
:geode-core:webJar
:geode-cq:jar
:geode-cq:javadoc
:geode-cq:javadocJar
:geode-cq:sourcesJar
:geode-cq:signArchives SKIPPED
:geode-json:javadocJar
:geode-json:sourcesJar
:geode-json:signArchives SKIPPED
:geode-lucene:jar
:geode-lucene:javadoc
:geode-lucene:javadocJar
:geode-lucene:sourcesJar
:geode-lucene:signArchives SKIPPED
:geode-old-client-support:jar
:geode-old-client-support:javadoc
:geode-old-client-support:javadocJar
:geode-old-client-support:sourcesJar
:geode-old-client-support:signArchives SKIPPED
:geode-protobuf:jar
:geode-protobuf:javadoc
:geode-protobuf:javadocJar
:geode-protobuf:sourcesJar
:geode-protobuf:signArchives SKIPPED
:geode-protobuf:zip
:geode-pulse:javadoc
:geode-pulse:javadocJar
:geode-pulse:sourcesJar
:geode-pulse:war
:geode-pulse:signArchives SKIPPED
:geode-rebalancer:jar
:geode-rebalancer:javadoc
:geode-rebalancer:javadocJar
:geode-rebalancer:sourcesJar
:geode-rebalancer:signArchives SKIPPED
:geode-wan:jar
:geode-wan:javadoc
:geode-wan:javadocJar
:geode-wan:sourcesJar
:geode-wan:signArchives SKIPPED
:geode-web:javadoc NO-SOURCE
:geode-web:javadocJar
:geode-web:sourcesJar
:geode-web:war
:geode-web:signArchives SKIPPED
:geode-web-api:javadoc
:geode-web-api:javadocJar
:geode-web-api:sourcesJar
:geode-web-api:war
:geode-web-api:signArchives SKIPPED
:geode-assembly:installDist
:geode-pulse:jar
:geode-assembly:compileTestJava
Download 
https://repo1.maven.org/maven2/org/codehaus/cargo/cargo-core-uberjar/1.6.3/cargo-core-uberjar-1.6.3.pom
Download 
https://repo1.maven.org/maven2/org/codehaus/cargo/cargo-core/1.6.3/cargo-co