[PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13

2020-06-10 Thread Ju@N
Hello devs,

I'd like to propose bringing GEODE-8029 [1] to the *support/1.12* and
*support/1.13* branches.
The fix has been merged into develop through commit
bc0090dc93643fd4d09c79a4b0c29d883172b546 [2], and it's basically to make
sure we delete unused drfs upon initialization to prevent the proliferation
of unused records and files within the file system, which could cause
members to fail during startup while recovering disk-stores.
Best regards.

[1]: https://issues.apache.org/jira/browse/GEODE-8029
[2]:
https://github.com/apache/geode/commit/bc0090dc93643fd4d09c79a4b0c29d883172b546

-- 
Ju@N


Commit revert etiquette

2020-06-10 Thread Jens Deppe
Hello friends,

Just recently I’ve noticed a couple of PRs submitted that were intended to 
revert flaky or suspected commits. Initially, these PRs were simply marked as 
‘Revert’ without any further detail. I’m not sure if there was any additional 
out-of-band communication, but I would like to request that if you’re 
submitting a reversion you do the courtesy of providing additional details as 
to why the commit is being reverted. That way it is also obvious to others, who 
may not be aware of additional communication, what the reason is.

Thank you.
--Jens


Re: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13

2020-06-10 Thread Owen Nichols
+1

On 6/10/20, 3:18 AM, "Ju@N"  wrote:

Hello devs,

I'd like to propose bringing GEODE-8029 to the *support/1.12* and
*support/1.13* branches.
The fix has been merged into develop through commit
bc0090dc93643fd4d09c79a4b0c29d883172b546, and it's basically to make
sure we delete unused drfs upon initialization to prevent the proliferation
of unused records and files within the file system, which could cause
members to fail during startup while recovering disk-stores.
Best regards.

-- 
Ju@N



Re: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13

2020-06-10 Thread Udo Kohlmeyer
+1
On Jun 10, 2020, 3:18 AM -0700, Ju@N wrote:
Hello devs,

I'd like to propose bringing GEODE-8029 [1] to the *support/1.12* and
*support/1.13* branches.
The fix has been merged into develop through commit
bc0090dc93643fd4d09c79a4b0c29d883172b546 [2], and it's basically to make
sure we delete unused drfs upon initialization to prevent the proliferation
of unused records and files within the file system, which could cause
members to fail during startup while recovering disk-stores.
Best regards.

[1]: https://issues.apache.org/jira/browse/GEODE-8029
[2]: https://github.com/apache/geode/commit/bc0090dc93643fd4d09c79a4b0c29d883172b546

--
Ju@N


RE: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13

2020-06-10 Thread Dick Cavender
+1

-----Original Message-----
From: Udo Kohlmeyer  
Sent: Wednesday, June 10, 2020 9:14 AM
To: dev@geode.apache.org
Subject: Re: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13

+1
On Jun 10, 2020, 3:18 AM -0700, Ju@N , wrote:
Hello devs,

I'd like to propose bringing GEODE-8029 [1] to the *support/1.12* and
*support/1.13* branches.
The fix has been merged into develop through commit
bc0090dc93643fd4d09c79a4b0c29d883172b546 [2], and it's basically to make sure 
we delete unused drfs upon initialization to prevent the proliferation of 
unused records and files within the file system, which could cause members to 
fail during startup while recovering disk-stores.
Best regards.

[1]: https://issues.apache.org/jira/browse/GEODE-8029
[2]: https://github.com/apache/geode/commit/bc0090dc93643fd4d09c79a4b0c29d883172b546

--
Ju@N


Re: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13

2020-06-10 Thread Joris Melchior
+1

From: Ju@N 
Sent: June 10, 2020 6:18
To: dev@geode.apache.org 
Subject: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13

Hello devs,

I'd like to propose bringing GEODE-8029 [1] to the *support/1.12* and
*support/1.13* branches.
The fix has been merged into develop through commit
bc0090dc93643fd4d09c79a4b0c29d883172b546 [2], and it's basically to make
sure we delete unused drfs upon initialization to prevent the proliferation
of unused records and files within the file system, which could cause
members to fail during startup while recovering disk-stores.
Best regards.

[1]: https://issues.apache.org/jira/browse/GEODE-8029
[2]: https://github.com/apache/geode/commit/bc0090dc93643fd4d09c79a4b0c29d883172b546

--
Ju@N


Re: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13

2020-06-10 Thread Dave Barnes
Looks like the community approves, Ju@n. Go ahead and merge.
Thanks,
Dave

On Wed, Jun 10, 2020 at 10:37 AM Joris Melchior 
wrote:

> +1
> 
> From: Ju@N 
> Sent: June 10, 2020 6:18
> To: dev@geode.apache.org 
> Subject: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13
>
> Hello devs,
>
> I'd like to propose bringing GEODE-8029 [1] to the *support/1.12* and
> *support/1.13* branches.
> The fix has been merged into develop through commit
> bc0090dc93643fd4d09c79a4b0c29d883172b546 [2], and it's basically to make
> sure we delete unused drfs upon initialization to prevent the proliferation
> of unused records and files within the file system, which could cause
> members to fail during startup while recovering disk-stores.
> Best regards.
>
> [1]: https://issues.apache.org/jira/browse/GEODE-8029
> [2]: https://github.com/apache/geode/commit/bc0090dc93643fd4d09c79a4b0c29d883172b546
>
> --
> Ju@N
>


Re: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13

2020-06-10 Thread Ju@N
Thanks everyone!
I've cherry-picked the commit into branches support/1.12 [1] and
support/1.13 [2].
Best regards.

[1]:
https://github.com/apache/geode/commit/bdeff9d6144c47ea687cf3b7d25f12598228
[2]:
https://github.com/apache/geode/commit/7c1ffd528ff72b920dd606604de2a744c1728b23


On Wed, 10 Jun 2020 at 18:51, Dave Barnes  wrote:

> Looks like the community approves, Ju@n. Go ahead and merge.
> Thanks,
> Dave
>
> On Wed, Jun 10, 2020 at 10:37 AM Joris Melchior 
> wrote:
>
> > +1
> > 
> > From: Ju@N 
> > Sent: June 10, 2020 6:18
> > To: dev@geode.apache.org 
> > Subject: [PROPOSAL]: BackPort GEODE-8029 to support/1.12 and support/1.13
> >
> > Hello devs,
> >
> > I'd like to propose bringing GEODE-8029 [1] to the *support/1.12* and
> > *support/1.13* branches.
> > The fix has been merged into develop through commit
> > bc0090dc93643fd4d09c79a4b0c29d883172b546 [2], and it's basically to make
> > sure we delete unused drfs upon initialization to prevent the proliferation
> > of unused records and files within the file system, which could cause
> > members to fail during startup while recovering disk-stores.
> > Best regards.
> >
> > [1]: https://issues.apache.org/jira/browse/GEODE-8029
> > [2]: https://github.com/apache/geode/commit/bc0090dc93643fd4d09c79a4b0c29d883172b546
> >
> > --
> > Ju@N
> >
>


-- 
Ju@N


Re: Problem in rolling upgrade since 1.12

2020-06-10 Thread Bill Burcham
Ernie made us a ticket for this issue:
https://issues.apache.org/jira/browse/GEODE-8240

On Mon, Jun 8, 2020 at 12:59 PM Alberto Gomez 
wrote:

> Hi Ernie,
>
> I have seen this problem in the support/1.13 branch and also on develop.
>
> Interestingly, the patch I sent is applied seamlessly in my local repo set
> to the develop branch.
>
> The patch modifies the
> RollingUpgradeRollServersOnPartitionedRegion_dataserializable test case by
> running "list members" on an upgraded system. I run it manually with the
> following command:
>
> ./gradlew geode-core:upgradeTest
> --tests=RollingUpgradeRollServersOnPartitionedRegion_dataserializable
>
> I see it failing when upgrading from 1.12.
>
> I created a draft PR where you can see also the changes in the test case
> that manifest the problem.
>
> See: https://github.com/apache/geode/pull/5224
>
>
> Please, let me know if you need any more information.
>
> BR,
>
> Alberto
> 
> From: Ernie Burghardt 
> Sent: Monday, June 8, 2020 9:04 PM
> To: dev@geode.apache.org 
> Subject: Re: Problem in rolling upgrade since 1.12
>
> Hi Alberto,
>
> I’m looking at this, and came across a couple of blockers…
> Do you have a branch that exhibits this problem? Draft PR maybe?
> I tried to apply your patch to latest develop, but the patch doesn’t pass
> git apply’s check….
> Also these tests pass on develop, would you be able to check against the
> latest and update the diff?
> I’m very interested in reproducing the issue you have observed.
>
> Thanks,
> Ernie
>
> From: Alberto Gomez 
> Reply-To: "dev@geode.apache.org" 
> Date: Monday, June 8, 2020 at 12:31 AM
> To: "dev@geode.apache.org" 
> Subject: Re: Problem in rolling upgrade since 1.12
>
> Hi,
>
> I attach a diff for the modified test case in case you would like to use
> it to check the problem I mentioned.
>
> BR,
>
> Alberto
> 
> From: Alberto Gomez 
> Sent: Saturday, June 6, 2020 4:06 PM
> To: dev@geode.apache.org 
> Subject: Problem in rolling upgrade since 1.12
>
> Hi,
>
> I have observed that since version 1.12 rolling upgrades to future
> versions leave the first upgraded locator "as if" it was still on version
> 1.12.
>
> This is the output from "list members" before starting the upgrade from
> version 1.12:
>
> Name | Id
> ---- | ---------------------------------------------------
> vm2  | 192.168.0.37(vm2:29367:locator):41001 [Coordinator]
> vm0  | 192.168.0.37(vm0:29260):41002
> vm1  | 192.168.0.37(vm1:29296):41003
>
>
> And this is the output from "list members" after upgrading the first
> locator from 1.12 to 1.13/1.14:
>
> Name | Id
> ---- | ------------------------------------------------------------------------
> vm2  | 192.168.0.37(vm2:1453:locator):41001(version:GEODE 1.12.0) [Coordinator]
> vm0  | 192.168.0.37(vm0:810):41002(version:GEODE 1.12.0)
> vm1  | 192.168.0.37(vm1:849):41003(version:GEODE 1.12.0)
>
>
> Finally this is the output in gfsh once the rolling upgrade has been
> completed (locators and servers upgraded):
>
> Name | Id
> ---- | ------------------------------------------------------------------------
> vm2  | 192.168.0.37(vm2:1453:locator):41001(version:GEODE 1.12.0) [Coordinator]
> vm0  | 192.168.0.37(vm0:2457):41002
> vm1  | 192.168.0.37(vm1:2576):41003
>
> I verified this by running manual tests and also by running the following
> upgrade test (had to stop it in the middle to connect via gfsh and get the
> gfsh outputs):
>
> RollingUpgradeRollServersOnPartitionedRegion_dataserializable.testRollServersOnPartitionedRegion_dataserializable
>
> After the rolling upgrade, the shutdown command fails with the following
> error:
> Member 192.168.0.37(vm2:1453:locator):41001 could not be found.
> Please verify the member name or ID and try again.
>
> The only way I have found to come out of the situation is by restarting
> the locator.
> Once restarted again, the output of gfsh shows that all members are
> upgraded to the new version, i.e. the locator does not show anymore that it
> is on version GEODE 1.12.0.
>
> Does anybody have any clue why this is happening?
>
> Thanks in advance,
>
> /Alberto G.
>


[INFO] Distributed Test runs in bulk results.

2020-06-10 Thread Mark Hanson
Hello All,

I have been doing bulk test runs of DistributedTestOpenJDK8, in this case over
200. Here is a simplified report to help you see what I am seeing, and what I
think everybody sees, with random failures as part of the PR process.

It is very easy to cause failures like this, either by not knowing what is
running asynchronously (Geode is a complex system) or by introducing timing
constraints that may not hold up in the system, e.g. waiting 5 seconds for a
test result that could take longer, unbeknownst to you.
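
To illustrate the timing point, here is a minimal Java sketch (not taken from
the Geode code base; the class and method names are hypothetical) that polls
for a condition with a generous upper bound instead of sleeping for a fixed
5 seconds and then asserting. The wait ends as soon as the condition holds, so
the long timeout only costs time when something is actually wrong.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class AwaitSketch {

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition held, false on timeout or interruption.
   */
  static boolean awaitCondition(BooleanSupplier condition, Duration timeout,
      Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(pollInterval.toMillis());
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    AtomicBoolean done = new AtomicBoolean(false);

    // Hypothetical asynchronous work whose duration the test does not control.
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      done.set(true);
    });
    worker.start();

    // Generous upper bound; returns as soon as the flag flips.
    boolean met = awaitCondition(done::get, Duration.ofSeconds(30), Duration.ofMillis(10));
    System.out.println("condition met: " + met);
  }
}
```

Libraries such as Awaitility package this pattern; the hand-rolled helper
above is only meant to show the idea.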

All of that said, here are the results. There are tickets already open for most 
if not all of these issues.

Please let me know how often you all would like to see these reports…

Thanks,
Mark


***
Overall build success rate: 84.0%


The following test methods see failures in more than one class.  There may be a 
failing *TestBase class

*.testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived:  18 failures :
  ParallelWANPersistenceEnabledGatewaySenderDUnitTest:  7 failures (96.889% success rate)
  ParallelWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  11 failures (95.111% success rate)

*.testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived:  4 failures :
  SerialWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  3 failures (98.667% success rate)
  SerialWANPersistenceEnabledGatewaySenderDUnitTest:  1 failures (99.556% success rate)
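
The per-class percentages above are consistent with 225 runs per class. That
run count is an inference from the numbers, not something stated in the report
(which only says "over 200"). A small Java sketch of the arithmetic, with
hypothetical names:

```java
import java.util.Locale;

public class SuccessRate {

  /** Formats the success rate for the given failure and run counts, e.g. "96.889%". */
  static String successRate(int failures, int runs) {
    double rate = 100.0 * (runs - failures) / runs;
    return String.format(Locale.ROOT, "%.3f%%", rate);
  }

  public static void main(String[] args) {
    int assumedRuns = 225; // assumption inferred from the reported percentages
    for (int failures : new int[] {7, 11, 3, 1}) {
      System.out.println(failures + " failures -> " + successRate(failures, assumedRuns));
    }
  }
}
```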

***


org.apache.geode.management.MemberMXBeanDistributedTest:  3 failures (98.667% success rate)

 testBucketCount   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3247
 testBucketCount   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3241
 testBucketCount   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3199

org.apache.geode.internal.cache.wan.parallel.ParallelWANPersistenceEnabledGatewaySenderDUnitTest:  7 failures (96.889% success rate)

 testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3335
 testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3331
 testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3294
 testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3285
 testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3218
 testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3180
 testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3156

org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest:  1 failures (99.556% success rate)

 testCacheCloseDuringBucketMoveDoesntCauseDataLoss   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3267

org.apache.geode.cache.management.MemoryThresholdsOffHeapDUnitTest:  1 failures (99.556% success rate)

 testDistributedRegionClientPutRejection   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3319

org.apache.geode.internal.cache.wan.offheap.SerialWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  3 failures (98.667% success rate)

 testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived   https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3239
 testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
https://concourse.apachegeode-ci.info/teams