RE: [INFO] Distributed Test runs in bulk results.

2020-06-11 Thread Alberto Bustamante Reyes
I think a report like this is very useful for getting real data about which 
flaky tests fail most often.

It would be great if a report like this were automatically generated and 
updated after each CI execution. In a project I worked on before, a similar 
report was implemented, and it was very useful for developers to check whether 
a test case had been failing in the past and to identify which flaky test 
cases we should try to fix first.

From: Mark Hanson 
Sent: Thursday, June 11, 2020 7:59
To: dev@geode.apache.org 
Subject: [INFO] Distributed Test runs in bulk results.

Hello All,

I have been doing bulk test runs of DistributedTestOpenJDK8, in this case over 
200. Here is a simplified report to help you see what I am seeing, and what I 
think everybody sees, with random failures as part of the PR process.

It is very easy to cause failures like this by not knowing what is running 
asynchronously (Geode is a complex system), or by introducing timing 
constraints that may not hold up in the system, e.g. waiting 5 seconds for a 
test result that could take longer unbeknownst to you.
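
As a rough illustration of the timing point (a sketch only, not code from any 
of the failing tests, and assuming the Awaitility and AssertJ libraries 
commonly used in Geode tests), a fixed sleep can be replaced with a bounded 
poll:

    // Illustrative sketch only; all names here are hypothetical.
    import static org.assertj.core.api.Assertions.assertThat;
    import static org.awaitility.Awaitility.await;

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;

    public class AsyncAssertionSketch {
      // Stand-in for whatever the test is waiting on asynchronously.
      static final AtomicInteger eventsReceived = new AtomicInteger();

      public static void main(String[] args) {
        new Thread(() -> {
          try {
            Thread.sleep(7_000); // sometimes the async work takes longer than 5 seconds
          } catch (InterruptedException ignored) {
          }
          eventsReceived.set(100);
        }).start();

        // Fragile: a fixed 5-second wait fails whenever the work takes longer.
        // Thread.sleep(5_000);
        // assertThat(eventsReceived.get()).isEqualTo(100);

        // More robust: poll until the condition holds, with a generous upper bound.
        await().atMost(2, TimeUnit.MINUTES)
            .untilAsserted(() -> assertThat(eventsReceived.get()).isEqualTo(100));
      }
    }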

All of that said, here are the results. There are tickets already open for most 
if not all of these issues.

Please let me know how often you all would like to see these reports…

Thanks,
Mark


***
Overall build success rate: 84.0%


The following test methods see failures in more than one class.  There may be a 
failing *TestBase class

*.testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived: 
 18 failures :
  ParallelWANPersistenceEnabledGatewaySenderDUnitTest:  7 failures (96.889% 
success rate)
  ParallelWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  11 failures 
(95.111% success rate)

*.testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived:
  4 failures :
  SerialWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  3 failures 
(98.667% success rate)
  SerialWANPersistenceEnabledGatewaySenderDUnitTest:  1 failure (99.556% 
success rate)

***


org.apache.geode.management.MemberMXBeanDistributedTest:  3 failures (98.667% 
success rate)

 testBucketCount   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3247
 testBucketCount   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3241
 testBucketCount   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3199

org.apache.geode.internal.cache.wan.parallel.ParallelWANPersistenceEnabledGatewaySenderDUnitTest:
  7 failures (96.889% success rate)

 
testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3335
 
testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3331
 
testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3294
 
testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3285
 
testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3218
 
testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3180
 
testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived
   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3156

org.apache.geode.internal.cache.partitioned.PersistentPartitionedRegionDistributedTest:
  1 failure (99.556% success rate)

 testCacheCloseDuringBucketMoveDoesntCauseDataLoss   
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3267

org.apache.geode.cache.management.MemoryThresholdsOffHeapDUnitTest:  1 failure 
(99.556% success rate)

 testDistributedRegionClientPutRejection   
https://concourse.apachegeode

Re: Problem in rolling upgrade since 1.12

2020-06-11 Thread Alberto Gomez
Thanks for the info, Bill.

I have found another issue in rolling upgrade since 1.12.

I have observed that when there are custom jars previously deployed, the 
locator cannot be started in the new version and the following exception is 
thrown:

Exception in thread "main" org.apache.geode.SerializationException: Could not 
create an instance of 
org.apache.geode.management.internal.configuration.domain.Configuration .

I have pushed another commit containing the new test case to the draft PR I 
sent before.

/Alberto G.

From: Bill Burcham 
Sent: Thursday, June 11, 2020 1:53 AM
To: dev@geode.apache.org 
Subject: Re: Problem in rolling upgrade since 1.12

Ernie made us a ticket for this issue:
https://issues.apache.org/jira/browse/GEODE-8240

On Mon, Jun 8, 2020 at 12:59 PM Alberto Gomez 
wrote:

> Hi Ernie,
>
> I have seen this problem in the support/1.13 branch and also on develop.
>
> Interestingly, the patch I sent is applied seamlessly in my local repo set
> to the develop branch.
>
> The patch modifies the
> RollingUpgradeRollServersOnPartitionedRegion_dataserializable test case by
> running "list members" on an upgraded system. I run it manually with the
> following command:
>
> ./gradlew geode-core:upgradeTest
> --tests=RollingUpgradeRollServersOnPartitionedRegion_dataserializable
>
> I see it failing when upgrading from 1.12.
>
> I created a draft PR where you can also see the changes in the test case
> that manifest the problem.
>
> See: https://github.com/apache/geode/pull/5224
>
>
> Please, let me know if you need any more information.
>
> BR,
>
> Alberto
> 
> From: Ernie Burghardt 
> Sent: Monday, June 8, 2020 9:04 PM
> To: dev@geode.apache.org 
> Subject: Re: Problem in rolling upgrade since 1.12
>
> Hi Alberto,
>
> I’m looking at this, came across a couple of blockers…
> Do you have a branch that exhibits this problem? Draft PR maybe?
> I tried to apply your patch to latest develop, but the patch doesn’t pass
> git apply’s check….
> Also these tests pass on develop, would you be able to check against the
> latest and update the diff?
> I’m very interested in reproducing the issue you have observed.
>
> Thanks,
> Ernie
>
> From: Alberto Gomez 
> Reply-To: "dev@geode.apache.org" 
> Date: Monday, June 8, 2020 at 12:31 AM
> To: "dev@geode.apache.org" 
> Subject: Re: Problem in rolling upgrade since 1.12
>
> Hi,
>
> I attach a diff for the modified test case in case you would like to use
> it to check the problem I mentioned.
>
> BR,
>
> Alberto
> 
> From: Alberto Gomez 
> Sent: Saturday, June 6, 2020 4:06 PM
> To: dev@geode.apache.org 
> Subject: Problem in rolling upgrade since 1.12
>
> Hi,
>
> I have observed that since version 1.12 rolling upgrades to future
> versions leave the first upgraded locator "as if" it was still on version
> 1.12.
>
> This is the output from "list members" before starting the upgrade from
> version 1.12:
>
> Name | Id
> ---- | ---
> vm2  | 192.168.0.37(vm2:29367:locator):41001 [Coordinator]
> vm0  | 192.168.0.37(vm0:29260):41002
> vm1  | 192.168.0.37(vm1:29296):41003
>
>
> And this is the output from "list members" after upgrading the first
> locator from 1.12 to 1.13/1.14:
>
> Name | Id
> ---- | ---
> vm2  | 192.168.0.37(vm2:1453:locator):41001(version:GEODE 1.12.0) [Coordinator]
> vm0  | 192.168.0.37(vm0:810):41002(version:GEODE 1.12.0)
> vm1  | 192.168.0.37(vm1:849):41003(version:GEODE 1.12.0)
>
>
> Finally this is the output in gfsh once the rolling upgrade has been
> completed (locators and servers upgraded):
>
> Name | Id
> ---- | ---
> vm2  | 192.168.0.37(vm2:1453:locator):41001(version:GEODE 1.12.0) [Coordinator]
> vm0  | 192.168.0.37(vm0:2457):41002
> vm1  | 192.168.0.37(vm1:2576):41003
>
> I verified this by running manual tests and also by running the following
> upgrade test (I had to stop it in the middle to connect via gfsh and get the
> gfsh outputs):
>
> RollingUpgradeRollServersOnPartitionedRegion_dataserializable.testRollServersOnPartitionedRegion_dataserializable
>
> After the rolling upgrade, the shutdown command fails with the following
> error:
> Member 192.168.0.37(vm2:1453:locator):41001 could not be found.
> Please verify the member name or ID and try again.
>
> The only way I have found to come out of the situation is by restarting
> the locator.
> Once restarted again, the output of gfsh shows that all members are
> upgraded to the new version, i.e. the locator does not show anymore that it
> is on version GEODE 1.12.0.
>
> Does anybody have any clue why this is happening?
>
> Thanks in advance,
>
> /Alberto G.
>


New launcher acceptance tests

2020-06-11 Thread Kirk Lund
The new launcher acceptance tests that I merged to develop yesterday are
failing on Windows because they specify "java" instead of "java.exe". I'm
updating them and will submit a PR very soon.
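
For reference, a minimal sketch of one way to pick the right executable name
per OS (a hypothetical helper for illustration, not necessarily the change
that will land in the PR):

    // Hypothetical helper, shown only to illustrate the "java" vs "java.exe" issue.
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class JavaExecutableSketch {
      static Path javaExecutable() {
        boolean isWindows = System.getProperty("os.name").toLowerCase().contains("win");
        String name = isWindows ? "java.exe" : "java";
        // Resolve against the JVM that is running the test.
        return Paths.get(System.getProperty("java.home"), "bin", name);
      }

      public static void main(String[] args) {
        System.out.println(javaExecutable());
      }
    }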


[PROPOSAL] backport GEODE-8238 to support/1.13

2020-06-11 Thread Bruce Schuchardt
This ticket concerns loss of a message and a subsequent hang during shutdown.  
The problem is also in support/1.13 and I’d like to backport to that branch.

https://issues.apache.org/jira/browse/GEODE-8238

It’s a small change to fix a problem introduced in changes for GEODE-7727.
https://issues.apache.org/jira/browse/GEODE-7727



Re: [PROPOSAL] backport GEODE-8238 to support/1.13

2020-06-11 Thread Owen Nichols
+1


---
Sent from Workspace ONE Boxer

On June 11, 2020 at 10:52:19 AM PDT, Bruce Schuchardt  wrote:
This ticket concerns loss of a message and a subsequent hang during shutdown.  
The problem is also in support/1.13 and I’d like to backport to that branch.

https://issues.apache.org/jira/browse/GEODE-8238

It’s a small change to fix a problem introduced in changes for GEODE-7727.
https://issues.apache.org/jira/browse/GEODE-7727



Re: [INFO] Distributed Test runs in bulk results.

2020-06-11 Thread Anthony Baker
Thanks Mark!  Can you highlight any surprises in these results compared to 
previous runs?

Anthony


> On Jun 10, 2020, at 10:59 PM, Mark Hanson  wrote:
> 
> Hello All,
> 
> I have been doing bulk test runs of DistributedTestOpenJDK8, in this case 
> over 200. Here is a simplified report to help you see what I am seeing, and 
> what I think everybody sees, with random failures as part of the PR process.
> 
> It is very easy to cause failures like this by not knowing what is running 
> asynchronously (Geode is a complex system), or by introducing timing 
> constraints that may not hold up in the system, e.g. waiting 5 seconds for a 
> test result that could take longer unbeknownst to you.
> 
> All of that said, here are the results. There are tickets already open for 
> most if not all of these issues.
> 
> Please let me know how often you all would like to see these reports…
> 
> Thanks,
> Mark
> 
> 
> ***
> Overall build success rate: 84.0%
> 
> 
> The following test methods see failures in more than one class.  There may be 
> a failing *TestBase class
> 
> *.testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived:
>   18 failures :
>  ParallelWANPersistenceEnabledGatewaySenderDUnitTest:  7 failures (96.889% 
> success rate)
>  ParallelWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  11 failures 
> (95.111% success rate)
> 
> *.testReplicatedRegionPersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived:
>   4 failures :
>  SerialWANPersistenceEnabledGatewaySenderOffHeapDUnitTest:  3 failures 
> (98.667% success rate)
>  SerialWANPersistenceEnabledGatewaySenderDUnitTest:  1 failure (99.556% 
> success rate)
> 
> ***
> 
> 
> org.apache.geode.management.MemberMXBeanDistributedTest:  3 failures (98.667% 
> success rate)
> 
> testBucketCount   
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3247
> testBucketCount   
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3241
> testBucketCount   
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3199
> 
> org.apache.geode.internal.cache.wan.parallel.ParallelWANPersistenceEnabledGatewaySenderDUnitTest:
>   7 failures (96.889% success rate)
> 
> 
> testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived  
>  
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3335
> 
> testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived  
>  
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3331
> 
> testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived  
>  
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-mass-test-run-main/jobs/DistributedTestOpenJDK8/builds/3294
> 
> testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived  

[PROPOSAL] increase AcceptanceTest timeout

2020-06-11 Thread Kirk Lund
I think we need to bump up the max time for AcceptanceTest jobs.

Here's an example of an AcceptanceTest job hanging due to timeout in
archival (after the execution of tests already finished):
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK11/builds/253#A


Re: [PROPOSAL] backport GEODE-8238 to support/1.13

2020-06-11 Thread Eric Shu
+1

From: Owen Nichols 
Sent: Thursday, June 11, 2020 10:58 AM
To: dev@geode.apache.org 
Subject: Re: [PROPOSAL] backport GEODE-8238 to support/1.13

+1


---
Sent from Workspace ONE Boxer

On June 11, 2020 at 10:52:19 AM PDT, Bruce Schuchardt  wrote:
This ticket concerns loss of a message and a subsequent hang during shutdown.  
The problem is also in support/1.13 and I’d like to backport to that branch.

https://issues.apache.org/jira/browse/GEODE-8238

It’s a small change to fix a problem introduced in changes for GEODE-7727.
https://issues.apache.org/jira/browse/GEODE-7727



RE: [PROPOSAL] backport GEODE-8238 to support/1.13

2020-06-11 Thread Dick Cavender
+1

-Original Message-
From: Eric Shu  
Sent: Thursday, June 11, 2020 11:33 AM
To: dev@geode.apache.org
Subject: Re: [PROPOSAL] backport GEODE-8238 to support/1.13

+1

From: Owen Nichols 
Sent: Thursday, June 11, 2020 10:58 AM
To: dev@geode.apache.org 
Subject: Re: [PROPOSAL] backport GEODE-8238 to support/1.13

+1


---
Sent from Workspace ONE Boxer

On June 11, 2020 at 10:52:19 AM PDT, Bruce Schuchardt  wrote:
This ticket concerns loss of a message and a subsequent hang during shutdown.  
The problem is also in support/1.13 and I'd like to backport to that branch.

https://issues.apache.org/jira/browse/GEODE-8238

It's a small change to fix a problem introduced in changes for GEODE-7727.
https://issues.apache.org/jira/browse/GEODE-7727



Re: [PROPOSAL] increase AcceptanceTest timeout

2020-06-11 Thread Owen Nichols
This timeout is separate from the test timeout.

AcceptanceTest currently has a 90-minute timeout for running tests.
These tests usually take 60-65 minutes.

The timeout you mentioned is just for uploading the report.  It is currently 
set at 1 hour, although it normally takes only about 30 seconds.  Sometimes 
network connections just hang, and increasing the timeout will not help.  If 
this happens a lot, a better approach might be a shorter timeout plus a few 
retries for the report upload step. 
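
To illustrate the idea only, here is a generic sketch of the
shorter-timeout-plus-retries pattern, written in Java with made-up names; the
actual change would be to the Concourse pipeline configuration, not to any
Geode code:

    // Sketch of "shorter per-attempt timeout plus a few retries" for a step
    // that occasionally hangs. Everything here is hypothetical.
    import java.time.Duration;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    public class UploadWithRetriesSketch {
      static void uploadReport() throws Exception {
        Thread.sleep(Duration.ofSeconds(30).toMillis()); // placeholder for the upload step
      }

      public static void main(String[] args) throws Exception {
        int maxAttempts = 3;
        Duration perAttemptTimeout = Duration.ofMinutes(5); // much shorter than 1 hour
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
          for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Future<?> upload = executor.submit(() -> {
              uploadReport();
              return null;
            });
            try {
              upload.get(perAttemptTimeout.toMillis(), TimeUnit.MILLISECONDS);
              System.out.println("upload succeeded on attempt " + attempt);
              return;
            } catch (TimeoutException e) {
              upload.cancel(true); // give up on the hung attempt and try again
              System.out.println("attempt " + attempt + " timed out, retrying");
            }
          }
          throw new IllegalStateException("report upload failed after " + maxAttempts + " attempts");
        } finally {
          executor.shutdownNow();
        }
      }
    }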

On 6/11/20, 11:29 AM, "Kirk Lund"  wrote:

I think we need to bump up the max time for AcceptanceTest jobs.

Here's an example of an AcceptanceTest job hanging due to timeout in
archival (after the execution of tests already finished):

https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK11/builds/253#A



Re: [PROPOSAL] backport GEODE-8238 to support/1.13

2020-06-11 Thread Dave Barnes
OK, Bruce -- merge away.
Thanks,
Dave

On Thu, Jun 11, 2020 at 12:47 PM Dick Cavender  wrote:

> +1
>
> -Original Message-
> From: Eric Shu 
> Sent: Thursday, June 11, 2020 11:33 AM
> To: dev@geode.apache.org
> Subject: Re: [PROPOSAL] backport GEODE-8238 to support/1.13
>
> +1
> 
> From: Owen Nichols 
> Sent: Thursday, June 11, 2020 10:58 AM
> To: dev@geode.apache.org 
> Subject: Re: [PROPOSAL] backport GEODE-8238 to support/1.13
>
> +1
>
>
> ---
> Sent from Workspace ONE Boxer <https://whatisworkspaceone.com/boxer>
>
> On June 11, 2020 at 10:52:19 AM PDT, Bruce Schuchardt 
> wrote:
> This ticket concerns loss of a message and a subsequent hang during
> shutdown.  The problem is also in support/1.13 and I'd like to backport to
> that branch.
>
>
> https://issues.apache.org/jira/browse/GEODE-8238
>
> It's a small change to fix a problem introduced in changes for GEODE-7727.
>
> https://issues.apache.org/jira/browse/GEODE-7727
>
>


Re: [PROPOSAL] increase AcceptanceTest timeout

2020-06-11 Thread Kirk Lund
Ok, let's just keep an eye on it for now. Thanks!

On Thu, Jun 11, 2020 at 12:53 PM Owen Nichols  wrote:

> This timeout is separate from the test timeout.
>
> AcceptanceTest currently has a 90-minute timeout for running tests.
> These tests usually take 60-65 minutes.
>
> The timeout you mentioned is just for uploading the report.  It is
> currently set at 1 hour, although it normally takes only about 30 seconds.
> Sometimes network connections just hang, and increasing the timeout will not
> help.  If this happens a lot, a better approach might be a shorter timeout
> plus a few retries for the report upload step.
>
> On 6/11/20, 11:29 AM, "Kirk Lund"  wrote:
>
> I think we need to bump up the max time for AcceptanceTest jobs.
>
> Here's an example of an AcceptanceTest job hanging due to timeout in
> archival (after the execution of tests already finished):
>
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/AcceptanceTestOpenJDK11/builds/253#A
>
>


Re: [DISCUSSION] Stop using the Geode Repository for Feature/WIP Branches

2020-06-11 Thread Michael Oleske
> Suppose I want to commit to another contributor's fork. How can they grant
> me permission to do so? (This is a common predicament for me when I'm
> reviewing doc PRs.)

Just as a note: if there is a PR and it is from a fork, the person who opens 
the PR can check a box that says "Allow contributors to commit to the PR" (or 
something like that).  That box allows anyone who is a committer to commit to 
that PR even though it is from a fork.

-michael

From: Jianxia Chen 
Sent: Wednesday, June 3, 2020 17:27
To: dev@geode.apache.org 
Subject: Re: [DISCUSSION] Stop using the Geode Repository for Feature/WIP 
Branches

The contributor has to add you as a collaborator of the contributor’s fork.

On Wed, Jun 3, 2020 at 5:05 PM Dave Barnes  wrote:

> Suppose I want to commit to another contributor's fork. How can they grant
> me permission to do so? (This is a common predicament for me when I'm
> reviewing doc PRs.)
>
>
> On Wed, Jun 3, 2020 at 2:04 PM Xiaojian Zhou  wrote:
>
> > We have discussed that in the Common team. The current solution worked
> > perfectly.
> >
> > One person will merge develop into feature/GEODE-7665 every week (which
> > conceptually can be anyone; I did it 2 times). Now Naba is taking the
> > responsibility to do the weekly merge. He did great!
> >
> > A fork will cause many other issues; it will still need a person to
> > maintain it. I feel a fork is only suitable for work that will be
> > finished within a week.
> >
> > Regards
> > Gester
> >
> > On 6/2/20, 4:41 PM, "Nabarun Nag"  wrote:
> >
> > I don’t think it is right to make the open source Geode Community to
> > work on my personal fork
> >
> > Regards
> > Naba
> >
> >
> > -Original Message-
> > From: Mark Hanson 
> > Sent: Tuesday, June 2, 2020 4:35 PM
> > To: dev@geode.apache.org
> > Subject: Re: [DISCUSSION] Stop using the Geode Repository for
> > Feature/WIP Branches
> >
> > While I am not 100% sure I understand your thoughts here, I am pretty
> > sure I do. We have already done such work in a branch in a fork
> > (Micrometer work). The only real gotcha was that there needed to be at
> > least one person as a collaborator, in case of vacations and such.
> >
> > All of the things you have specified are possible within the confines
> > of a fork.
> >
> > Thanks,
> > Mark
> >
> > On 6/2/20, 4:29 PM, "Nabarun Nag"  wrote:
> >
> > - We are maintaining feature/GEODE-7665, which is the feature branch
> > for the PR clear work on which multiple developers are working.
> > - We are maintaining this in the Geode repository.
> > - All sub-tasks of GEODE-7665 are merged into this feature branch.
> > - Anyone in the Geode community can work on any subtask.
> > - This is a long-running, massive feature development which is
> > manipulating core code in Apache Geode. Hence all work is pushed to the
> > feature branch to keep develop isolated from a regression introduced in
> > PR clear work.
> > - We have previously used release flags for the Lucene work, which we
> > found to be inefficient and unnecessary extra work.
> >
> > We vote that the PR clear feature branch be maintained in the Geode
> > Repository, as this is a long-running, massive effort involving everyone
> > from the community.
> >
> > When the PR clear tasks are completed, the branch will be rigorously
> > tested and then squash merged into develop, and the feature branch will
> > be deleted.
> >
> >
> > Regards
> > Naba
> >
> > -Original Message-
> > From: Jacob Barrett 
> > Sent: Tuesday, June 2, 2020 3:43 PM
> > To: dev@geode.apache.org
> > Subject: [DISCUSSION] Stop using the Geode Repository for
> > Feature/WIP Branches
> >
> > I know this has been brought up multiple times without resolution.
> > I want us to resolve to ban the use of the Geode repository for work in
> > progress, feature branches, or any other branches that are not release
> > or support branches. There is no reason, given the nature of GitHub, why
> > you can’t fork the repository to contribute.
> >
> > * Work done on these branches results in the ASF bots updating the
> > associated JIRAs and email blasting all of us with your work.
> >
> > * People don’t clean up these branches, which leads to a mess of
> > branches on everyone's clones and in the UI.
> >
> > * All your intermediate commits get synced to the repo, which bloats
> > the repo for everyone else. Even the commits you rebase over and force
> > push are left in the repo. When you delete your branch these commits
> > are not removed. There is no way for us to prune unreferenced commits.
> > Nobody else needs your commits outside of what was merged to a
> > production branch.
> >
> > If anyone has a use case for working directly from the Geode repo that
> > can’t work from a fork plea

Re: [DISCUSSION] Stop using the Geode Repository for Feature/WIP Branches

2020-06-11 Thread Jinmei Liao
+1 for banning the use of the geode repo for feature branches. Less clutter is better.

From: Jacob Barrett 
Sent: Tuesday, June 2, 2020 3:42 PM
To: dev@geode.apache.org 
Subject: [DISCUSSION] Stop using the Geode Repository for Feature/WIP Branches

I know this has been brought up multiple times without resolution. I want us 
to resolve to ban the use of the Geode repository for work in progress, feature 
branches, or any other branches that are not release or support branches. There 
is no reason, given the nature of GitHub, why you can’t fork the repository to 
contribute.

* Work done on these branches results in the ASF bots updating the associated 
JIRAs and email blasting all of us with your work.

* People don’t clean up these branches, which leads to a mess of branches on 
everyone's clones and in the UI.

* All your intermediate commits get synced to the repo, which bloats the repo 
for everyone else. Even your commits you rebase over and force push are left in 
the repo. When you delete your branch these commits are not removed. There is 
no way for us to prune unreferenced commits. Nobody else needs your commits 
outside of what was merged to a production branch.

If anyone has a use case for working directly from the Geode repo that can’t 
work from a fork, please post it here so we can resolve it.

Thanks,
Jake





Re: [DISCUSSION] Stop using the Geode Repository for Feature/WIP Branches

2020-06-11 Thread Owen Nichols
-1 for “banning”

The Geode community has strong precedent for protecting developers’ freedom to 
choose what works best for them.

Maybe check if your graphical git tool-of-choice allows a branch filter regex, 
or consider git clone --single-branch?


-Owen

On June 11, 2020 at 5:23:20 PM PDT, Jinmei Liao  wrote:
+1 for banning the use of the geode repo for feature branches. Less clutter is better.

From: Jacob Barrett 
Sent: Tuesday, June 2, 2020 3:42 PM
To: dev@geode.apache.org 
Subject: [DISCUSSION] Stop using the Geode Repository for Feature/WIP Branches

I know this has been brought up multiple times without resolution. I want us 
to resolve to ban the use of the Geode repository for work in progress, feature 
branches, or any other branches that are not release or support branches. There 
is no reason, given the nature of GitHub, why you can’t fork the repository to 
contribute.

* Work done on these branches results in the ASF bots updating the associated 
JIRAs and email blasting all of us with your work.

* People don’t clean up these branches, which leads to a mess of branches on 
everyone's clones and in the UI.

* All your intermediate commits get synced to the repo, which bloats the repo 
for everyone else. Even your commits you rebase over and force push are left in 
the repo. When you delete your branch these commits are not removed. There is 
no way for us to prune unreferenced commits. Nobody else needs your commits 
outside of what was merged to a production branch.

If anyone has a use case for working directly from the Geode repo that can’t 
work from a fork, please post it here so we can resolve it.

Thanks,
Jake





Re: [DISCUSSION] Stop using the Geode Repository for Feature/WIP Branches

2020-06-11 Thread Jacob Barrett


> On Jun 11, 2020, at 8:41 PM, Owen Nichols  wrote:
> 
> -1 for “banning”

:facepalm: Sorry, poor choice of words. How about “strongly discouraged” 
because of the harm enumerated previously. 

> 
> The Geode community has strong precedent for protecting developers’ freedom 
> to choose what works best for them.

Not to the detriment of others in the community.

> Maybe check if your graphical git tool-of-choice allows a branch filter 
> regex, or consider git clone --single-branch?

Whether someone filters or not, the issues enumerated exist even if hidden away.

-Jake