RFC - Client side configuration for a SNI proxy

2020-03-06 Thread Bill Burcham
Please review the RFC for *Client side configuration for a SNI proxy*:

https://cwiki.apache.org/confluence/display/GEODE/Client+side+configuration+for+a+SNI+proxy

Please comment by Monday, March 9, 2020.

Regards,
Bill and Ernie


Re: RFC - Client side configuration for a SNI proxy

2020-03-06 Thread Udo Kohlmeyer

Hi there Bill,

Whilst I commend your enthusiasm. Giving the community a weekend to 
review an RFC is less than optimal.


Please extend you deadline until 13 March 2020. Which is more reasonable.

--Udo

On 3/6/20 11:04 AM, Bill Burcham wrote:

Please review the RFC for *Client side configuration for a SNI proxy*:

https://cwiki.apache.org/confluence/display/GEODE/Client+side+configuration+for+a+SNI+proxy

Please comment by Monday, March 9, 2020.

Regards,
Bill and Ernie



Re: [PROPOSAL] eliminate file count loophole in PR StressNewTest

2020-03-06 Thread Donal Evans
With fairly apt timing, we've recently had a PR merged (
https://github.com/apache/geode/pull/4745) that introduced a test that has
resulted in fairly consistently flaky failures in the pipeline (JIRA
ticket: https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843).
The PR was quite large and touched/created a lot of tests, so StressNewTest
never ran on it:
https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given how
frequently the test is failing in the pipeline (11 failures on various
IntegrationTest jobs over the past 6 commits), it seems very likely that
had StressNewTest been allowed to run, this issue would have been caught at
the PR stage and could have been remedied then, instead of resulting in a
nearly constantly red pipeline.

I've already given my +1 to this proposal, but this situation has prompted
me to make it a *+1*.

On Tue, Mar 3, 2020 at 2:08 PM Anilkumar Gingade 
wrote:

> The stress test is to identify the flaky-ness within the test; and assuming
> any changes to the test may have introduced the flaky-ness.
> It's about paying the cost upfront or later when the test is determined to
> be flaky.
> If 25+ tests have been changed in a PR, the cost of running stress test for
> those;  and gating the PR for so long.
> Knowing how much pain it causes to fix a flaky test after a certain/long
> duration of time; I am +1 for doing this change.
>
> On Tue, Mar 3, 2020 at 10:06 AM Dan Smith  wrote:
>
> > What's the current timeout for StressNewTest? Maybe if we just up the
> > threshold to 100 tests or so and up the timeout to match we can catch
> > pretty much all PRs.
> >
> > I'm not sure why the job is flagging more tests than it should. It looks
> > like at some point @rhoughon changed it to read the merge base from some
> > file created by concourse as an optimization [1] - I suspect maybe that
> > file is inaccurate?
> >
> > I originally wrote this job. It's definitely not a panacea, it will only
> > catch a new flaky test if
> >  - the test is really flaky (likely to fail more than 1/50 times)
> >  - the change actually happened in the test file itself, and not the
> > product or some other test file.
> >
> > [1]
> >
> >
> https://github.com/apache/geode/commit/4c06ba4625e69d44a5165aa9f2fccddfc064de87
> >
> > -Dan
> >
> > On Sun, Mar 1, 2020 at 9:00 PM Owen Nichols  wrote:
> >
> > > We don’t tend to look too closely at successful PR checks to see
> whether
> > > they actually checked anything at all.
> > >
> > > One example I found is
> > >
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957
> > > <
> > >
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957
> > > >:
> > > 32 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.
> > >
> > > Here are 92 more examples (url’s omitted for brevity — use the example
> > > above as a template and just replace the last 4 digits):
> > > 26 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6243)
> > > 26 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6249)
> > > 26 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6402)
> > > 27 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6262)
> > > 27 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6430)
> > > 27 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6439)
> > > 27 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6449)
> > > 27 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6454)
> > > 27 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6458)
> > > 27 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6459)
> > > 28 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6224)
> > > 28 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6441)
> > > 28 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6448)
> > > 28 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6452)
> > > 29 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6102)
> > > 29 is too many changed tests to stress test. Allowing this job to pass
> > > without stress testing.  (build 6177)
> > > 30 is too m

Re: [PROPOSAL] eliminate file count loophole in PR StressNewTest

2020-03-06 Thread Nabarun Nag
+1

On Fri, Mar 6, 2020 at 6:33 PM Donal Evans  wrote:

> With fairly apt timing, we've recently had a PR merged (
> https://github.com/apache/geode/pull/4745) that introduced a test that has
> resulted in fairly consistently flaky failures in the pipeline (JIRA
> ticket: https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843).
> The PR was quite large and touched/created a lot of tests, so StressNewTest
> never ran on it:
> https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given how
> frequently the test is failing in the pipeline (11 failures on various
> IntegrationTest jobs over the past 6 commits), it seems very likely that
> had StressNewTest been allowed to run, this issue would have been caught at
> the PR stage and could have been remedied then, instead of resulting in a
> nearly constantly red pipeline.
>
> I've already given my +1 to this proposal, but this situation has prompted
> me to make it a *+1*.
>
> On Tue, Mar 3, 2020 at 2:08 PM Anilkumar Gingade 
> wrote:
>
> > The stress test is to identify the flaky-ness within the test; and
> assuming
> > any changes to the test may have introduced the flaky-ness.
> > It's about paying the cost upfront or later when the test is determined
> to
> > be flaky.
> > If 25+ tests have been changed in a PR, the cost of running stress test
> for
> > those;  and gating the PR for so long.
> > Knowing how much pain it causes to fix a flaky test after a certain/long
> > duration of time; I am +1 for doing this change.
> >
> > On Tue, Mar 3, 2020 at 10:06 AM Dan Smith  wrote:
> >
> > > What's the current timeout for StressNewTest? Maybe if we just up the
> > > threshold to 100 tests or so and up the timeout to match we can catch
> > > pretty much all PRs.
> > >
> > > I'm not sure why the job is flagging more tests than it should. It
> looks
> > > like at some point @rhoughon changed it to read the merge base from
> some
> > > file created by concourse as an optimization [1] - I suspect maybe that
> > > file is inaccurate?
> > >
> > > I originally wrote this job. It's definitely not a panacea, it will
> only
> > > catch a new flaky test if
> > >  - the test is really flaky (likely to fail more than 1/50 times)
> > >  - the change actually happened in the test file itself, and not the
> > > product or some other test file.
> > >
> > > [1]
> > >
> > >
> >
> https://github.com/apache/geode/commit/4c06ba4625e69d44a5165aa9f2fccddfc064de87
> > >
> > > -Dan
> > >
> > > On Sun, Mar 1, 2020 at 9:00 PM Owen Nichols 
> wrote:
> > >
> > > > We don’t tend to look too closely at successful PR checks to see
> > whether
> > > > they actually checked anything at all.
> > > >
> > > > One example I found is
> > > >
> > >
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957
> > > > <
> > > >
> > >
> >
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957
> > > > >:
> > > > 32 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.
> > > >
> > > > Here are 92 more examples (url’s omitted for brevity — use the
> example
> > > > above as a template and just replace the last 4 digits):
> > > > 26 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6243)
> > > > 26 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6249)
> > > > 26 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6402)
> > > > 27 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6262)
> > > > 27 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6430)
> > > > 27 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6439)
> > > > 27 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6449)
> > > > 27 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6454)
> > > > 27 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6458)
> > > > 27 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6459)
> > > > 28 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6224)
> > > > 28 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6441)
> > > > 28 is too many changed tests to stress test. Allowing this job to
> pass
> > > > without stress testing.  (build 6448)
> > > > 28 is too many changed tests to stress test. Allowing this job to
> pass

Re: [PROPOSAL] eliminate file count loophole in PR StressNewTest

2020-03-06 Thread Donal Evans
With fairly apt timing, we've recently had a PR merged (
https://github.com/apache/geode/pull/4745) that introduced a test that has
resulted in fairly consistently flaky failures in the pipeline (JIRA
ticket: https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843).
The PR was quite large and touched/created a lot of tests, so StressNewTest
never ran on it:
https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given how
frequently the test is failing in the pipeline (11 failures on various
IntegrationTest jobs over the past 6 commits), it seems very likely that
had StressNewTest been allowed to run, this issue would have been caught at
the PR stage and could have been remedied then, instead of resulting in a
nearly constantly red pipeline.

I've already given my +1 to this proposal, but this situation has prompted
me to make it a +2.

On Fri, Mar 6, 2020 at 1:46 PM Donal Evans  wrote:

> With fairly apt timing, we've recently had a PR merged (
> https://github.com/apache/geode/pull/4745) that introduced a test that
> has resulted in fairly consistently flaky failures in the pipeline (JIRA
> ticket: https://issues.apache.org/jira/projects/GEODE/issues/GEODE-7843).
> The PR was quite large and touched/created a lot of tests, so StressNewTest
> never ran on it:
> https://concourse.apachegeode-ci.info/builds/136784#L5e3ac3fc:2. Given
> how frequently the test is failing in the pipeline (11 failures on various
> IntegrationTest jobs over the past 6 commits), it seems very likely that
> had StressNewTest been allowed to run, this issue would have been caught at
> the PR stage and could have been remedied then, instead of resulting in a
> nearly constantly red pipeline.
>
> I've already given my +1 to this proposal, but this situation has prompted
> me to make it a *+1*.
>
> On Tue, Mar 3, 2020 at 2:08 PM Anilkumar Gingade 
> wrote:
>
>> The stress test is to identify the flaky-ness within the test; and
>> assuming
>> any changes to the test may have introduced the flaky-ness.
>> It's about paying the cost upfront or later when the test is determined to
>> be flaky.
>> If 25+ tests have been changed in a PR, the cost of running stress test
>> for
>> those;  and gating the PR for so long.
>> Knowing how much pain it causes to fix a flaky test after a certain/long
>> duration of time; I am +1 for doing this change.
>>
>> On Tue, Mar 3, 2020 at 10:06 AM Dan Smith  wrote:
>>
>> > What's the current timeout for StressNewTest? Maybe if we just up the
>> > threshold to 100 tests or so and up the timeout to match we can catch
>> > pretty much all PRs.
>> >
>> > I'm not sure why the job is flagging more tests than it should. It looks
>> > like at some point @rhoughon changed it to read the merge base from some
>> > file created by concourse as an optimization [1] - I suspect maybe that
>> > file is inaccurate?
>> >
>> > I originally wrote this job. It's definitely not a panacea, it will only
>> > catch a new flaky test if
>> >  - the test is really flaky (likely to fail more than 1/50 times)
>> >  - the change actually happened in the test file itself, and not the
>> > product or some other test file.
>> >
>> > [1]
>> >
>> >
>> https://github.com/apache/geode/commit/4c06ba4625e69d44a5165aa9f2fccddfc064de87
>> >
>> > -Dan
>> >
>> > On Sun, Mar 1, 2020 at 9:00 PM Owen Nichols 
>> wrote:
>> >
>> > > We don’t tend to look too closely at successful PR checks to see
>> whether
>> > > they actually checked anything at all.
>> > >
>> > > One example I found is
>> > >
>> >
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957
>> > > <
>> > >
>> >
>> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-pr/jobs/StressNewTestOpenJDK11/builds/5957
>> > > >:
>> > > 32 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.
>> > >
>> > > Here are 92 more examples (url’s omitted for brevity — use the example
>> > > above as a template and just replace the last 4 digits):
>> > > 26 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6243)
>> > > 26 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6249)
>> > > 26 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6402)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6262)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6430)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6439)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass
>> > > without stress testing.  (build 6449)
>> > > 27 is too many changed tests to stress test. Allowing this job to pass