[PROPOSAL]: Include GEODE-7820 in Release 1.12.0

2020-02-28 Thread Ju@N
Hello devs,

I'd like to include the fix for GEODE-7820 [1] in release 1.12.0.
The change avoid unnecessary transformations between java collections and
primitive arrays for every message sent within a Geode cluster (see the
Geode tickets for further details).
The SHA is ca7ccbce73d436005fe027f31ee910ee9beeb769 [2].
My apologies for keep asking to merge changes into an already cut release,
we found a performance degradation (around 7%) within our internal testings
and we're trying to solve it step by step before version 1.12 is released
(we are 3% away from the baseline, Geode 1.10, right now).
Best regards.

[1]: https://issues.apache.org/jira/browse/GEODE-7820
[2]:
https://github.com/apache/geode/commit/ca7ccbce73d436005fe027f31ee910ee9beeb769

-- 
Ju@N


Re: [PROPOSAL]: Include GEODE-7820 in Release 1.12.0

2020-02-28 Thread Udo Kohlmeyer

+1, Keep the improvements coming...

On 2/28/20 3:48 AM, Ju@N wrote:

Hello devs,

I'd like to include the fix for GEODE-7820 [1] in release 1.12.0.
The change avoid unnecessary transformations between java collections and
primitive arrays for every message sent within a Geode cluster (see the
Geode tickets for further details).
The SHA is ca7ccbce73d436005fe027f31ee910ee9beeb769 [2].
My apologies for keep asking to merge changes into an already cut release,
we found a performance degradation (around 7%) within our internal testings
and we're trying to solve it step by step before version 1.12 is released
(we are 3% away from the baseline, Geode 1.10, right now).
Best regards.

[1]: https://issues.apache.org/jira/browse/GEODE-7820
[2]:
https://github.com/apache/geode/commit/ca7ccbce73d436005fe027f31ee910ee9beeb769



Re: [PROPOSAL]: Include GEODE-7820 in Release 1.12.0

2020-02-28 Thread Owen Nichols
+1

On Fri, Feb 28, 2020 at 7:38 AM Udo Kohlmeyer  wrote:

> +1, Keep the improvements coming...
>
> On 2/28/20 3:48 AM, Ju@N wrote:
> > Hello devs,
> >
> > I'd like to include the fix for GEODE-7820 [1] in release 1.12.0.
> > The change avoid unnecessary transformations between java collections and
> > primitive arrays for every message sent within a Geode cluster (see the
> > Geode tickets for further details).
> > The SHA is ca7ccbce73d436005fe027f31ee910ee9beeb769 [2].
> > My apologies for keep asking to merge changes into an already cut
> release,
> > we found a performance degradation (around 7%) within our internal
> testings
> > and we're trying to solve it step by step before version 1.12 is released
> > (we are 3% away from the baseline, Geode 1.10, right now).
> > Best regards.
> >
> > [1]: https://issues.apache.org/jira/browse/GEODE-7820
> > [2]:
> >
> https://github.com/apache/geode/commit/ca7ccbce73d436005fe027f31ee910ee9beeb769
> >
>


Re: [PROPOSAL]: Include GEODE-7820 in Release 1.12.0

2020-02-28 Thread Donal Evans
+1

On Fri, Feb 28, 2020 at 7:40 AM Owen Nichols  wrote:

> +1
>
> On Fri, Feb 28, 2020 at 7:38 AM Udo Kohlmeyer  wrote:
>
> > +1, Keep the improvements coming...
> >
> > On 2/28/20 3:48 AM, Ju@N wrote:
> > > Hello devs,
> > >
> > > I'd like to include the fix for GEODE-7820 [1] in release 1.12.0.
> > > The change avoid unnecessary transformations between java collections
> and
> > > primitive arrays for every message sent within a Geode cluster (see the
> > > Geode tickets for further details).
> > > The SHA is ca7ccbce73d436005fe027f31ee910ee9beeb769 [2].
> > > My apologies for keep asking to merge changes into an already cut
> > release,
> > > we found a performance degradation (around 7%) within our internal
> > testings
> > > and we're trying to solve it step by step before version 1.12 is
> released
> > > (we are 3% away from the baseline, Geode 1.10, right now).
> > > Best regards.
> > >
> > > [1]: https://issues.apache.org/jira/browse/GEODE-7820
> > > [2]:
> > >
> >
> https://github.com/apache/geode/commit/ca7ccbce73d436005fe027f31ee910ee9beeb769
> > >
> >
>


Re: [PROPOSAL]: Include GEODE-7820 in Release 1.12.0

2020-02-28 Thread Ernest Burghardt
There appears to be consensus that this is a critical fix.

The following commit has been brought into release/1.12 as the
critical fix for GEODE-7820:

git cherry-pick -x ca7ccbce73d436005fe027f31ee910ee9beeb769

GEODE-7820 has been marked as 'resolved in' 1.12.

Regards
EB


On Fri, Feb 28, 2020 at 7:51 AM Donal Evans  wrote:

> +1
>
> On Fri, Feb 28, 2020 at 7:40 AM Owen Nichols  wrote:
>
> > +1
> >
> > On Fri, Feb 28, 2020 at 7:38 AM Udo Kohlmeyer  wrote:
> >
> > > +1, Keep the improvements coming...
> > >
> > > On 2/28/20 3:48 AM, Ju@N wrote:
> > > > Hello devs,
> > > >
> > > > I'd like to include the fix for GEODE-7820 [1] in release 1.12.0.
> > > > The change avoid unnecessary transformations between java collections
> > and
> > > > primitive arrays for every message sent within a Geode cluster (see
> the
> > > > Geode tickets for further details).
> > > > The SHA is ca7ccbce73d436005fe027f31ee910ee9beeb769 [2].
> > > > My apologies for keep asking to merge changes into an already cut
> > > release,
> > > > we found a performance degradation (around 7%) within our internal
> > > testings
> > > > and we're trying to solve it step by step before version 1.12 is
> > released
> > > > (we are 3% away from the baseline, Geode 1.10, right now).
> > > > Best regards.
> > > >
> > > > [1]: https://issues.apache.org/jira/browse/GEODE-7820
> > > > [2]:
> > > >
> > >
> >
> https://github.com/apache/geode/commit/ca7ccbce73d436005fe027f31ee910ee9beeb769
> > > >
> > >
> >
>


Re: Discussions about concerns over User API changes

2020-02-28 Thread Anthony Baker
If I run the japi-compliance-checker [1] against the 1.12 release branch and 
develop this change pops up as a binary incompatibility.  As Owen noted, this 
would require recompilation to avoid NoSuchMethod errors.

One effect this could have is that a library built on top of Geode (e.g. 
Spring) would not be compatible with both 1.12 and 1.13.

Anthony


[1] https://lvc.github.io/japi-compliance-checker/ 



> On Feb 27, 2020, at 5:19 PM, Owen Nichols  wrote:
> 
> While changing a void method to have a return type does not break source 
> compatibility, it appears likely to break binary compatibility[1].
> 
> So, if you are compiling your client from source, it will compile 
> successfully against either Geode 1.12 or Geode 1.13.  But if your client was 
> already compiled [against Geode 1.12] and then you upgrade to Geode 1.13, 
> without recompiling your client, I your client will throw 
> MethodNotFoundException[2].
> 
> [1] 
> https://wiki.eclipse.org/Evolving_Java-based_APIs_2#Evolving_API_interfaces_-_API_methods
> [2] 
> https://stackoverflow.com/questions/47476509/can-i-change-a-return-type-to-be-a-strict-subtype-and-retain-binary-compatibilit
> 
>> On Feb 27, 2020, at 5:09 PM, Kirk Lund  wrote:
>> 
>> Remember, if there are any concerns about recent backwards-compatible
>> changes to Geode user APIs, they should be brought on the dev list.
>> 
>> Also, backward-compatible changes are by definition safe and ok for a user
>> API because it won't break the user's code.
>> 
>> Here's an example of a user API that I recently fixed...
>> 
>> The ClientRegionFactory is a builder with methods that are supposed to
>> follow fluent-style API (ie, return the ClientRegionFactory instance so you
>> can chain the calls).
>> 
>> Whoever added setConcurrencyChecksEnabled goofed up and made the return
>> type void. Changing void to ClientRegionFactory is a fully backwards
>> compatible change which corrects the API. Since this fixes the API and
>> doesn't actually change the user API, this should be very safe and improves
>> Geode by correcting a broken API:
>> 
>> *diff --git
>> a/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java
>> b/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java*
>> 
>> *index add35f01a2..2a08307adc 100644*
>> 
>> *---
>> a/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java*
>> 
>> *+++
>> b/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java*
>> 
>> @@ -216,7 +216,7 @@ public interface ClientRegionFactory {
>> 
>>   * @since GemFire 7.0
>> 
>>   * @param concurrencyChecksEnabled whether to perform concurrency checks
>> on operations
>> 
>>   */
>> 
>> -  void setConcurrencyChecksEnabled(boolean concurrencyChecksEnabled);
>> 
>> +  ClientRegionFactory setConcurrencyChecksEnabled(boolean
>> concurrencyChecksEnabled);
>> 
>> 
>> 
>>  /**
>> 
>>   * Sets the DiskStore name attribute. This causes the region to belong
>> to the DiskStore.
>> 
>> *diff --git
>> a/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java
>> b/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java*
>> 
>> *index 64256e8f8e..920deba055 100644*
>> 
>> *---
>> a/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java*
>> 
>> *+++
>> b/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java*
>> 
>> @@ -186,8 +186,9 @@ public class ClientRegionFactoryImpl implements
>> ClientRegionFactory
>> 
>>  }
>> 
>> 
>> 
>>  @Override
>> 
>> -  public void setConcurrencyChecksEnabled(boolean
>> concurrencyChecksEnabled) {
>> 
>> +  public ClientRegionFactory setConcurrencyChecksEnabled(boolean
>> concurrencyChecksEnabled) {
>> 
>> 
>> this.attrsFactory.setConcurrencyChecksEnabled(concurrencyChecksEnabled);
>> 
>> +return this;
>> 
>>  }
>> 
>> 
>> 
>>  @Override
>> 
>> Does anyone have concerns over fixing "void setConcurrencyChecksEnabled"
>> by changing it to "ClientRegionFactory setConcurrencyChecksEnabled"? If
>> there is a concern, is the concern about the change itself or because this
>> was fixed without following a more heavy-weight process?
>> 
>> Thanks,
>> Kirk
> 



Re: Discussions about concerns over User API changes

2020-02-28 Thread Udo Kohlmeyer
Another affect is code deployment onto/into the server, which 
could/would reference a change (binary) API. Users generally don't 
recompile the code they redeploy. The NoSuchMethod is now harder to 
track down.


--Udo


On 2/28/20 8:59 AM, Anthony Baker wrote:

If I run the japi-compliance-checker [1] against the 1.12 release branch and 
develop this change pops up as a binary incompatibility.  As Owen noted, this 
would require recompilation to avoid NoSuchMethod errors.

One effect this could have is that a library built on top of Geode (e.g. 
Spring) would not be compatible with both 1.12 and 1.13.

Anthony


[1] https://lvc.github.io/japi-compliance-checker/ 




On Feb 27, 2020, at 5:19 PM, Owen Nichols  wrote:

While changing a void method to have a return type does not break source 
compatibility, it appears likely to break binary compatibility[1].

So, if you are compiling your client from source, it will compile successfully 
against either Geode 1.12 or Geode 1.13.  But if your client was already 
compiled [against Geode 1.12] and then you upgrade to Geode 1.13, without 
recompiling your client, I your client will throw MethodNotFoundException[2].

[1] 
https://wiki.eclipse.org/Evolving_Java-based_APIs_2#Evolving_API_interfaces_-_API_methods
[2] 
https://stackoverflow.com/questions/47476509/can-i-change-a-return-type-to-be-a-strict-subtype-and-retain-binary-compatibilit


On Feb 27, 2020, at 5:09 PM, Kirk Lund  wrote:

Remember, if there are any concerns about recent backwards-compatible
changes to Geode user APIs, they should be brought on the dev list.

Also, backward-compatible changes are by definition safe and ok for a user
API because it won't break the user's code.

Here's an example of a user API that I recently fixed...

The ClientRegionFactory is a builder with methods that are supposed to
follow fluent-style API (ie, return the ClientRegionFactory instance so you
can chain the calls).

Whoever added setConcurrencyChecksEnabled goofed up and made the return
type void. Changing void to ClientRegionFactory is a fully backwards
compatible change which corrects the API. Since this fixes the API and
doesn't actually change the user API, this should be very safe and improves
Geode by correcting a broken API:

*diff --git
a/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java
b/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java*

*index add35f01a2..2a08307adc 100644*

*---
a/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java*

*+++
b/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java*

@@ -216,7 +216,7 @@ public interface ClientRegionFactory {

   * @since GemFire 7.0

   * @param concurrencyChecksEnabled whether to perform concurrency checks
on operations

   */

-  void setConcurrencyChecksEnabled(boolean concurrencyChecksEnabled);

+  ClientRegionFactory setConcurrencyChecksEnabled(boolean
concurrencyChecksEnabled);



  /**

   * Sets the DiskStore name attribute. This causes the region to belong
to the DiskStore.

*diff --git
a/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java
b/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java*

*index 64256e8f8e..920deba055 100644*

*---
a/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java*

*+++
b/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java*

@@ -186,8 +186,9 @@ public class ClientRegionFactoryImpl implements
ClientRegionFactory

  }



  @Override

-  public void setConcurrencyChecksEnabled(boolean
concurrencyChecksEnabled) {

+  public ClientRegionFactory setConcurrencyChecksEnabled(boolean
concurrencyChecksEnabled) {


this.attrsFactory.setConcurrencyChecksEnabled(concurrencyChecksEnabled);

+return this;

  }



  @Override

Does anyone have concerns over fixing "void setConcurrencyChecksEnabled"
by changing it to "ClientRegionFactory setConcurrencyChecksEnabled"? If
there is a concern, is the concern about the change itself or because this
was fixed without following a more heavy-weight process?

Thanks,
Kirk




[PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Mark Hanson
Hi All,

Proposal: Force StressNewTest to fail a change with 25 or more files rather 
than pass it without running it.

Currently, the StressNewTest job in the pipeline will just pass a job that has 
more than 25 files changed. It will be marked as green with no work done. There 
are reasons, relating to run time being too long to be tracked by concourse, so 
we just let it through as a pass. I think this is a bad signal. I think that 
this should automatically be a failure in the short term. As a result, I also 
think it should not be required. It is a bad signal, and I think that by making 
it a fail, this will at least not give us a false sense of security. I 
understand that this opens the flood gates so to speak, but I don’t think as a 
community it is a big problem because we can still say that you should not 
merge if the StressNewTest fails because of your test.

I personally find the false sense of security more troubling than anything. 
Hence the reason, I would like this to be

Let me know what you think..

Thanks,
Mark

Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Owen Nichols
-1 to making StressNew not required

+1 to eliminating the current loophole — StressNew should never give a free 
pass.  

Any time your PR is having trouble passing StressNew, please bring it up on the 
dev list. We can review on a case-by-case basis and decide whether to try 
increasing the timeout, changing the repeat count, refactoring the PR, or as an 
absolute last resort requesting authorization for an override (for example, a 
change to spotless rules might touch a huge number of files but carry no risk).

One bug we should fix is that StressNew sometimes counts more files touched 
than really were, especially if you had many commits or merges or rebases on 
your PR branch.  Possible workarounds there include squashing and/or creating a 
new PR and/or splitting into multiple PRs.  I’ve spent some time trying to 
reproduce why files are mis-counted, with no success, but perhaps someone 
cleverer with git could provide a fix.

Another issue is that StressNew is only in the PR pipeline, not the main 
develop pipeline.  This feels like an asymmetry where PRs must pass a “higher” 
standard.  We should consider adding some form of StressNew to the main 
pipeline as well (maybe compare to the previous SHA that passed?).

The original motivation for the 25-file limit was an attempt to limit how long 
StressNew might run for.  Since concourse already applies a timeout, that check 
is unnecessary.  However, a compromise solution might be to use the number of 
files changed to dial back the number of repeats, e.g. stay with 50 repeats if 
fewer than 25 files changed, otherwise compute 1250 / <#-files-changed> and do 
only that many repeats (e.g. if 50 files changed, run all tests 25x instead of 
50x).

While StressNew is intended to catch new flaky tests, it can also catch 
poorly-designed tests that fail just by running twice in the same VM.  This may 
be a sign that the test does not clean up properly and could be polluting other 
tests in unexpected ways?  It might be useful to run a “StressNew” with 
repeats=2 over a much broader scope, maybe even all tests, at least once in a 
while?


> On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> 
> Hi All,
> 
> Proposal: Force StressNewTest to fail a change with 25 or more files rather 
> than pass it without running it.
> 
> Currently, the StressNewTest job in the pipeline will just pass a job that 
> has more than 25 files changed. It will be marked as green with no work done. 
> There are reasons, relating to run time being too long to be tracked by 
> concourse, so we just let it through as a pass. I think this is a bad signal. 
> I think that this should automatically be a failure in the short term. As a 
> result, I also think it should not be required. It is a bad signal, and I 
> think that by making it a fail, this will at least not give us a false sense 
> of security. I understand that this opens the flood gates so to speak, but I 
> don’t think as a community it is a big problem because we can still say that 
> you should not merge if the StressNewTest fails because of your test.
> 
> I personally find the false sense of security more troubling than anything. 
> Hence the reason, I would like this to be
> 
> Let me know what you think..
> 
> Thanks,
> Mark



Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Ju@N
+1 to what Owen said, I don't think disabling *StressNewTest* is a
good idea.

On Fri, 28 Feb 2020 at 18:35, Owen Nichols  wrote:

> -1 to making StressNew not required
>
> +1 to eliminating the current loophole — StressNew should never give a
> free pass.
>
> Any time your PR is having trouble passing StressNew, please bring it up
> on the dev list. We can review on a case-by-case basis and decide whether
> to try increasing the timeout, changing the repeat count, refactoring the
> PR, or as an absolute last resort requesting authorization for an override
> (for example, a change to spotless rules might touch a huge number of files
> but carry no risk).
>
> One bug we should fix is that StressNew sometimes counts more files
> touched than really were, especially if you had many commits or merges or
> rebases on your PR branch.  Possible workarounds there include squashing
> and/or creating a new PR and/or splitting into multiple PRs.  I’ve spent
> some time trying to reproduce why files are mis-counted, with no success,
> but perhaps someone cleverer with git could provide a fix.
>
> Another issue is that StressNew is only in the PR pipeline, not the main
> develop pipeline.  This feels like an asymmetry where PRs must pass a
> “higher” standard.  We should consider adding some form of StressNew to the
> main pipeline as well (maybe compare to the previous SHA that passed?).
>
> The original motivation for the 25-file limit was an attempt to limit how
> long StressNew might run for.  Since concourse already applies a timeout,
> that check is unnecessary.  However, a compromise solution might be to use
> the number of files changed to dial back the number of repeats, e.g. stay
> with 50 repeats if fewer than 25 files changed, otherwise compute 1250 /
> <#-files-changed> and do only that many repeats (e.g. if 50 files changed,
> run all tests 25x instead of 50x).
>
> While StressNew is intended to catch new flaky tests, it can also catch
> poorly-designed tests that fail just by running twice in the same VM.  This
> may be a sign that the test does not clean up properly and could be
> polluting other tests in unexpected ways?  It might be useful to run a
> “StressNew” with repeats=2 over a much broader scope, maybe even all tests,
> at least once in a while?
>
>
> > On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> >
> > Hi All,
> >
> > Proposal: Force StressNewTest to fail a change with 25 or more files
> rather than pass it without running it.
> >
> > Currently, the StressNewTest job in the pipeline will just pass a job
> that has more than 25 files changed. It will be marked as green with no
> work done. There are reasons, relating to run time being too long to be
> tracked by concourse, so we just let it through as a pass. I think this is
> a bad signal. I think that this should automatically be a failure in the
> short term. As a result, I also think it should not be required. It is a
> bad signal, and I think that by making it a fail, this will at least not
> give us a false sense of security. I understand that this opens the flood
> gates so to speak, but I don’t think as a community it is a big problem
> because we can still say that you should not merge if the StressNewTest
> fails because of your test.
> >
> > I personally find the false sense of security more troubling than
> anything. Hence the reason, I would like this to be
> >
> > Let me know what you think..
> >
> > Thanks,
> > Mark
>
>

-- 
Ju@N


Re: [PROPOSAL]: Include GEODE-7820 in Release 1.12.0

2020-02-28 Thread Owen Nichols
Thanks Juan, this is fantastic work discovering and mitigating this!

The Benchmarks tests in our pipeline are supposed to catch performance 
degradations exceeding 5%.  We should all be concerned, how did a 7% 
degradation slip through?

> On Feb 28, 2020, at 3:48 AM, Ju@N  wrote:
> 
> Hello devs,
> 
> I'd like to include the fix for GEODE-7820 [1] in release 1.12.0.
> The change avoid unnecessary transformations between java collections and
> primitive arrays for every message sent within a Geode cluster (see the
> Geode tickets for further details).
> The SHA is ca7ccbce73d436005fe027f31ee910ee9beeb769 [2].
> My apologies for keep asking to merge changes into an already cut release,
> we found a performance degradation (around 7%) within our internal testings
> and we're trying to solve it step by step before version 1.12 is released
> (we are 3% away from the baseline, Geode 1.10, right now).
> Best regards.
> 
> [1]: https://issues.apache.org/jira/browse/GEODE-7820
> [2]:
> https://github.com/apache/geode/commit/ca7ccbce73d436005fe027f31ee910ee9beeb769
> 
> -- 
> Ju@N



Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Mark Hanson
Just to make sure we are clear, I am not suggesting that we disable 
stressnewtest, but that we make it not required. It would still run and provide 
feedback, but it would not give us an unwarranted green in my approach.

> On Feb 28, 2020, at 10:49 AM, Ju@N  wrote:
> 
> +1 to what Owen said, I don't think disabling *StressNewTest* is a
> good idea.
> 
> On Fri, 28 Feb 2020 at 18:35, Owen Nichols  wrote:
> 
>> -1 to making StressNew not required
>> 
>> +1 to eliminating the current loophole — StressNew should never give a
>> free pass.
>> 
>> Any time your PR is having trouble passing StressNew, please bring it up
>> on the dev list. We can review on a case-by-case basis and decide whether
>> to try increasing the timeout, changing the repeat count, refactoring the
>> PR, or as an absolute last resort requesting authorization for an override
>> (for example, a change to spotless rules might touch a huge number of files
>> but carry no risk).
>> 
>> One bug we should fix is that StressNew sometimes counts more files
>> touched than really were, especially if you had many commits or merges or
>> rebases on your PR branch.  Possible workarounds there include squashing
>> and/or creating a new PR and/or splitting into multiple PRs.  I’ve spent
>> some time trying to reproduce why files are mis-counted, with no success,
>> but perhaps someone cleverer with git could provide a fix.
>> 
>> Another issue is that StressNew is only in the PR pipeline, not the main
>> develop pipeline.  This feels like an asymmetry where PRs must pass a
>> “higher” standard.  We should consider adding some form of StressNew to the
>> main pipeline as well (maybe compare to the previous SHA that passed?).
>> 
>> The original motivation for the 25-file limit was an attempt to limit how
>> long StressNew might run for.  Since concourse already applies a timeout,
>> that check is unnecessary.  However, a compromise solution might be to use
>> the number of files changed to dial back the number of repeats, e.g. stay
>> with 50 repeats if fewer than 25 files changed, otherwise compute 1250 /
>> <#-files-changed> and do only that many repeats (e.g. if 50 files changed,
>> run all tests 25x instead of 50x).
>> 
>> While StressNew is intended to catch new flaky tests, it can also catch
>> poorly-designed tests that fail just by running twice in the same VM.  This
>> may be a sign that the test does not clean up properly and could be
>> polluting other tests in unexpected ways?  It might be useful to run a
>> “StressNew” with repeats=2 over a much broader scope, maybe even all tests,
>> at least once in a while?
>> 
>> 
>>> On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
>>> 
>>> Hi All,
>>> 
>>> Proposal: Force StressNewTest to fail a change with 25 or more files
>> rather than pass it without running it.
>>> 
>>> Currently, the StressNewTest job in the pipeline will just pass a job
>> that has more than 25 files changed. It will be marked as green with no
>> work done. There are reasons, relating to run time being too long to be
>> tracked by concourse, so we just let it through as a pass. I think this is
>> a bad signal. I think that this should automatically be a failure in the
>> short term. As a result, I also think it should not be required. It is a
>> bad signal, and I think that by making it a fail, this will at least not
>> give us a false sense of security. I understand that this opens the flood
>> gates so to speak, but I don’t think as a community it is a big problem
>> because we can still say that you should not merge if the StressNewTest
>> fails because of your test.
>>> 
>>> I personally find the false sense of security more troubling than
>> anything. Hence the reason, I would like this to be
>>> 
>>> Let me know what you think..
>>> 
>>> Thanks,
>>> Mark
>> 
>> 
> 
> -- 
> Ju@N



Re: [PROPOSAL]: Include GEODE-7820 in Release 1.12.0

2020-02-28 Thread Ju@N
Hello Owen,

Unfortunately I don't have the answer to that question... I
never worked through the Geode Benchmarks so I don't know exactly what
they're testing/validating.
Best regards.

On Fri, 28 Feb 2020 at 18:52, Owen Nichols  wrote:

> Thanks Juan, this is fantastic work discovering and mitigating this!
>
> The Benchmarks tests in our pipeline are supposed to catch performance
> degradations exceeding 5%.  We should all be concerned, how did a 7%
> degradation slip through?
>
> > On Feb 28, 2020, at 3:48 AM, Ju@N  wrote:
> >
> > Hello devs,
> >
> > I'd like to include the fix for GEODE-7820 [1] in release 1.12.0.
> > The change avoid unnecessary transformations between java collections and
> > primitive arrays for every message sent within a Geode cluster (see the
> > Geode tickets for further details).
> > The SHA is ca7ccbce73d436005fe027f31ee910ee9beeb769 [2].
> > My apologies for keep asking to merge changes into an already cut
> release,
> > we found a performance degradation (around 7%) within our internal
> testings
> > and we're trying to solve it step by step before version 1.12 is released
> > (we are 3% away from the baseline, Geode 1.10, right now).
> > Best regards.
> >
> > [1]: https://issues.apache.org/jira/browse/GEODE-7820
> > [2]:
> >
> https://github.com/apache/geode/commit/ca7ccbce73d436005fe027f31ee910ee9beeb769
> >
> > --
> > Ju@N
>
>

-- 
Ju@N


Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Ernest Burghardt
-1 to limiting any tests... if there are issues with the tests let's fix
that.  we have too many commits coming in with little or no testing over
new/changed code, so I can't see how removing any existing test coverage as
a good idea

Best,
EB

On Fri, Feb 28, 2020 at 10:58 AM Mark Hanson  wrote:

> Just to make sure we are clear, I am not suggesting that we disable
> stressnewtest, but that we make it not required. It would still run and
> provide feedback, but it would not give us an unwarranted green in my
> approach.
>
> > On Feb 28, 2020, at 10:49 AM, Ju@N  wrote:
> >
> > +1 to what Owen said, I don't think disabling *StressNewTest* is a
> > good idea.
> >
> > On Fri, 28 Feb 2020 at 18:35, Owen Nichols  wrote:
> >
> >> -1 to making StressNew not required
> >>
> >> +1 to eliminating the current loophole — StressNew should never give a
> >> free pass.
> >>
> >> Any time your PR is having trouble passing StressNew, please bring it up
> >> on the dev list. We can review on a case-by-case basis and decide
> whether
> >> to try increasing the timeout, changing the repeat count, refactoring
> the
> >> PR, or as an absolute last resort requesting authorization for an
> override
> >> (for example, a change to spotless rules might touch a huge number of
> files
> >> but carry no risk).
> >>
> >> One bug we should fix is that StressNew sometimes counts more files
> >> touched than really were, especially if you had many commits or merges
> or
> >> rebases on your PR branch.  Possible workarounds there include squashing
> >> and/or creating a new PR and/or splitting into multiple PRs.  I’ve spent
> >> some time trying to reproduce why files are mis-counted, with no
> success,
> >> but perhaps someone cleverer with git could provide a fix.
> >>
> >> Another issue is that StressNew is only in the PR pipeline, not the main
> >> develop pipeline.  This feels like an asymmetry where PRs must pass a
> >> “higher” standard.  We should consider adding some form of StressNew to
> the
> >> main pipeline as well (maybe compare to the previous SHA that passed?).
> >>
> >> The original motivation for the 25-file limit was an attempt to limit
> how
> >> long StressNew might run for.  Since concourse already applies a
> timeout,
> >> that check is unnecessary.  However, a compromise solution might be to
> use
> >> the number of files changed to dial back the number of repeats, e.g.
> stay
> >> with 50 repeats if fewer than 25 files changed, otherwise compute 1250 /
> >> <#-files-changed> and do only that many repeats (e.g. if 50 files
> changed,
> >> run all tests 25x instead of 50x).
> >>
> >> While StressNew is intended to catch new flaky tests, it can also catch
> >> poorly-designed tests that fail just by running twice in the same VM.
> This
> >> may be a sign that the test does not clean up properly and could be
> >> polluting other tests in unexpected ways?  It might be useful to run a
> >> “StressNew” with repeats=2 over a much broader scope, maybe even all
> tests,
> >> at least once in a while?
> >>
> >>
> >>> On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> >>>
> >>> Hi All,
> >>>
> >>> Proposal: Force StressNewTest to fail a change with 25 or more files
> >> rather than pass it without running it.
> >>>
> >>> Currently, the StressNewTest job in the pipeline will just pass a job
> >> that has more than 25 files changed. It will be marked as green with no
> >> work done. There are reasons, relating to run time being too long to be
> >> tracked by concourse, so we just let it through as a pass. I think this
> is
> >> a bad signal. I think that this should automatically be a failure in the
> >> short term. As a result, I also think it should not be required. It is a
> >> bad signal, and I think that by making it a fail, this will at least not
> >> give us a false sense of security. I understand that this opens the
> flood
> >> gates so to speak, but I don’t think as a community it is a big problem
> >> because we can still say that you should not merge if the StressNewTest
> >> fails because of your test.
> >>>
> >>> I personally find the false sense of security more troubling than
> >> anything. Hence the reason, I would like this to be
> >>>
> >>> Let me know what you think..
> >>>
> >>> Thanks,
> >>> Mark
> >>
> >>
> >
> > --
> > Ju@N
>
>


[PROPOSAL] include GEODE-7829 in 1.12

2020-02-28 Thread Bruce Schuchardt
During a refactor the default ack-wait-threshold was changed from 15 to 51.  
This will affect any deployment that doesn’t set its own threshold.

The ack-wait-threshold determines how long we wait for a response to a message, 
such as replication of a put(), before taking some action.


Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Mark Hanson
Alright, so basically it seems like people are not ok with the not requiring 
stressnewtest approach. So I guess that means that we need to identify -1s 
willing to help resolve this problem…

Who would like to help?

Thanks,
Mark

> On Feb 28, 2020, at 11:12 AM, Ernest Burghardt  wrote:
> 
> -1 to limiting any tests... if there are issues with the tests let's fix
> that.  we have too many commits coming in with little or no testing over
> new/changed code, so I can't see how removing any existing test coverage as
> a good idea
> 
> Best,
> EB
> 
> On Fri, Feb 28, 2020 at 10:58 AM Mark Hanson  wrote:
> 
>> Just to make sure we are clear, I am not suggesting that we disable
>> stressnewtest, but that we make it not required. It would still run and
>> provide feedback, but it would not give us an unwarranted green in my
>> approach.
>> 
>>> On Feb 28, 2020, at 10:49 AM, Ju@N  wrote:
>>> 
>>> +1 to what Owen said, I don't think disabling *StressNewTest* is a
>>> good idea.
>>> 
>>> On Fri, 28 Feb 2020 at 18:35, Owen Nichols  wrote:
>>> 
 -1 to making StressNew not required
 
 +1 to eliminating the current loophole — StressNew should never give a
 free pass.
 
 Any time your PR is having trouble passing StressNew, please bring it up
 on the dev list. We can review on a case-by-case basis and decide
>> whether
 to try increasing the timeout, changing the repeat count, refactoring
>> the
 PR, or as an absolute last resort requesting authorization for an
>> override
 (for example, a change to spotless rules might touch a huge number of
>> files
 but carry no risk).
 
 One bug we should fix is that StressNew sometimes counts more files
 touched than really were, especially if you had many commits or merges
>> or
 rebases on your PR branch.  Possible workarounds there include squashing
 and/or creating a new PR and/or splitting into multiple PRs.  I’ve spent
 some time trying to reproduce why files are mis-counted, with no
>> success,
 but perhaps someone cleverer with git could provide a fix.
 
 Another issue is that StressNew is only in the PR pipeline, not the main
 develop pipeline.  This feels like an asymmetry where PRs must pass a
 “higher” standard.  We should consider adding some form of StressNew to
>> the
 main pipeline as well (maybe compare to the previous SHA that passed?).
 
 The original motivation for the 25-file limit was an attempt to limit
>> how
 long StressNew might run for.  Since concourse already applies a
>> timeout,
 that check is unnecessary.  However, a compromise solution might be to
>> use
 the number of files changed to dial back the number of repeats, e.g.
>> stay
 with 50 repeats if fewer than 25 files changed, otherwise compute 1250 /
 <#-files-changed> and do only that many repeats (e.g. if 50 files
>> changed,
 run all tests 25x instead of 50x).
 
 While StressNew is intended to catch new flaky tests, it can also catch
 poorly-designed tests that fail just by running twice in the same VM.
>> This
 may be a sign that the test does not clean up properly and could be
 polluting other tests in unexpected ways?  It might be useful to run a
 “StressNew” with repeats=2 over a much broader scope, maybe even all
>> tests,
 at least once in a while?
 
 
> On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> 
> Hi All,
> 
> Proposal: Force StressNewTest to fail a change with 25 or more files
 rather than pass it without running it.
> 
> Currently, the StressNewTest job in the pipeline will just pass a job
 that has more than 25 files changed. It will be marked as green with no
 work done. There are reasons, relating to run time being too long to be
 tracked by concourse, so we just let it through as a pass. I think this
>> is
 a bad signal. I think that this should automatically be a failure in the
 short term. As a result, I also think it should not be required. It is a
 bad signal, and I think that by making it a fail, this will at least not
 give us a false sense of security. I understand that this opens the
>> flood
 gates so to speak, but I don’t think as a community it is a big problem
 because we can still say that you should not merge if the StressNewTest
 fails because of your test.
> 
> I personally find the false sense of security more troubling than
 anything. Hence the reason, I would like this to be
> 
> Let me know what you think..
> 
> Thanks,
> Mark
 
 
>>> 
>>> --
>>> Ju@N
>> 
>> 



Let's Deprecate the SECURITY_UDP_DHALGO Configuration Property

2020-02-28 Thread Bill Burcham
I propose we deprecate Geode’s proprietary UDP message privacy algorithm
based on the Diffie-Hellman key exchange protocol. This would deprecate:

ConfigurationProperties.SECURITY_UDP_DHALGO

String DistributionConfig.getSecurityUDPDHAlgo()

void DistributionConfig.setSecurityUDPDHAlgo(String attValue)
DistributionConfig.SECURITY_UDP_DHALGO_NAME

Additionally we’d have to upate documentation to reflect deprecation.

>From ConfigurationProperties.java:


Application can set this property to valid symmetric key algorithm, to
encrypt udp messages in Geode. Geode will generate symmetric key using
Diffie-Hellman key exchange algorithm between peers. That key further used
by specified algorithm to encrypt the udp messages.

The property (and the feature) was added mid-2016. Unfortunately it was not
added as an “experimental” feature, so it cannot simply be removed.

Incidentally, the corresponding property for client-server communication,
SECURITY_CLIENT_DHALGO, is already deprecated. It was deprecated in Geode
1.5 in favor of SSL/TLS.

I am proposing deprecating the feature because:


   1.

   The feature has not proven popular with users.
   2.

   At least one hard-to-reproduce bug exists in the implementation:
   GEODE-6448 . We’ve
   burned a person-week trying to fix the problem (Bruce Schuchardt and me)
   and it’s not clear how much more time it will take. If we decide to
   deprecate the feature, fixing this problem would be de-prioritized
   accordingly.
   3.

   If we decide, in the future, that UDP message security is required, it
   would be better to implement a standard algorithm such as DTLS
   :
   1.

  Our algorithm provides only message privacy whereas DTLS provides
  privacy, tamper-resistance, and message forgery protection
  2.

  DTLS is a standard
  3.

  There is some support for DTLS in the JDK (JEP-219
   delivered in JDK 9). It’s not a
  complete implementation e.g. guaranteed delivery is a do-it-yourself kit.


Actually implementing DTLS is out of scope for this proposal. Adding DTLS
would be a significant undertaking.

So, how do you feel about me making a GEODE ticket to deprecate the
SECURITY_UDP_DHALGO configuration property?


Re: [PROPOSAL] StressNewTest in Pull Request Pipeline to be made optional.

2020-02-28 Thread Nabarun Nag
What help is needed in this effort?

Regards
Naba

On Fri, Feb 28, 2020 at 11:35 AM Mark Hanson  wrote:

> Alright, so basically it seems like people are not ok with the not
> requiring stressnewtest approach. So I guess that means that we need to
> identify -1s willing to help resolve this problem…
>
> Who would like to help?
>
> Thanks,
> Mark
>
> > On Feb 28, 2020, at 11:12 AM, Ernest Burghardt 
> wrote:
> >
> > -1 to limiting any tests... if there are issues with the tests let's fix
> > that.  we have too many commits coming in with little or no testing over
> > new/changed code, so I can't see how removing any existing test coverage
> as
> > a good idea
> >
> > Best,
> > EB
> >
> > On Fri, Feb 28, 2020 at 10:58 AM Mark Hanson  wrote:
> >
> >> Just to make sure we are clear, I am not suggesting that we disable
> >> stressnewtest, but that we make it not required. It would still run and
> >> provide feedback, but it would not give us an unwarranted green in my
> >> approach.
> >>
> >>> On Feb 28, 2020, at 10:49 AM, Ju@N  wrote:
> >>>
> >>> +1 to what Owen said, I don't think disabling *StressNewTest* is a
> >>> good idea.
> >>>
> >>> On Fri, 28 Feb 2020 at 18:35, Owen Nichols 
> wrote:
> >>>
>  -1 to making StressNew not required
> 
>  +1 to eliminating the current loophole — StressNew should never give a
>  free pass.
> 
>  Any time your PR is having trouble passing StressNew, please bring it
> up
>  on the dev list. We can review on a case-by-case basis and decide
> >> whether
>  to try increasing the timeout, changing the repeat count, refactoring
> >> the
>  PR, or as an absolute last resort requesting authorization for an
> >> override
>  (for example, a change to spotless rules might touch a huge number of
> >> files
>  but carry no risk).
> 
>  One bug we should fix is that StressNew sometimes counts more files
>  touched than really were, especially if you had many commits or merges
> >> or
>  rebases on your PR branch.  Possible workarounds there include
> squashing
>  and/or creating a new PR and/or splitting into multiple PRs.  I’ve
> spent
>  some time trying to reproduce why files are mis-counted, with no
> >> success,
>  but perhaps someone cleverer with git could provide a fix.
> 
>  Another issue is that StressNew is only in the PR pipeline, not the
> main
>  develop pipeline.  This feels like an asymmetry where PRs must pass a
>  “higher” standard.  We should consider adding some form of StressNew
> to
> >> the
>  main pipeline as well (maybe compare to the previous SHA that
> passed?).
> 
>  The original motivation for the 25-file limit was an attempt to limit
> >> how
>  long StressNew might run for.  Since concourse already applies a
> >> timeout,
>  that check is unnecessary.  However, a compromise solution might be to
> >> use
>  the number of files changed to dial back the number of repeats, e.g.
> >> stay
>  with 50 repeats if fewer than 25 files changed, otherwise compute
> 1250 /
>  <#-files-changed> and do only that many repeats (e.g. if 50 files
> >> changed,
>  run all tests 25x instead of 50x).
> 
>  While StressNew is intended to catch new flaky tests, it can also
> catch
>  poorly-designed tests that fail just by running twice in the same VM.
> >> This
>  may be a sign that the test does not clean up properly and could be
>  polluting other tests in unexpected ways?  It might be useful to run a
>  “StressNew” with repeats=2 over a much broader scope, maybe even all
> >> tests,
>  at least once in a while?
> 
> 
> > On Feb 28, 2020, at 9:51 AM, Mark Hanson  wrote:
> >
> > Hi All,
> >
> > Proposal: Force StressNewTest to fail a change with 25 or more files
>  rather than pass it without running it.
> >
> > Currently, the StressNewTest job in the pipeline will just pass a job
>  that has more than 25 files changed. It will be marked as green with
> no
>  work done. There are reasons, relating to run time being too long to
> be
>  tracked by concourse, so we just let it through as a pass. I think
> this
> >> is
>  a bad signal. I think that this should automatically be a failure in
> the
>  short term. As a result, I also think it should not be required. It
> is a
>  bad signal, and I think that by making it a fail, this will at least
> not
>  give us a false sense of security. I understand that this opens the
> >> flood
>  gates so to speak, but I don’t think as a community it is a big
> problem
>  because we can still say that you should not merge if the
> StressNewTest
>  fails because of your test.
> >
> > I personally find the false sense of security more troubling than
>  anything. Hence the reason, I would like this to be
> >
> > Let me know what you think..
> >
> > Thanks,
> > Mark
> 
> 
> >>>
> >>> -

Re: Let's Deprecate the SECURITY_UDP_DHALGO Configuration Property

2020-02-28 Thread Owen Nichols
+1

We should have done this as soon as SSL/TLS was ready.  Better late than never!

While we normally wait until a major release to remove deprecated stuff, there 
is some precedent for removing insecure encryption algorithms sooner (Java has 
done this even in patch releases).  We should consider getting the deprecation 
notice into 1.12 and removing in 1.13, rather than waiting for 2.0.

> On Feb 28, 2020, at 11:42 AM, Bill Burcham  wrote:
> 
> I propose we deprecate Geode’s proprietary UDP message privacy algorithm
> based on the Diffie-Hellman key exchange protocol. This would deprecate:
> 
> ConfigurationProperties.SECURITY_UDP_DHALGO
> 
> String DistributionConfig.getSecurityUDPDHAlgo()
> 
> void DistributionConfig.setSecurityUDPDHAlgo(String attValue)
> DistributionConfig.SECURITY_UDP_DHALGO_NAME
> 
> Additionally we’d have to upate documentation to reflect deprecation.
> 
> From ConfigurationProperties.java:
> 
> 
> Application can set this property to valid symmetric key algorithm, to
> encrypt udp messages in Geode. Geode will generate symmetric key using
> Diffie-Hellman key exchange algorithm between peers. That key further used
> by specified algorithm to encrypt the udp messages.
> 
> The property (and the feature) was added mid-2016. Unfortunately it was not
> added as an “experimental” feature, so it cannot simply be removed.
> 
> Incidentally, the corresponding property for client-server communication,
> SECURITY_CLIENT_DHALGO, is already deprecated. It was deprecated in Geode
> 1.5 in favor of SSL/TLS.
> 
> I am proposing deprecating the feature because:
> 
> 
>   1.
> 
>   The feature has not proven popular with users.
>   2.
> 
>   At least one hard-to-reproduce bug exists in the implementation:
>   GEODE-6448 . We’ve
>   burned a person-week trying to fix the problem (Bruce Schuchardt and me)
>   and it’s not clear how much more time it will take. If we decide to
>   deprecate the feature, fixing this problem would be de-prioritized
>   accordingly.
>   3.
> 
>   If we decide, in the future, that UDP message security is required, it
>   would be better to implement a standard algorithm such as DTLS
>   :
>   1.
> 
>  Our algorithm provides only message privacy whereas DTLS provides
>  privacy, tamper-resistance, and message forgery protection
>  2.
> 
>  DTLS is a standard
>  3.
> 
>  There is some support for DTLS in the JDK (JEP-219
>   delivered in JDK 9). It’s not a
>  complete implementation e.g. guaranteed delivery is a do-it-yourself kit.
> 
> 
> Actually implementing DTLS is out of scope for this proposal. Adding DTLS
> would be a significant undertaking.
> 
> So, how do you feel about me making a GEODE ticket to deprecate the
> SECURITY_UDP_DHALGO configuration property?



[PROPOSAL] eliminate file count loophole in PR StressNewTest

2020-02-28 Thread Owen Nichols
Currently, StressNewTest will turn green WITHOUT RUNNING ANY TESTS if it thinks 
(often wrongly) that more than 25 test files were touched.

This is both dishonest (it can give a false green) AND totally unnecessary 
(concourse already applies a timeout to StressNewTest, currently set at 6 
hours, so there is no reason not to at least try to run all changed tests, they 
may easily complete in under 6 hours even if many files).

*** I PROPOSE that we should remove this 25-file-free-pass loophole. ***
 
As always, if anyone is having trouble getting their PR to pass StressNew, 
please bring it up on the dev list and we can discuss the appropriate solution 
on a case-by-case basis (e.g. increasing timeout, fixing the logic that 
identifies changes test files, splitting into multiple PRs, etc).



Contributor Access Request

2020-02-28 Thread Derrick Anderson
Hi!

I am planning on contributing and would like contributor access in Jira.

My username is deanderson

Thank you so much for your help


Re: Contributor Access Request

2020-02-28 Thread Owen Nichols
Welcome Derrick, you should have access to Geode Jira now

https://issues.apache.org/jira/projects/GEODE/issues


> On Feb 26, 2020, at 8:41 AM, Derrick Anderson  
> wrote:
> 
> Hi!
> 
> I am planning on contributing and would like contributor access in Jira.
> 
> My username is deanderson
> 
> Thank you so much for your help



Re: Let's Deprecate the SECURITY_UDP_DHALGO Configuration Property

2020-02-28 Thread Ernest Burghardt
+1

seems reasonable to do this for 1.12 and be ahead of the game, @Owen would
you please spawn that as a separate release 1.12 thread? thanks, eB

On Fri, Feb 28, 2020 at 11:56 AM Owen Nichols  wrote:

> +1
>
> We should have done this as soon as SSL/TLS was ready.  Better late than
> never!
>
> While we normally wait until a major release to remove deprecated stuff,
> there is some precedent for removing insecure encryption algorithms sooner
> (Java has done this even in patch releases).  We should consider getting
> the deprecation notice into 1.12 and removing in 1.13, rather than waiting
> for 2.0.
>
> > On Feb 28, 2020, at 11:42 AM, Bill Burcham 
> wrote:
> >
> > I propose we deprecate Geode’s proprietary UDP message privacy algorithm
> > based on the Diffie-Hellman key exchange protocol. This would deprecate:
> >
> > ConfigurationProperties.SECURITY_UDP_DHALGO
> >
> > String DistributionConfig.getSecurityUDPDHAlgo()
> >
> > void DistributionConfig.setSecurityUDPDHAlgo(String attValue)
> > DistributionConfig.SECURITY_UDP_DHALGO_NAME
> >
> > Additionally we’d have to upate documentation to reflect deprecation.
> >
> > From ConfigurationProperties.java:
> >
> >
> > Application can set this property to valid symmetric key algorithm, to
> > encrypt udp messages in Geode. Geode will generate symmetric key using
> > Diffie-Hellman key exchange algorithm between peers. That key further
> used
> > by specified algorithm to encrypt the udp messages.
> >
> > The property (and the feature) was added mid-2016. Unfortunately it was
> not
> > added as an “experimental” feature, so it cannot simply be removed.
> >
> > Incidentally, the corresponding property for client-server communication,
> > SECURITY_CLIENT_DHALGO, is already deprecated. It was deprecated in Geode
> > 1.5 in favor of SSL/TLS.
> >
> > I am proposing deprecating the feature because:
> >
> >
> >   1.
> >
> >   The feature has not proven popular with users.
> >   2.
> >
> >   At least one hard-to-reproduce bug exists in the implementation:
> >   GEODE-6448 . We’ve
> >   burned a person-week trying to fix the problem (Bruce Schuchardt and
> me)
> >   and it’s not clear how much more time it will take. If we decide to
> >   deprecate the feature, fixing this problem would be de-prioritized
> >   accordingly.
> >   3.
> >
> >   If we decide, in the future, that UDP message security is required, it
> >   would be better to implement a standard algorithm such as DTLS
> >   :
> >   1.
> >
> >  Our algorithm provides only message privacy whereas DTLS provides
> >  privacy, tamper-resistance, and message forgery protection
> >  2.
> >
> >  DTLS is a standard
> >  3.
> >
> >  There is some support for DTLS in the JDK (JEP-219
> >   delivered in JDK 9). It’s not a
> >  complete implementation e.g. guaranteed delivery is a
> do-it-yourself kit.
> >
> >
> > Actually implementing DTLS is out of scope for this proposal. Adding DTLS
> > would be a significant undertaking.
> >
> > So, how do you feel about me making a GEODE ticket to deprecate the
> > SECURITY_UDP_DHALGO configuration property?
>
>


Re: Let's Deprecate the SECURITY_UDP_DHALGO Configuration Property

2020-02-28 Thread Bruce Schuchardt
+1

On 2/28/20, 11:43 AM, "Bill Burcham"  wrote:

I propose we deprecate Geode’s proprietary UDP message privacy algorithm
based on the Diffie-Hellman key exchange protocol. This would deprecate:

ConfigurationProperties.SECURITY_UDP_DHALGO

String DistributionConfig.getSecurityUDPDHAlgo()

void DistributionConfig.setSecurityUDPDHAlgo(String attValue)
DistributionConfig.SECURITY_UDP_DHALGO_NAME

Additionally we’d have to upate documentation to reflect deprecation.

From ConfigurationProperties.java:


Application can set this property to valid symmetric key algorithm, to
encrypt udp messages in Geode. Geode will generate symmetric key using
Diffie-Hellman key exchange algorithm between peers. That key further used
by specified algorithm to encrypt the udp messages.

The property (and the feature) was added mid-2016. Unfortunately it was not
added as an “experimental” feature, so it cannot simply be removed.

Incidentally, the corresponding property for client-server communication,
SECURITY_CLIENT_DHALGO, is already deprecated. It was deprecated in Geode
1.5 in favor of SSL/TLS.

I am proposing deprecating the feature because:


   1.

   The feature has not proven popular with users.
   2.

   At least one hard-to-reproduce bug exists in the implementation:
   GEODE-6448 
.
 We’ve
   burned a person-week trying to fix the problem (Bruce Schuchardt and me)
   and it’s not clear how much more time it will take. If we decide to
   deprecate the feature, fixing this problem would be de-prioritized
   accordingly.
   3.

   If we decide, in the future, that UDP message security is required, it
   would be better to implement a standard algorithm such as DTLS
   
:
   1.

  Our algorithm provides only message privacy whereas DTLS provides
  privacy, tamper-resistance, and message forgery protection
  2.

  DTLS is a standard
  3.

  There is some support for DTLS in the JDK (JEP-219
  

 delivered in JDK 9). It’s not a
  complete implementation e.g. guaranteed delivery is a do-it-yourself 
kit.


Actually implementing DTLS is out of scope for this proposal. Adding DTLS
would be a significant undertaking.

So, how do you feel about me making a GEODE ticket to deprecate the
SECURITY_UDP_DHALGO configuration property?




Re: Let's Deprecate the SECURITY_UDP_DHALGO Configuration Property

2020-02-28 Thread Udo Kohlmeyer
+1 to adding the deprecation. But I would prefer that we give users more 
notice than 3 months.


maybe we deprecate it in 1.14

--Udo

On 2/28/20 1:00 PM, Ernest Burghardt wrote:

+1

seems reasonable to do this for 1.12 and be ahead of the game, @Owen would
you please spawn that as a separate release 1.12 thread? thanks, eB

On Fri, Feb 28, 2020 at 11:56 AM Owen Nichols  wrote:


+1

We should have done this as soon as SSL/TLS was ready.  Better late than
never!

While we normally wait until a major release to remove deprecated stuff,
there is some precedent for removing insecure encryption algorithms sooner
(Java has done this even in patch releases).  We should consider getting
the deprecation notice into 1.12 and removing in 1.13, rather than waiting
for 2.0.


On Feb 28, 2020, at 11:42 AM, Bill Burcham 

wrote:

I propose we deprecate Geode’s proprietary UDP message privacy algorithm
based on the Diffie-Hellman key exchange protocol. This would deprecate:

ConfigurationProperties.SECURITY_UDP_DHALGO

String DistributionConfig.getSecurityUDPDHAlgo()

void DistributionConfig.setSecurityUDPDHAlgo(String attValue)
DistributionConfig.SECURITY_UDP_DHALGO_NAME

Additionally we’d have to upate documentation to reflect deprecation.

 From ConfigurationProperties.java:


Application can set this property to valid symmetric key algorithm, to
encrypt udp messages in Geode. Geode will generate symmetric key using
Diffie-Hellman key exchange algorithm between peers. That key further

used

by specified algorithm to encrypt the udp messages.

The property (and the feature) was added mid-2016. Unfortunately it was

not

added as an “experimental” feature, so it cannot simply be removed.

Incidentally, the corresponding property for client-server communication,
SECURITY_CLIENT_DHALGO, is already deprecated. It was deprecated in Geode
1.5 in favor of SSL/TLS.

I am proposing deprecating the feature because:


   1.

   The feature has not proven popular with users.
   2.

   At least one hard-to-reproduce bug exists in the implementation:
   GEODE-6448 . We’ve
   burned a person-week trying to fix the problem (Bruce Schuchardt and

me)

   and it’s not clear how much more time it will take. If we decide to
   deprecate the feature, fixing this problem would be de-prioritized
   accordingly.
   3.

   If we decide, in the future, that UDP message security is required, it
   would be better to implement a standard algorithm such as DTLS
   :
   1.

  Our algorithm provides only message privacy whereas DTLS provides
  privacy, tamper-resistance, and message forgery protection
  2.

  DTLS is a standard
  3.

  There is some support for DTLS in the JDK (JEP-219
   delivered in JDK 9). It’s not a
  complete implementation e.g. guaranteed delivery is a

do-it-yourself kit.


Actually implementing DTLS is out of scope for this proposal. Adding DTLS
would be a significant undertaking.

So, how do you feel about me making a GEODE ticket to deprecate the
SECURITY_UDP_DHALGO configuration property?




Re: [PROPOSAL] include GEODE-7829 in 1.12

2020-02-28 Thread Ernest Burghardt
bumping this out to DevList as something went wrong in the mail system...

On 2020/02/28 19:27:42, Bruce Schuchardt  wrote: 
> During a refactor the default ack-wait-threshold was changed from 15 to 51.  
> This will affect any deployment that doesn’t set its own threshold.
> 
> The ack-wait-threshold determines how long we wait for a response to a 
> message, such as replication of a put(), before taking some action.
> 


test message - please ignore

2020-02-28 Thread Bruce Schuchardt
I’m switching to a new email account.


Re: test message - please ignore

2020-02-28 Thread Ernest Burghardt
got it... domain still looks weird

On Fri, Feb 28, 2020 at 1:10 PM Bruce Schuchardt 
wrote:

> I’m switching to a new email account.
>


Re: Let's Deprecate the SECURITY_UDP_DHALGO Configuration Property

2020-02-28 Thread Owen Nichols
Udo, are you proposing to deprecate in 1.12 and remove in 1.14?

> On Feb 28, 2020, at 1:03 PM, Udo Kohlmeyer  wrote:
> 
> +1 to adding the deprecation. But I would prefer that we give users more 
> notice than 3 months.
> 
> maybe we deprecate it in 1.14
> 
> --Udo
> 
> On 2/28/20 1:00 PM, Ernest Burghardt wrote:
>> +1
>> 
>> seems reasonable to do this for 1.12 and be ahead of the game, @Owen would
>> you please spawn that as a separate release 1.12 thread? thanks, eB
>> 
>> On Fri, Feb 28, 2020 at 11:56 AM Owen Nichols  wrote:
>> 
>>> +1
>>> 
>>> We should have done this as soon as SSL/TLS was ready.  Better late than
>>> never!
>>> 
>>> While we normally wait until a major release to remove deprecated stuff,
>>> there is some precedent for removing insecure encryption algorithms sooner
>>> (Java has done this even in patch releases).  We should consider getting
>>> the deprecation notice into 1.12 and removing in 1.13, rather than waiting
>>> for 2.0.
>>> 
 On Feb 28, 2020, at 11:42 AM, Bill Burcham 
>>> wrote:
 I propose we deprecate Geode’s proprietary UDP message privacy algorithm
 based on the Diffie-Hellman key exchange protocol. This would deprecate:
 
 ConfigurationProperties.SECURITY_UDP_DHALGO
 
 String DistributionConfig.getSecurityUDPDHAlgo()
 
 void DistributionConfig.setSecurityUDPDHAlgo(String attValue)
 DistributionConfig.SECURITY_UDP_DHALGO_NAME
 
 Additionally we’d have to upate documentation to reflect deprecation.
 
 From ConfigurationProperties.java:
 
 
 Application can set this property to valid symmetric key algorithm, to
 encrypt udp messages in Geode. Geode will generate symmetric key using
 Diffie-Hellman key exchange algorithm between peers. That key further
>>> used
 by specified algorithm to encrypt the udp messages.
 
 The property (and the feature) was added mid-2016. Unfortunately it was
>>> not
 added as an “experimental” feature, so it cannot simply be removed.
 
 Incidentally, the corresponding property for client-server communication,
 SECURITY_CLIENT_DHALGO, is already deprecated. It was deprecated in Geode
 1.5 in favor of SSL/TLS.
 
 I am proposing deprecating the feature because:
 
 
   1.
 
   The feature has not proven popular with users.
   2.
 
   At least one hard-to-reproduce bug exists in the implementation:
   GEODE-6448 . We’ve
   burned a person-week trying to fix the problem (Bruce Schuchardt and
>>> me)
   and it’s not clear how much more time it will take. If we decide to
   deprecate the feature, fixing this problem would be de-prioritized
   accordingly.
   3.
 
   If we decide, in the future, that UDP message security is required, it
   would be better to implement a standard algorithm such as DTLS
   :
   1.
 
  Our algorithm provides only message privacy whereas DTLS provides
  privacy, tamper-resistance, and message forgery protection
  2.
 
  DTLS is a standard
  3.
 
  There is some support for DTLS in the JDK (JEP-219
   delivered in JDK 9). It’s not a
  complete implementation e.g. guaranteed delivery is a
>>> do-it-yourself kit.
 
 Actually implementing DTLS is out of scope for this proposal. Adding DTLS
 would be a significant undertaking.
 
 So, how do you feel about me making a GEODE ticket to deprecate the
 SECURITY_UDP_DHALGO configuration property?
>>> 



[PROPOSAL] include GEODE-7829 in 1.12

2020-02-28 Thread Ernest Burghardt
Hello devs,


I'd like to include the fix for GEODE-782 [1] in release 1.12.0.

During a refactor the default ack-wait-threshold was changed from 15
to 51.  This will affect any deployment that doesn’t set its own
threshold.

The ack-wait-threshold determines how long we wait for a response to a
message, such as replication of a put(), before taking some action.
(see the  Geode tickets for further details).

The SHA is dfa3326e674278f38673c9508a13af752ae90e28 [2].


Best regards.


[1]: https://issues.apache.org/jira/browse/GEODE-7829

[2]:
https://github.com/apache/geode/commit/dfa3326e674278f38673c9508a13af752ae90e28


Re: [PROPOSAL] include GEODE-7829 in 1.12

2020-02-28 Thread Owen Nichols
+1 for bringing this fix to 1.12 (assuming the change was unintentional?)

Any ideas how much performance improvement this is anticipated to restore?

> On Feb 28, 2020, at 1:06 PM, Ernest Burghardt  wrote:
> 
> bumping this out to DevList as something went wrong in the mail system...
> 
> On 2020/02/28 19:27:42, Bruce Schuchardt  wrote: 
>> During a refactor the default ack-wait-threshold was changed from 15 to 51.  
>> This will affect any deployment that doesn’t set its own threshold.
>> 
>> The ack-wait-threshold determines how long we wait for a response to a 
>> message, such as replication of a put(), before taking some action.
>> 



Re: [PROPOSAL] include GEODE-7829 in 1.12

2020-02-28 Thread Ernest Burghardt
yes, it was just a transposition/typo when the defaults were
refactored/moved

On Fri, Feb 28, 2020 at 1:14 PM Owen Nichols  wrote:

> +1 for bringing this fix to 1.12 (assuming the change was unintentional?)
>
> Any ideas how much performance improvement this is anticipated to restore?
>
> > On Feb 28, 2020, at 1:06 PM, Ernest Burghardt 
> wrote:
> >
> > bumping this out to DevList as something went wrong in the mail system...
> >
> > On 2020/02/28 19:27:42, Bruce Schuchardt 
> wrote:
> >> During a refactor the default ack-wait-threshold was changed from 15 to
> 51.  This will affect any deployment that doesn’t set its own threshold.
> >>
> >> The ack-wait-threshold determines how long we wait for a response to a
> message, such as replication of a put(), before taking some action.
> >>
>
>


[IGNORE] another test message - sorry for the spam

2020-02-28 Thread Bruce Schuchardt



Re: [PROPOSAL] include GEODE-7829 in 1.12

2020-02-28 Thread Donal Evans
+1

On Fri, Feb 28, 2020 at 1:15 PM Ernest Burghardt 
wrote:

> yes, it was just a transposition/typo when the defaults were
> refactored/moved
>
> On Fri, Feb 28, 2020 at 1:14 PM Owen Nichols  wrote:
>
> > +1 for bringing this fix to 1.12 (assuming the change was unintentional?)
> >
> > Any ideas how much performance improvement this is anticipated to
> restore?
> >
> > > On Feb 28, 2020, at 1:06 PM, Ernest Burghardt 
> > wrote:
> > >
> > > bumping this out to DevList as something went wrong in the mail
> system...
> > >
> > > On 2020/02/28 19:27:42, Bruce Schuchardt 
> > wrote:
> > >> During a refactor the default ack-wait-threshold was changed from 15
> to
> > 51.  This will affect any deployment that doesn’t set its own threshold.
> > >>
> > >> The ack-wait-threshold determines how long we wait for a response to a
> > message, such as replication of a put(), before taking some action.
> > >>
> >
> >
>


Re: Contributor Access Request

2020-02-28 Thread Anthony Baker
Welcome Derrick!  Please shout out if you need any help.

Anthony


> On Feb 28, 2020, at 12:53 PM, Owen Nichols  wrote:
> 
> Welcome Derrick, you should have access to Geode Jira now
> 
> https://issues.apache.org/jira/projects/GEODE/issues
> 
> 
>> On Feb 26, 2020, at 8:41 AM, Derrick Anderson  
>> wrote:
>> 
>> Hi!
>> 
>> I am planning on contributing and would like contributor access in Jira.
>> 
>> My username is deanderson
>> 
>> Thank you so much for your help
> 



Re: [PROPOSAL] include GEODE-7829 in 1.12

2020-02-28 Thread Ernest Burghardt
There appears to be consensus that this is a critical fix.

The following commit has been brought into release/1.12 as the
critical fix for GEODE-7829:

git cherry-pick -x dfa3326e674278f38673c9508a13af752ae90e28

GEODE-7829 has been marked as 'resolved in' 1.12.

Regards
EB



On Fri, Feb 28, 2020 at 1:20 PM Donal Evans  wrote:

> +1
>
> On Fri, Feb 28, 2020 at 1:15 PM Ernest Burghardt 
> wrote:
>
> > yes, it was just a transposition/typo when the defaults were
> > refactored/moved
> >
> > On Fri, Feb 28, 2020 at 1:14 PM Owen Nichols 
> wrote:
> >
> > > +1 for bringing this fix to 1.12 (assuming the change was
> unintentional?)
> > >
> > > Any ideas how much performance improvement this is anticipated to
> > restore?
> > >
> > > > On Feb 28, 2020, at 1:06 PM, Ernest Burghardt 
> > > wrote:
> > > >
> > > > bumping this out to DevList as something went wrong in the mail
> > system...
> > > >
> > > > On 2020/02/28 19:27:42, Bruce Schuchardt 
> > > wrote:
> > > >> During a refactor the default ack-wait-threshold was changed from 15
> > to
> > > 51.  This will affect any deployment that doesn’t set its own
> threshold.
> > > >>
> > > >> The ack-wait-threshold determines how long we wait for a response
> to a
> > > message, such as replication of a put(), before taking some action.
> > > >>
> > >
> > >
> >
>


Re: Discussions about concerns over User API changes

2020-02-28 Thread Kirk Lund
Ok, thanks! I've submitted https://github.com/apache/geode/pull/4748 to
revert the change. I'll be offline for a while so feel free to merge it if
it passes.

Thanks,
Kirk

On Thu, Feb 27, 2020 at 5:19 PM Owen Nichols  wrote:

> While changing a void method to have a return type does not break source
> compatibility, it appears likely to break binary compatibility[1].
>
> So, if you are compiling your client from source, it will compile
> successfully against either Geode 1.12 or Geode 1.13.  But if your client
> was already compiled [against Geode 1.12] and then you upgrade to Geode
> 1.13, without recompiling your client, I your client will throw
> MethodNotFoundException[2].
>
> [1]
> https://wiki.eclipse.org/Evolving_Java-based_APIs_2#Evolving_API_interfaces_-_API_methods
> [2]
> https://stackoverflow.com/questions/47476509/can-i-change-a-return-type-to-be-a-strict-subtype-and-retain-binary-compatibilit
>
> > On Feb 27, 2020, at 5:09 PM, Kirk Lund  wrote:
> >
> > Remember, if there are any concerns about recent backwards-compatible
> > changes to Geode user APIs, they should be brought on the dev list.
> >
> > Also, backward-compatible changes are by definition safe and ok for a
> user
> > API because it won't break the user's code.
> >
> > Here's an example of a user API that I recently fixed...
> >
> > The ClientRegionFactory is a builder with methods that are supposed to
> > follow fluent-style API (ie, return the ClientRegionFactory instance so
> you
> > can chain the calls).
> >
> > Whoever added setConcurrencyChecksEnabled goofed up and made the return
> > type void. Changing void to ClientRegionFactory is a fully backwards
> > compatible change which corrects the API. Since this fixes the API and
> > doesn't actually change the user API, this should be very safe and
> improves
> > Geode by correcting a broken API:
> >
> > *diff --git
> >
> a/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java
> >
> b/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java*
> >
> > *index add35f01a2..2a08307adc 100644*
> >
> > *---
> >
> a/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java*
> >
> > *+++
> >
> b/geode-core/src/main/java/org/apache/geode/cache/client/ClientRegionFactory.java*
> >
> > @@ -216,7 +216,7 @@ public interface ClientRegionFactory {
> >
> >* @since GemFire 7.0
> >
> >* @param concurrencyChecksEnabled whether to perform concurrency
> checks
> > on operations
> >
> >*/
> >
> > -  void setConcurrencyChecksEnabled(boolean concurrencyChecksEnabled);
> >
> > +  ClientRegionFactory setConcurrencyChecksEnabled(boolean
> > concurrencyChecksEnabled);
> >
> >
> >
> >   /**
> >
> >* Sets the DiskStore name attribute. This causes the region to belong
> > to the DiskStore.
> >
> > *diff --git
> >
> a/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java
> >
> b/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java*
> >
> > *index 64256e8f8e..920deba055 100644*
> >
> > *---
> >
> a/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java*
> >
> > *+++
> >
> b/geode-core/src/main/java/org/apache/geode/cache/client/internal/ClientRegionFactoryImpl.java*
> >
> > @@ -186,8 +186,9 @@ public class ClientRegionFactoryImpl implements
> > ClientRegionFactory
> >
> >   }
> >
> >
> >
> >   @Override
> >
> > -  public void setConcurrencyChecksEnabled(boolean
> > concurrencyChecksEnabled) {
> >
> > +  public ClientRegionFactory setConcurrencyChecksEnabled(boolean
> > concurrencyChecksEnabled) {
> >
> >
> > this.attrsFactory.setConcurrencyChecksEnabled(concurrencyChecksEnabled);
> >
> > +return this;
> >
> >   }
> >
> >
> >
> >   @Override
> >
> > Does anyone have concerns over fixing "void setConcurrencyChecksEnabled"
> > by changing it to "ClientRegionFactory setConcurrencyChecksEnabled"? If
> > there is a concern, is the concern about the change itself or because
> this
> > was fixed without following a more heavy-weight process?
> >
> > Thanks,
> > Kirk
>
>


Re: [PROPOSAL] eliminate file count loophole in PR StressNewTest

2020-02-28 Thread Donal Evans
+1

If the original reason for the 25 file limit was to prevent StressNewTest
taking too long, then the concourse timeout renders the restriction
redundant. Also, I'd think that if anything, the more files a PR changed,
the more interested we would be in running StressNewTest. Allowing it to
just be "green" when we get over 25 changed files seems counterintuitive
with that in mind.

On Fri, Feb 28, 2020 at 12:04 PM Owen Nichols  wrote:

> Currently, StressNewTest will turn green WITHOUT RUNNING ANY TESTS if it
> thinks (often wrongly) that more than 25 test files were touched.
>
> This is both dishonest (it can give a false green) AND totally unnecessary
> (concourse already applies a timeout to StressNewTest, currently set at 6
> hours, so there is no reason not to at least try to run all changed tests,
> they may easily complete in under 6 hours even if many files).
>
> *** I PROPOSE that we should remove this 25-file-free-pass loophole. ***
>
> As always, if anyone is having trouble getting their PR to pass StressNew,
> please bring it up on the dev list and we can discuss the appropriate
> solution on a case-by-case basis (e.g. increasing timeout, fixing the logic
> that identifies changes test files, splitting into multiple PRs, etc).
>
>


Re: Let's Deprecate the SECURITY_UDP_DHALGO Configuration Property

2020-02-28 Thread Udo Kohlmeyer

Yes, the proposal was deprecation notice in 1.12 and remove in 1.13..

That does not leave users much time to react. So if we remove in 1.14, 
then users have at least 2 release cycles before we do it.


--Udo

On 2/28/20 1:11 PM, Owen Nichols wrote:

Udo, are you proposing to deprecate in 1.12 and remove in 1.14?


On Feb 28, 2020, at 1:03 PM, Udo Kohlmeyer  wrote:

+1 to adding the deprecation. But I would prefer that we give users more notice 
than 3 months.

maybe we deprecate it in 1.14

--Udo

On 2/28/20 1:00 PM, Ernest Burghardt wrote:

+1

seems reasonable to do this for 1.12 and be ahead of the game, @Owen would
you please spawn that as a separate release 1.12 thread? thanks, eB

On Fri, Feb 28, 2020 at 11:56 AM Owen Nichols  wrote:


+1

We should have done this as soon as SSL/TLS was ready.  Better late than
never!

While we normally wait until a major release to remove deprecated stuff,
there is some precedent for removing insecure encryption algorithms sooner
(Java has done this even in patch releases).  We should consider getting
the deprecation notice into 1.12 and removing in 1.13, rather than waiting
for 2.0.


On Feb 28, 2020, at 11:42 AM, Bill Burcham 

wrote:

I propose we deprecate Geode’s proprietary UDP message privacy algorithm
based on the Diffie-Hellman key exchange protocol. This would deprecate:

ConfigurationProperties.SECURITY_UDP_DHALGO

String DistributionConfig.getSecurityUDPDHAlgo()

void DistributionConfig.setSecurityUDPDHAlgo(String attValue)
DistributionConfig.SECURITY_UDP_DHALGO_NAME

Additionally we’d have to upate documentation to reflect deprecation.

 From ConfigurationProperties.java:


Application can set this property to valid symmetric key algorithm, to
encrypt udp messages in Geode. Geode will generate symmetric key using
Diffie-Hellman key exchange algorithm between peers. That key further

used

by specified algorithm to encrypt the udp messages.

The property (and the feature) was added mid-2016. Unfortunately it was

not

added as an “experimental” feature, so it cannot simply be removed.

Incidentally, the corresponding property for client-server communication,
SECURITY_CLIENT_DHALGO, is already deprecated. It was deprecated in Geode
1.5 in favor of SSL/TLS.

I am proposing deprecating the feature because:


   1.

   The feature has not proven popular with users.
   2.

   At least one hard-to-reproduce bug exists in the implementation:
   GEODE-6448 . We’ve
   burned a person-week trying to fix the problem (Bruce Schuchardt and

me)

   and it’s not clear how much more time it will take. If we decide to
   deprecate the feature, fixing this problem would be de-prioritized
   accordingly.
   3.

   If we decide, in the future, that UDP message security is required, it
   would be better to implement a standard algorithm such as DTLS
   :
   1.

  Our algorithm provides only message privacy whereas DTLS provides
  privacy, tamper-resistance, and message forgery protection
  2.

  DTLS is a standard
  3.

  There is some support for DTLS in the JDK (JEP-219
   delivered in JDK 9). It’s not a
  complete implementation e.g. guaranteed delivery is a

do-it-yourself kit.

Actually implementing DTLS is out of scope for this proposal. Adding DTLS
would be a significant undertaking.

So, how do you feel about me making a GEODE ticket to deprecate the
SECURITY_UDP_DHALGO configuration property?