[jira] [Commented] (GEODE-8655) Not handling exception on SNIHostName

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225992#comment-17225992
 ] 

ASF GitHub Bot commented on GEODE-8655:
---

mkevo commented on pull request #5669:
URL: https://github.com/apache/geode/pull/5669#issuecomment-721667656


   Hi @bschuchardt,
   
   I think this will not work on your laptop either; you just need to 
add an IPv6 address to your machine and follow the steps described in the ticket.
   For now, I think it would be good to check whether IPv6 is used and, if so, 
ignore it, until someone makes the larger change in gfsh/LocatorLauncher you mentioned.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Not handling exception on SNIHostName
> -
>
> Key: GEODE-8655
> URL: https://issues.apache.org/jira/browse/GEODE-8655
> Project: Geode
>  Issue Type: Bug
>  Components: locator, security
>Affects Versions: 1.13.0
>Reporter: Mario Kevo
>Assignee: Mario Kevo
>Priority: Major
>  Labels: pull-request-available
>
> If we start the locator with IPv6 and TLS enabled, we get the following error 
> from the status locator command:
>   
> {quote}mkevo@mkevo-XPS-15-9570:~/apache-geode-1.13.0/bin/locator$ 
> _/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -server -classpath 
> /home/mkevo/apache-geode-1.13.0/lib/geode-core-1.13.0.jar:/home/mkevo/apache-geode-1.13.0/lib/geode-dependencies.jar
>  -Djava.net.preferIPv6Addresses=true 
> -DgemfireSecurityPropertyFile=/home/mkevo/geode-examples/clientSecurity/example_security.properties
>  -Dgemfire.enable-cluster-configuration=true 
> -Dgemfire.load-cluster-configuration-from-dir=false 
> -Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true 
> -Dsun.rmi.dgc.server.gcInterval=9223372036854775806 
> org.apache.geode.distributed.LocatorLauncher start locator --port=10334_
>   
>  gfsh>_status locator --dir=/home/mkevo/apache-geode-1.13.0/bin/locator 
> --security-properties-file=/home/mkevo/geode-examples/clientSecurity/example_security.properties_
>  *Locator in /home/mkevo/apache-geode-1.13.0/bin/locator on 
> mkevo-XPS-15-9570[10334] is currently not responding.*
> {quote}
>  
>  From locator logs we found only this:
> {quote}Exception in processing request from fe80:0:0:0:f83e:ce0f:5143:f9ee%2: 
> Read timed out
> {quote}
>  
>  After adding some logs we found the following:
> {quote}{color:#1d1c1d}TcpClient.stop(): exception connecting to locator 
> HostAndPort[/0:0:0:0:0:0:0:0:10334]java.lang.IllegalArgumentException: 
> Contains non-LDH ASCII characters{color}
> {quote}
>
>  It fails when creating an SNIHostName from the hostName (_setServerNames_ in 
> SocketCreator.java) because the above exception is not handled.
>   
>  
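The failure described above can be sketched in isolation. The helper below is hypothetical (it is not Geode's actual SocketCreator code): it shows that `new SNIHostName(...)` rejects IPv6 literals with `IllegalArgumentException`, so a guard around SNI setup lets the connection proceed without SNI instead of failing.

```java
import java.util.Collections;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SSLParameters;

public class SniGuardSketch {
  // Hypothetical guard: SNIHostName only accepts LDH (letter-digit-hyphen)
  // hostnames, so an IPv6 literal such as "0:0:0:0:0:0:0:0" throws
  // IllegalArgumentException ("Contains non-LDH ASCII characters").
  public static void setServerNames(SSLParameters params, String hostName) {
    try {
      params.setServerNames(Collections.singletonList(new SNIHostName(hostName)));
    } catch (IllegalArgumentException e) {
      // IP-address literal: proceed without SNI instead of failing the connection
    }
  }
}
```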



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-8547) Command "show missing-disk-stores" not working, when all servers are down

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226017#comment-17226017
 ] 

ASF GitHub Bot commented on GEODE-8547:
---

mivanac commented on pull request #5567:
URL: https://github.com/apache/geode/pull/5567#issuecomment-721699428


   Thanks for the comments.





> Command "show missing-disk-stores" not working, when all servers are down
> -
>
> Key: GEODE-8547
> URL: https://issues.apache.org/jira/browse/GEODE-8547
> Project: Geode
>  Issue Type: Bug
>  Components: gfsh
>Reporter: Mario Ivanac
>Assignee: Mario Ivanac
>Priority: Major
>  Labels: needs-review, pull-request-available
>
> If a cluster with 2 locators and 2 servers was shut down ungracefully, it can 
> happen that the locators that are able to start up do not have the most recent 
> data to bring up the Cluster Configuration Service.
> If we execute the command "show missing-disk-stores", it will not work, since all 
> servers are down,
> so we are stuck in this situation.





[jira] [Commented] (GEODE-8547) Command "show missing-disk-stores" not working, when all servers are down

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226095#comment-17226095
 ] 

ASF GitHub Bot commented on GEODE-8547:
---

jinmeiliao commented on a change in pull request #5567:
URL: https://github.com/apache/geode/pull/5567#discussion_r517420791



##
File path: 
geode-gfsh/src/main/java/org/apache/geode/management/internal/cli/commands/ShowMissingDiskStoreCommand.java
##
@@ -95,7 +96,8 @@ private ResultModel toMissingDiskStoresTabularResult(
 ResultModel result = new ResultModel();
 
 boolean hasMissingDiskStores = missingDiskStores.length != 0;
-boolean hasMissingColocatedRegions = !missingColocatedRegions.isEmpty();
+boolean hasMissingColocatedRegions =

Review comment:
   this variable is not used anymore.

##
File path: 
geode-gfsh/src/main/java/org/apache/geode/management/internal/cli/commands/ShowMissingDiskStoreCommand.java
##
@@ -95,7 +96,8 @@ private ResultModel toMissingDiskStoresTabularResult(
 ResultModel result = new ResultModel();
 
 boolean hasMissingDiskStores = missingDiskStores.length != 0;

Review comment:
   actually you can get rid of this variable and inline this too, to make 
it symmetric.
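A hypothetical sketch of what the reviewer is suggesting (the names are illustrative, not the actual ShowMissingDiskStoreCommand code): drop the single-use locals and inline both emptiness checks so the two branches read symmetrically.

```java
import java.util.List;

public class MissingStoresSketch {
  // Both conditions inlined, mirroring each other, instead of being held
  // in single-use local variables.
  public static String summarize(String[] missingDiskStores,
      List<String> missingColocatedRegions) {
    StringBuilder sb = new StringBuilder();
    if (missingDiskStores.length != 0) {
      sb.append(missingDiskStores.length).append(" missing disk store(s)");
    }
    if (!missingColocatedRegions.isEmpty()) {
      if (sb.length() > 0) sb.append("; ");
      sb.append(missingColocatedRegions.size()).append(" missing colocated region(s)");
    }
    return sb.length() == 0 ? "none" : sb.toString();
  }
}
```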







> Command "show missing-disk-stores" not working, when all servers are down
> -
>
> Key: GEODE-8547
> URL: https://issues.apache.org/jira/browse/GEODE-8547
> Project: Geode
>  Issue Type: Bug
>  Components: gfsh
>Reporter: Mario Ivanac
>Assignee: Mario Ivanac
>Priority: Major
>  Labels: needs-review, pull-request-available
>
> If a cluster with 2 locators and 2 servers was shut down ungracefully, it can 
> happen that the locators that are able to start up do not have the most recent 
> data to bring up the Cluster Configuration Service.
> If we execute the command "show missing-disk-stores", it will not work, since all 
> servers are down,
> so we are stuck in this situation.





[jira] [Updated] (GEODE-8683) maximum-time-between-pings parameter in GatewayReceiver creation does not have any effect

2020-11-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-8683:
--
Labels: pull-request-available  (was: )

> maximum-time-between-pings parameter in GatewayReceiver creation does not 
> have any effect
> -
>
> Key: GEODE-8683
> URL: https://issues.apache.org/jira/browse/GEODE-8683
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: Alberto Gomez
>Assignee: Alberto Gomez
>Priority: Major
>  Labels: pull-request-available
>
> The maximum-time-between-pings parameter that can be set at gateway receiver 
> creation has no effect, i.e. the value used as the maximum time between pings for 
> gateway sender connections to the gateway receiver is either the default 
> value (6) or the one set on the server where the receiver is running.
> The reason is that the ClientHealthMonitor is a server-side singleton that 
> monitors the health of all clients. The value set for this parameter in the 
> ClientHealthMonitor is first set when the server is started and the first 
> Acceptor is created.
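The singleton limitation described above can be illustrated with a small, hypothetical model (not Geode's actual ClientHealthMonitor API): a per-client timeout map that falls back to the server-wide default, which is the direction the fix takes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerClientPingMonitor {
  private final int defaultMaxTimeBetweenPingsMillis;
  // Per-client override of the maximum time between pings; absent entries
  // fall back to the server-wide default set at Acceptor creation.
  private final Map<String, Integer> perClientTimeout = new ConcurrentHashMap<>();

  public PerClientPingMonitor(int defaultMaxTimeBetweenPingsMillis) {
    this.defaultMaxTimeBetweenPingsMillis = defaultMaxTimeBetweenPingsMillis;
  }

  public void registerClient(String clientId, int maxTimeBetweenPingsMillis) {
    perClientTimeout.put(clientId, maxTimeBetweenPingsMillis);
  }

  public boolean isStale(String clientId, long millisSinceLastPing) {
    int timeout = perClientTimeout.getOrDefault(clientId, defaultMaxTimeBetweenPingsMillis);
    return millisSinceLastPing > timeout;
  }
}
```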



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (GEODE-8683) maximum-time-between-pings parameter in GatewayReceiver creation does not have any effect

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226246#comment-17226246
 ] 

ASF GitHub Bot commented on GEODE-8683:
---

albertogpz opened a new pull request #5701:
URL: https://github.com/apache/geode/pull/5701


   The maximum-time-between-pings set when creating a gateway receiver
   was not honored because the ClientHealthMonitor, the singleton
   class that monitors all clients, supported just one value for the
   maximum time between pings for all clients. That value is set when
   the server in which the receiver is running is started, and a
   different value provided by the gateway receiver is ignored.
   
   With this fix, different clients can have different values for
   maximum-time-between-pings.
   
   Thank you for submitting a contribution to Apache Geode.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [X] Is there a JIRA ticket associated with this PR? Is it referenced in 
the commit message?
   
   - [X] Has your PR been rebased against the latest commit within the target 
branch (typically `develop`)?
   
   - [X] Is your initial contribution a single, squashed commit?
   
   - [X] Does `gradlew build` run cleanly?
   
   - [X] Have you written or updated unit tests to verify your changes?
   
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   ### Note:
   Please ensure that once the PR is submitted, check Concourse for build 
issues and
   submit an update to your PR as soon as possible. If you need help, please 
send an
   email to d...@geode.apache.org.
   





> maximum-time-between-pings parameter in GatewayReceiver creation does not 
> have any effect
> -
>
> Key: GEODE-8683
> URL: https://issues.apache.org/jira/browse/GEODE-8683
> Project: Geode
>  Issue Type: Bug
>  Components: wan
>Reporter: Alberto Gomez
>Assignee: Alberto Gomez
>Priority: Major
>
> The maximum-time-between-pings parameter that can be set at gateway receiver 
> creation has no effect, i.e. the value used as the maximum time between pings for 
> gateway sender connections to the gateway receiver is either the default 
> value (6) or the one set on the server where the receiver is running.
> The reason is that the ClientHealthMonitor is a server-side singleton that 
> monitors the health of all clients. The value set for this parameter in the 
> ClientHealthMonitor is first set when the server is started and the first 
> Acceptor is created.





[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226272#comment-17226272
 ] 

ASF GitHub Bot commented on GEODE-8647:
---

pdxcodemonkey commented on a change in pull request #682:
URL: https://github.com/apache/geode-native/pull/682#discussion_r517490303



##
File path: clicache/src/DataInput.hpp
##
@@ -663,7 +664,7 @@ namespace Apache
   m_buffer = 
const_cast(nativeptr->currentBufferPosition());
   if ( m_buffer != NULL) {
 m_bufferLength = 
static_cast(nativeptr->getBytesRemaining());
-   }
+  }

Review comment:
   Appears to be okay now.







> Support using multiple DistributedMap Rules in one test
> ---
>
> Key: GEODE-8647
> URL: https://issues.apache.org/jira/browse/GEODE-8647
> Project: Geode
>  Issue Type: Wish
>  Components: tests
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>  Labels: pull-request-available
>
> Support using multiple DistributedMap Rules in one test. Right now the Rule 
> only supports having one instance in a test.





[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226273#comment-17226273
 ] 

ASF GitHub Bot commented on GEODE-8647:
---

pdxcodemonkey commented on a change in pull request #682:
URL: https://github.com/apache/geode-native/pull/682#discussion_r517493689



##
File path: clicache/src/DataInput.cpp
##
@@ -878,8 +877,6 @@ namespace Apache
 
   void DataInput::Cleanup()
   {

Review comment:
   Done







> Support using multiple DistributedMap Rules in one test
> ---
>
> Key: GEODE-8647
> URL: https://issues.apache.org/jira/browse/GEODE-8647
> Project: Geode
>  Issue Type: Wish
>  Components: tests
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>  Labels: pull-request-available
>
> Support using multiple DistributedMap Rules in one test. Right now the Rule 
> only supports having one instance in a test.





[jira] [Commented] (GEODE-8681) peer-to-peer message loss due to sending connection closing with TLS enabled

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226297#comment-17226297
 ] 

ASF GitHub Bot commented on GEODE-8681:
---

bschuchardt merged pull request #5699:
URL: https://github.com/apache/geode/pull/5699


   





> peer-to-peer message loss due to sending connection closing with TLS enabled
> 
>
> Key: GEODE-8681
> URL: https://issues.apache.org/jira/browse/GEODE-8681
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Affects Versions: 1.10.0, 1.11.0, 1.12.0, 1.13.0
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available, release-blocker
>
> We have observed message loss when TLS is enabled and a distributed lock is 
> released right after sending a message that doesn't require acknowledgement 
> if the sending socket is immediately closed. The closing of sockets 
> immediately after sending a message is frequently seen in function execution 
> threads or server-side application threads that use this pattern:
> {code:java}
>  try {
> DistributedSystem.setThreadsSocketPolicy(false);
> acquireDistributedLock(lockName);
> (perform one or more cache operations)
>   } finally {
> distLockService.unlock(lockName);
> DistributedSystem.releaseThreadsSockets(); // closes the socket
>   }
> {code}
> The fault seems to be in NioSSLEngine.unwrap(), which throws an 
> SSLException() if it finds the SSLEngine is closed even though there is valid 
> data in its decrypt buffer.  It shouldn't throw an exception in that case.
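The described fault can be modeled with a hypothetical helper (not the actual NioSslEngine code): when the engine is closed, any already-decrypted bytes should still be returned to the dispatcher; only an empty decrypt buffer justifies the SSLException.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLException;

public class UnwrapSketch {
  // Returns the buffer of already-decrypted data; throws only when the
  // engine is closed AND nothing decrypted remains to be dispatched.
  public static ByteBuffer drainOrFail(boolean engineClosed, ByteBuffer decrypted)
      throws SSLException {
    if (engineClosed && decrypted.position() == 0) {
      throw new SSLException("SSLEngine is closed and no decrypted data remains");
    }
    return decrypted; // the dispatcher consumes messages one by one from here
  }
}
```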





[jira] [Commented] (GEODE-8681) peer-to-peer message loss due to sending connection closing with TLS enabled

2020-11-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226304#comment-17226304
 ] 

ASF subversion and git services commented on GEODE-8681:


Commit 7da8f9b516ac1e2525a1dfc922af7bfb8995f2c6 in geode's branch 
refs/heads/develop from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=7da8f9b ]

GEODE-8681: peer-to-peer message loss due to sending connection closing with 
TLS enabled (#5699)

A socket-read could pick up more than one message and a single unwrap()
could decrypt multiple messages.
Normally the engine isn't closed and it reports normal
status from an unwrap() operation, and Connection.processInputBuffer
picks up each message, one by one, from the buffer and dispatches them.
But if the SSLEngine is closed we were ignoring any already-decrypted
data sitting in the unwrapped buffer and instead we were throwing an 
SSLException.

> peer-to-peer message loss due to sending connection closing with TLS enabled
> 
>
> Key: GEODE-8681
> URL: https://issues.apache.org/jira/browse/GEODE-8681
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Affects Versions: 1.10.0, 1.11.0, 1.12.0, 1.13.0
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available, release-blocker
>
> We have observed message loss when TLS is enabled and a distributed lock is 
> released right after sending a message that doesn't require acknowledgement 
> if the sending socket is immediately closed. The closing of sockets 
> immediately after sending a message is frequently seen in function execution 
> threads or server-side application threads that use this pattern:
> {code:java}
>  try {
> DistributedSystem.setThreadsSocketPolicy(false);
> acquireDistributedLock(lockName);
> (perform one or more cache operations)
>   } finally {
> distLockService.unlock(lockName);
> DistributedSystem.releaseThreadsSockets(); // closes the socket
>   }
> {code}
> The fault seems to be in NioSSLEngine.unwrap(), which throws an 
> SSLException() if it finds the SSLEngine is closed even though there is valid 
> data in its decrypt buffer.  It shouldn't throw an exception in that case.





[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226308#comment-17226308
 ] 

ASF GitHub Bot commented on GEODE-8647:
---

lgtm-com[bot] commented on pull request #682:
URL: https://github.com/apache/geode-native/pull/682#issuecomment-721882183


   This pull request **introduces 4 alerts** when merging 
daac2c91545b9c8cb10d729e741658eb463deac2 into 
0d9a99d5e0632de62df17921950cf3f6640efb33 - [view on 
LGTM.com](https://lgtm.com/projects/g/apache/geode-native/rev/pr-60291afa9c24d4b295d3b0886ebe8f339141f43c)
   
   **new alerts:**
   
   * 2 for Call to GC\.Collect\(\)
   * 2 for Useless assignment to local variable





> Support using multiple DistributedMap Rules in one test
> ---
>
> Key: GEODE-8647
> URL: https://issues.apache.org/jira/browse/GEODE-8647
> Project: Geode
>  Issue Type: Wish
>  Components: tests
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>  Labels: pull-request-available
>
> Support using multiple DistributedMap Rules in one test. Right now the Rule 
> only supports having one instance in a test.





[jira] [Commented] (GEODE-8546) Colocated regions missing some buckets after restart

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226309#comment-17226309
 ] 

ASF GitHub Bot commented on GEODE-8546:
---

DonalEvans commented on pull request #5590:
URL: https://github.com/apache/geode/pull/5590#issuecomment-721882594


   > @DonalEvans Do you have any capacity to assist with this?
   
   I emailed back and forth with Mario a bit trying to figure out a way to 
reproduce the issue in a DUnit test, but it's not an area of the code I'm 
particularly familiar with so my ability to help was fairly limited. I thought 
that if the assumption that the problem was caused by colocation taking too 
long was correct, then artificially slowing down colocation by using a listener 
to wait a little after each bucket is created could reproduce it.
   
   Since part of the proposed fix seems to be to wait a total of 9000ms for 
colocation to complete in `CreateMIssingBucketTask.java`, it seems like forcing 
colocation to take close to that long should be a sure-fire way to reproduce 
the issue, if the fix is effective. It seems like this wasn't the case though, 
so maybe my idea was off the mark.





> Colocated regions missing some buckets after restart
> 
>
> Key: GEODE-8546
> URL: https://issues.apache.org/jira/browse/GEODE-8546
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.11.0, 1.12.0, 1.13.0
>Reporter: Mario Kevo
>Assignee: Mario Kevo
>Priority: Major
>  Labels: pull-request-available
>
> After restarting all servers at the same time, some colocated regions are missing 
> some buckets.
> This issue has existed for a long time and became visible from 1.11.0 with the 
> changes introduced by GEODE-7042.
> How to reproduce the issue:
>  #  Start two locators and two servers
>  #  Create a PARTITION_REDUNDANT_PERSISTENT region with redundant-copies=1
>  #  Create a few PARTITION_REDUNDANT regions (I used six regions) colocated with 
> the persistent region and redundant-copies=1
>  #  Put some entries.
>  #  Restart the servers (you can simply run "kill -15 " and then from 
> two terminals start both of them at the same time)
>  #  It will take some time for server startup to finish, and for the latest 
> region the bucketCount will be lower than expected on one member




[jira] [Commented] (GEODE-8681) peer-to-peer message loss due to sending connection closing with TLS enabled

2020-11-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226310#comment-17226310
 ] 

ASF subversion and git services commented on GEODE-8681:


Commit 7da8f9b516ac1e2525a1dfc922af7bfb8995f2c6 in geode's branch 
refs/heads/develop from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=7da8f9b ]

GEODE-8681: peer-to-peer message loss due to sending connection closing with 
TLS enabled (#5699)

A socket-read could pick up more than one message and a single unwrap()
could decrypt multiple messages.
Normally the engine isn't closed and it reports normal
status from an unwrap() operation, and Connection.processInputBuffer
picks up each message, one by one, from the buffer and dispatches them.
But if the SSLEngine is closed we were ignoring any already-decrypted
data sitting in the unwrapped buffer and instead we were throwing an 
SSLException.

> peer-to-peer message loss due to sending connection closing with TLS enabled
> 
>
> Key: GEODE-8681
> URL: https://issues.apache.org/jira/browse/GEODE-8681
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Affects Versions: 1.10.0, 1.11.0, 1.12.0, 1.13.0
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available, release-blocker
>
> We have observed message loss when TLS is enabled and a distributed lock is 
> released right after sending a message that doesn't require acknowledgement 
> if the sending socket is immediately closed. The closing of sockets 
> immediately after sending a message is frequently seen in function execution 
> threads or server-side application threads that use this pattern:
> {code:java}
>  try {
> DistributedSystem.setThreadsSocketPolicy(false);
> acquireDistributedLock(lockName);
> (perform one or more cache operations)
>   } finally {
> distLockService.unlock(lockName);
> DistributedSystem.releaseThreadsSockets(); // closes the socket
>   }
> {code}
> The fault seems to be in NioSSLEngine.unwrap(), which throws an 
> SSLException() if it finds the SSLEngine is closed even though there is valid 
> data in its decrypt buffer.  It shouldn't throw an exception in that case.





[jira] [Created] (GEODE-8685) Exporting data causes a ClassNotFoundException

2020-11-04 Thread Anthony Baker (Jira)
Anthony Baker created GEODE-8685:


 Summary: Exporting data causes a ClassNotFoundException
 Key: GEODE-8685
 URL: https://issues.apache.org/jira/browse/GEODE-8685
 Project: Geode
  Issue Type: Task
  Components: regions
Affects Versions: 1.13.0
Reporter: Anthony Baker


See 
[https://lists.apache.org/thread.html/rfa4fc47eb4cb4e75c39d7cb815416bebf2ec233d4db24e37728e922e%40%3Cuser.geode.apache.org%3E.]

 

The report is that exporting data whose values are classes defined in a deployed 
jar results in a ClassNotFoundException:

{noformat}
[error 2020/10/30 08:54:29.317 PDT  tid=0x41] 
org.apache.geode.cache.execute.FunctionException: 
org.apache.geode.SerializationException: A ClassNotFoundException was thrown 
while trying to deserialize cached value.
java.io.IOException: org.apache.geode.cache.execute.FunctionException: 
org.apache.geode.SerializationException: A ClassNotFoundException was thrown 
while trying to deserialize cached value.
at 
org.apache.geode.internal.cache.snapshot.WindowedExporter.export(WindowedExporter.java:106)
at 
org.apache.geode.internal.cache.snapshot.RegionSnapshotServiceImpl.exportOnMember(RegionSnapshotServiceImpl.java:361)
at 
org.apache.geode.internal.cache.snapshot.RegionSnapshotServiceImpl.save(RegionSnapshotServiceImpl.java:161)
at 
org.apache.geode.internal.cache.snapshot.RegionSnapshotServiceImpl.save(RegionSnapshotServiceImpl.java:146)
at 
org.apache.geode.management.internal.cli.functions.ExportDataFunction.executeFunction(ExportDataFunction.java:62)
at 
org.apache.geode.management.cli.CliFunction.execute(CliFunction.java:37)
at 
org.apache.geode.internal.cache.MemberFunctionStreamingMessage.process(MemberFunctionStreamingMessage.java:201)
at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
at 
org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at 
org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:442)
at 
org.apache.geode.distributed.internal.ClusterOperationExecutors.doFunctionExecutionThread(ClusterOperationExecutors.java:377)
at 
org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.geode.cache.execute.FunctionException: 
org.apache.geode.SerializationException: A ClassNotFoundException was thrown 
while trying to deserialize cached value.
at 
org.apache.geode.internal.cache.snapshot.WindowedExporter$WindowedExportCollector.setException(WindowedExporter.java:383)
at 
org.apache.geode.internal.cache.snapshot.WindowedExporter$WindowedExportCollector.addResult(WindowedExporter.java:346)
at 
org.apache.geode.internal.cache.execute.PartitionedRegionFunctionResultSender.lastResult(PartitionedRegionFunctionResultSender.java:195)
at 
org.apache.geode.internal.cache.execute.AbstractExecution.handleException(AbstractExecution.java:502)
at 
org.apache.geode.internal.cache.execute.AbstractExecution.executeFunctionLocally(AbstractExecution.java:353)
at 
org.apache.geode.internal.cache.execute.AbstractExecution.lambda$executeFunctionOnLocalPRNode$0(AbstractExecution.java:273)
... 6 more
Caused by: org.apache.geode.SerializationException: A ClassNotFoundException 
was thrown while trying to deserialize cached value.
at 
org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2046)
at 
org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2032)
at 
org.apache.geode.internal.cache.VMCachedDeserializable.getDeserializedValue(VMCachedDeserializable.java:135)
at 
org.apache.geode.internal.cache.EntrySnapshot.getRawValue(EntrySnapshot.java:111)
at 
org.apache.geode.internal.cache.EntrySnapshot.getRawValue(EntrySnapshot.java:99)
at 
org.apache.geode.internal.cache.EntrySnapshot.getValue(EntrySnapshot.java:129)
at 
org.apache.geode.internal.cache.snapshot.SnapshotPacket$SnapshotRecord.(SnapshotPacket.java:79)
at 
org.apache.geode.internal.cache.snapshot.WindowedExporter$WindowedExportFunction.execute(WindowedExporter.java:197)
at 
org.apache.geode.internal.cache.execute.AbstractExecution.executeFunctionLocally(AbstractExecution.java:328)
... 7 more
Caused by: java.lang.ClassNotFoundException: org.myApp.domain.myClass
at 
org.apache.geode.internal.ClassPathLoader.forName(ClassPathLoader.jav

[jira] [Commented] (GEODE-8685) Exporting data causes a ClassNotFoundException

2020-11-04 Thread Anthony Baker (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226313#comment-17226313
 ] 

Anthony Baker commented on GEODE-8685:
--

I think there are two things to investigate here:

1) Why is the class not resolving?

2) Why is the value being deserialized at all?
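Question (1) can be reproduced in miniature. This hypothetical snippet (not Geode's ClassPathLoader) shows the failure mode: resolution succeeds only if the class is visible to the resolving class loader, which is why a class from a deployed jar can be missing during export.

```java
public class ResolveSketch {
  // Mimics the resolution step that ends in ClassNotFoundException when a
  // value's class (e.g. org.myApp.domain.myClass from a deployed jar) is
  // not on the resolving class loader's path.
  public static boolean canResolve(String className) {
    try {
      Class.forName(className);
      return true;
    } catch (ClassNotFoundException e) {
      return false;
    }
  }
}
```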

> Exporting data causes a ClassNotFoundException
> --
>
> Key: GEODE-8685
> URL: https://issues.apache.org/jira/browse/GEODE-8685
> Project: Geode
>  Issue Type: Task
>  Components: regions
>Affects Versions: 1.13.0
>Reporter: Anthony Baker
>Priority: Major
>
> See 
> [https://lists.apache.org/thread.html/rfa4fc47eb4cb4e75c39d7cb815416bebf2ec233d4db24e37728e922e%40%3Cuser.geode.apache.org%3E.]
>  
> The report is that exporting data whose values are classes defined in a deployed 
> jar results in a ClassNotFoundException:
> {noformat}
> [error 2020/10/30 08:54:29.317 PDT  tid=0x41] 
> org.apache.geode.cache.execute.FunctionException: 
> org.apache.geode.SerializationException: A ClassNotFoundException was thrown 
> while trying to deserialize cached value.
> java.io.IOException: org.apache.geode.cache.execute.FunctionException: 
> org.apache.geode.SerializationException: A ClassNotFoundException was thrown 
> while trying to deserialize cached value.
> at 
> org.apache.geode.internal.cache.snapshot.WindowedExporter.export(WindowedExporter.java:106)
> at 
> org.apache.geode.internal.cache.snapshot.RegionSnapshotServiceImpl.exportOnMember(RegionSnapshotServiceImpl.java:361)
> at 
> org.apache.geode.internal.cache.snapshot.RegionSnapshotServiceImpl.save(RegionSnapshotServiceImpl.java:161)
> at 
> org.apache.geode.internal.cache.snapshot.RegionSnapshotServiceImpl.save(RegionSnapshotServiceImpl.java:146)
> at 
> org.apache.geode.management.internal.cli.functions.ExportDataFunction.executeFunction(ExportDataFunction.java:62)
> at 
> org.apache.geode.management.cli.CliFunction.execute(CliFunction.java:37)
> at 
> org.apache.geode.internal.cache.MemberFunctionStreamingMessage.process(MemberFunctionStreamingMessage.java:201)
> at 
> org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376)
> at 
> org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at 
> org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:442)
> at 
> org.apache.geode.distributed.internal.ClusterOperationExecutors.doFunctionExecutionThread(ClusterOperationExecutors.java:377)
> at 
> org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.geode.cache.execute.FunctionException: 
> org.apache.geode.SerializationException: A ClassNotFoundException was thrown 
> while trying to deserialize cached value.
> at 
> org.apache.geode.internal.cache.snapshot.WindowedExporter$WindowedExportCollector.setException(WindowedExporter.java:383)
> at 
> org.apache.geode.internal.cache.snapshot.WindowedExporter$WindowedExportCollector.addResult(WindowedExporter.java:346)
> at 
> org.apache.geode.internal.cache.execute.PartitionedRegionFunctionResultSender.lastResult(PartitionedRegionFunctionResultSender.java:195)
> at 
> org.apache.geode.internal.cache.execute.AbstractExecution.handleException(AbstractExecution.java:502)
> at 
> org.apache.geode.internal.cache.execute.AbstractExecution.executeFunctionLocally(AbstractExecution.java:353)
> at 
> org.apache.geode.internal.cache.execute.AbstractExecution.lambda$executeFunctionOnLocalPRNode$0(AbstractExecution.java:273)
> ... 6 more
> Caused by: org.apache.geode.SerializationException: A ClassNotFoundException 
> was thrown while trying to deserialize cached value.
> at 
> org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2046)
> at 
> org.apache.geode.internal.cache.EntryEventImpl.deserialize(EntryEventImpl.java:2032)
> at 
> org.apache.geode.internal.cache.VMCachedDeserializable.getDeserializedValue(VMCachedDeserializable.java:135)
> at 
> org.apache.geode.internal.cache.EntrySnapshot.getRawValue(EntrySnapshot.java:111)
> at 
> org.apache.geode.internal.cache.EntrySnapshot.getRawValue(EntrySnapshot.java:99)
> at 
> org.apache.geode.internal.cache.EntrySnapshot.getValue(EntrySnapshot.java:129)
>

[jira] [Commented] (GEODE-8681) peer-to-peer message loss due to sending connection closing with TLS enabled

2020-11-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226315#comment-17226315
 ] 

ASF subversion and git services commented on GEODE-8681:


Commit 7da8f9b516ac1e2525a1dfc922af7bfb8995f2c6 in geode's branch 
refs/heads/develop from Bruce Schuchardt
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=7da8f9b ]

GEODE-8681: peer-to-peer message loss due to sending connection closing with 
TLS enabled (#5699)

A socket-read could pick up more than one message and a single unwrap()
could decrypt multiple messages.
Normally the engine isn't closed and it reports normal
status from an unwrap() operation, and Connection.processInputBuffer
picks up each message, one by one, from the buffer and dispatches them.
But if the SSLEngine is closed we were ignoring any already-decrypted
data sitting in the unwrapped buffer and instead we were throwing an 
SSLException.

> peer-to-peer message loss due to sending connection closing with TLS enabled
> 
>
> Key: GEODE-8681
> URL: https://issues.apache.org/jira/browse/GEODE-8681
> Project: Geode
>  Issue Type: Bug
>  Components: membership, messaging
>Affects Versions: 1.10.0, 1.11.0, 1.12.0, 1.13.0
>Reporter: Bruce J Schuchardt
>Assignee: Bruce J Schuchardt
>Priority: Major
>  Labels: pull-request-available, release-blocker
>
> We have observed message loss when TLS is enabled and a distributed lock is 
> released right after sending a message that doesn't require acknowledgement 
> if the sending socket is immediately closed. The closing of sockets 
> immediately after sending a message is frequently seen in function execution 
> threads or server-side application threads that use this pattern:
> {code:java}
>  try {
> DistributedSystem.setThreadsSocketPolicy(false);
> acquireDistributedLock(lockName);
> (perform one or more cache operations)
>   } finally {
> distLockService.unlock(lockName);
> DistributedSystem.releaseThreadsSockets(); // closes the socket
>   }
> {code}
> The fault seems to be in NioSSLEngine.unwrap(), which throws an 
> SSLException() if it finds the SSLEngine is closed even though there is valid 
> data in its decrypt buffer.  It shouldn't throw an exception in that case.
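The corrected behavior can be modeled with a small stand-in (hypothetical names, not the actual NioSslEngine code): already-decrypted bytes buffered from an earlier unwrap() are delivered before a closed engine is treated as an error.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLException;

// Sketch of the fixed unwrap handling: drain buffered decrypted data
// first; a closed engine is fatal only once nothing remains to deliver.
public class UnwrapSketch {
    static ByteBuffer drainOrFail(ByteBuffer unwrapped, boolean engineClosed)
            throws SSLException {
        if (unwrapped.hasRemaining()) {
            return unwrapped; // deliver already-decrypted messages first
        }
        if (engineClosed) {
            // Only now is the closed engine an error: nothing left to deliver.
            throw new SSLException("SSLEngine closed and no decrypted data buffered");
        }
        return unwrapped; // empty: caller should read more from the socket
    }

    public static void main(String[] args) throws Exception {
        ByteBuffer buffered = ByteBuffer.wrap("decrypted message".getBytes());
        // Engine closed, but decrypted data remains: it must not be discarded.
        System.out.println(drainOrFail(buffered, true).hasRemaining()); // true
    }
}
```

The pre-fix behavior corresponds to throwing on `engineClosed` before checking `hasRemaining()`, which discards messages that were already decrypted.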



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Commented] (GEODE-8676) Update bookbindery to latest

2020-11-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226326#comment-17226326
 ] 

ASF subversion and git services commented on GEODE-8676:


Commit 9279098352e5c6440cade1196b9b99dcf89e90c5 in geode-native's branch 
refs/heads/develop from M. Oleske
[ https://gitbox.apache.org/repos/asf?p=geode-native.git;h=9279098 ]

GEODE-8676: Update Bookbindery (#683)

* Bump bookbindery from 10.1.14 to 10.1.15 in /docs/geode-native-book-cpp
Authored-by: M. Oleske 

> Update bookbindery to latest
> 
>
> Key: GEODE-8676
> URL: https://issues.apache.org/jira/browse/GEODE-8676
> Project: Geode
>  Issue Type: Improvement
>  Components: docs, native client
>Reporter: Michael Oleske
>Priority: Major
>  Labels: pull-request-available
>
> [Bookbinder|https://github.com/pivotal-cf/bookbinder/releases] has a new 
> release, and we should keep the tools we use to build our docs up to date.





[jira] [Commented] (GEODE-8676) Update bookbindery to latest

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226327#comment-17226327
 ] 

ASF GitHub Bot commented on GEODE-8676:
---

davebarnes97 merged pull request #683:
URL: https://github.com/apache/geode-native/pull/683


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (GEODE-8685) Exporting data causes a ClassNotFoundException

2020-11-04 Thread Jinmei Liao (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinmei Liao updated GEODE-8685:
---
Labels: GeodeOperationAPI  (was: )

> Exporting data causes a ClassNotFoundException
> --
>
> Key: GEODE-8685
> URL: https://issues.apache.org/jira/browse/GEODE-8685
> Project: Geode
>  Issue Type: Task
>  Components: regions
>Affects Versions: 1.13.0
>Reporter: Anthony Baker
>Priority: Major
>  Labels: GeodeOperationAPI
>
> See 
> [https://lists.apache.org/thread.html/rfa4fc47eb4cb4e75c39d7cb815416bebf2ec233d4db24e37728e922e%40%3Cuser.geode.apache.org%3E.]
>  
> The report is that exporting data whose values are classes defined in a 
> deployed jar results in a ClassNotFoundException.

[jira] [Commented] (GEODE-8672) Concurrent transactional destroy with GII could cause an entry to be removed and version information to be lost

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226345#comment-17226345
 ] 

ASF GitHub Bot commented on GEODE-8672:
---

pivotal-eshu opened a new pull request #5702:
URL: https://github.com/apache/geode/pull/5702


   … (#5691)"
   
   This reverts commit e695938dff4b39f1755c707e81e1eb7e2e143fe0.
   
   Thank you for submitting a contribution to Apache Geode.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in 
the commit message?
   
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically `develop`)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   - [ ] Does `gradlew build` run cleanly?
   
   - [ ] Have you written or updated unit tests to verify your changes?
   
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   ### Note:
   Please ensure that once the PR is submitted, check Concourse for build 
issues and
   submit an update to your PR as soon as possible. If you need help, please 
send an
   email to d...@geode.apache.org.
   





> Concurrent transactional destroy with GII could cause an entry to be removed 
> and version information to be lost
> ---
>
> Key: GEODE-8672
> URL: https://issues.apache.org/jira/browse/GEODE-8672
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.1.0
>Reporter: Eric Shu
>Assignee: Eric Shu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> In a newly rebalanced bucket, while GII is in progress, a transactional 
> destroy is applied to the cache. There is logic that the region should be in 
> token mode, so the entry is left as a Destroyed token, even though the 
> version tag of the entry indicates that it has the correct version.
> However, at the end of the GII, the 
> cleanUpDestroyedTokensAndMarkGIIComplete method removes all the destroyed 
> entries. This wipes out the entries' version tag information and causes 
> subsequent creates to start fresh with new version tags.
> This can lead to client/server data inconsistency: clients ignore the newly 
> created entries because those entries carry lower version numbers than the 
> versions the clients already hold.
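The client-side effect described above reduces to a version comparison. A tiny illustrative Java model (not Geode's actual region version vector):

```java
// Sketch: a client applies an incoming entry only if its version is newer
// than the version the client already holds for that entry, so a server
// that restarts versioning from 1 produces updates the client ignores.
public class VersionSketch {
    static boolean clientApplies(long clientVersion, long incomingVersion) {
        return incomingVersion > clientVersion; // lower/equal versions ignored
    }

    public static void main(String[] args) {
        long clientVersion = 42; // version seen before the tokens were wiped
        long freshVersion = 1;   // server restarted versioning after GII cleanup
        // The recreated entry is silently ignored by the client.
        System.out.println(clientApplies(clientVersion, freshVersion)); // false
    }
}
```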





[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226348#comment-17226348
 ] 

ASF GitHub Bot commented on GEODE-8647:
---

lgtm-com[bot] commented on pull request #682:
URL: https://github.com/apache/geode-native/pull/682#issuecomment-721919329


   This pull request **introduces 4 alerts** when merging 
0588ef5947fe3875c7f2cb90732179ecbada8bfb into 
9279098352e5c6440cade1196b9b99dcf89e90c5 - [view on 
LGTM.com](https://lgtm.com/projects/g/apache/geode-native/rev/pr-ed3924809465bbcd002b9a6d79ae4ffd394a4423)
   
   **new alerts:**
   
   * 2 for Call to GC.Collect()
   * 2 for Useless assignment to local variable





> Support using multiple DistributedMap Rules in one test
> ---
>
> Key: GEODE-8647
> URL: https://issues.apache.org/jira/browse/GEODE-8647
> Project: Geode
>  Issue Type: Wish
>  Components: tests
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>  Labels: pull-request-available
>
> Support using multiple DistributedMap Rules in one test. Right now the Rule 
> only supports having one instance in a test.





[jira] [Commented] (GEODE-8676) Update bookbindery to latest

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226361#comment-17226361
 ] 

ASF GitHub Bot commented on GEODE-8676:
---

davebarnes97 opened a new pull request #685:
URL: https://github.com/apache/geode-native/pull/685


   One change that I think should have been included in the previous PR for 
this ticket.
   Adds a `bundle exec` prefix before `rackup` in the view-docs.sh script. 
Should improve the user experience.







[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226370#comment-17226370
 ] 

ASF GitHub Bot commented on GEODE-8647:
---

codecov-io edited a comment on pull request #682:
URL: https://github.com/apache/geode-native/pull/682#issuecomment-719112394


   # [Codecov](https://codecov.io/gh/apache/geode-native/pull/682?src=pr&el=h1) 
Report
   > Merging 
[#682](https://codecov.io/gh/apache/geode-native/pull/682?src=pr&el=desc) into 
[develop](https://codecov.io/gh/apache/geode-native/commit/0d9a99d5e0632de62df17921950cf3f6640efb33?el=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/geode-native/pull/682/graphs/tree.svg?width=650&height=150&src=pr&token=plpAqoqGag)](https://codecov.io/gh/apache/geode-native/pull/682?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           develop     #682      +/-   ##
   ===========================================
   - Coverage    74.04%   74.02%    -0.02%
   ===========================================
     Files          644      644
     Lines        51189    51189
   ===========================================
   - Hits         37903    37894        -9
   - Misses       13286    13295        +9
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/geode-native/pull/682?src=pr&el=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...test/testThinClientPoolExecuteHAFunctionPrSHOP.cpp](https://codecov.io/gh/apache/geode-native/pull/682/diff?src=pr&el=tree#diff-Y3BwY2FjaGUvaW50ZWdyYXRpb24tdGVzdC90ZXN0VGhpbkNsaWVudFBvb2xFeGVjdXRlSEFGdW5jdGlvblByU0hPUC5jcHA=)
 | `91.20% <0.00%> (-3.71%)` | :arrow_down: |
   | 
[cppcache/src/ThinClientRedundancyManager.cpp](https://codecov.io/gh/apache/geode-native/pull/682/diff?src=pr&el=tree#diff-Y3BwY2FjaGUvc3JjL1RoaW5DbGllbnRSZWR1bmRhbmN5TWFuYWdlci5jcHA=)
 | `75.78% <0.00%> (-0.63%)` | :arrow_down: |
   | 
[cppcache/src/ClientMetadataService.cpp](https://codecov.io/gh/apache/geode-native/pull/682/diff?src=pr&el=tree#diff-Y3BwY2FjaGUvc3JjL0NsaWVudE1ldGFkYXRhU2VydmljZS5jcHA=)
 | `62.24% <0.00%> (-0.46%)` | :arrow_down: |
   | 
[cppcache/src/ExecutionImpl.cpp](https://codecov.io/gh/apache/geode-native/pull/682/diff?src=pr&el=tree#diff-Y3BwY2FjaGUvc3JjL0V4ZWN1dGlvbkltcGwuY3Bw)
 | `68.07% <0.00%> (-0.39%)` | :arrow_down: |
   | 
[cppcache/src/ThinClientPoolDM.cpp](https://codecov.io/gh/apache/geode-native/pull/682/diff?src=pr&el=tree#diff-Y3BwY2FjaGUvc3JjL1RoaW5DbGllbnRQb29sRE0uY3Bw)
 | `76.23% <0.00%> (-0.15%)` | :arrow_down: |
   | 
[cppcache/src/ThinClientRegion.cpp](https://codecov.io/gh/apache/geode-native/pull/682/diff?src=pr&el=tree#diff-Y3BwY2FjaGUvc3JjL1RoaW5DbGllbnRSZWdpb24uY3Bw)
 | `56.04% <0.00%> (-0.06%)` | :arrow_down: |
   | 
[cppcache/src/TcrEndpoint.cpp](https://codecov.io/gh/apache/geode-native/pull/682/diff?src=pr&el=tree#diff-Y3BwY2FjaGUvc3JjL1RjckVuZHBvaW50LmNwcA==)
 | `55.11% <0.00%> (+0.56%)` | :arrow_up: |
   | 
[cppcache/src/TcrConnection.cpp](https://codecov.io/gh/apache/geode-native/pull/682/diff?src=pr&el=tree#diff-Y3BwY2FjaGUvc3JjL1RjckNvbm5lY3Rpb24uY3Bw)
 | `73.27% <0.00%> (+0.78%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/geode-native/pull/682?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/geode-native/pull/682?src=pr&el=footer). 
Last update 
[0d9a99d...69f5a49](https://codecov.io/gh/apache/geode-native/pull/682?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   







[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226369#comment-17226369
 ] 

ASF GitHub Bot commented on GEODE-8647:
---

pivotal-jbarrett commented on a change in pull request #682:
URL: https://github.com/apache/geode-native/pull/682#discussion_r517595960



##
File path: clicache/src/DataInput.cpp
##
@@ -93,8 +93,9 @@ namespace Apache
 if (buffer != nullptr && buffer->Length > 0) {
   _GF_MG_EXCEPTION_TRY2
 
-System::Int32 len = buffer->Length;
-  _GEODE_NEW(m_buffer, System::Byte[len]);
+  System::Int32 len = buffer->Length;

Review comment:
   auto?









[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226371#comment-17226371
 ] 

ASF GitHub Bot commented on GEODE-8647:
---

lgtm-com[bot] commented on pull request #682:
URL: https://github.com/apache/geode-native/pull/682#issuecomment-721943662


   This pull request **introduces 4 alerts** when merging 
69f5a49c1d86d5cb52cb6fe6ccbca5e27c87 into 
9279098352e5c6440cade1196b9b99dcf89e90c5 - [view on 
LGTM.com](https://lgtm.com/projects/g/apache/geode-native/rev/pr-9196608f3c466b1421ff234d7e093fcfb418615c)
   
   **new alerts:**
   
   * 2 for Call to GC.Collect()
   * 2 for Useless assignment to local variable







[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226376#comment-17226376
 ] 

ASF GitHub Bot commented on GEODE-8647:
---

pdxcodemonkey merged pull request #682:
URL: https://github.com/apache/geode-native/pull/682


   





> Support using multiple DistributedMap Rules in one test
> ---
>
> Key: GEODE-8647
> URL: https://issues.apache.org/jira/browse/GEODE-8647
> Project: Geode
>  Issue Type: Wish
>  Components: tests
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>  Labels: pull-request-available
>
> Support using multiple DistributedMap Rules in one test. Right now the Rule 
> only supports having one instance in a test.





[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226375#comment-17226375
 ] 

ASF subversion and git services commented on GEODE-8647:


Commit 3ae26364750ed799b9c24e065d08d145129166b5 in geode-native's branch 
refs/heads/develop from Blake Bender
[ https://gitbox.apache.org/repos/asf?p=geode-native.git;h=3ae2636 ]

GEODE-8647: Stop leaking buffer in CLI DataInput  (#682)

* Stop leaking buffer in CLI DataInput when we have to copy incoming buffer
* Add CLI integration test to verify leak is fixed.
* Remove no-longer-used Cleanup method from DataInput
* Specify LGTM warnings to disable in test code

Co-authored-by: Jacob Barrett 

> Support using multiple DistributedMap Rules in one test
> ---
>
> Key: GEODE-8647
> URL: https://issues.apache.org/jira/browse/GEODE-8647
> Project: Geode
>  Issue Type: Wish
>  Components: tests
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>  Labels: pull-request-available
>
> Support using multiple DistributedMap Rules in one test. Right now the Rule 
> only supports having one instance in a test.





[jira] [Closed] (GEODE-8674) CLI DataInput object leaks internal buffer when allocating ctor is called

2020-11-04 Thread Blake Bender (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Bender closed GEODE-8674.
---

> CLI DataInput object leaks internal buffer when allocating ctor is called
> -
>
> Key: GEODE-8674
> URL: https://issues.apache.org/jira/browse/GEODE-8674
> Project: Geode
>  Issue Type: Improvement
>  Components: native client
>Reporter: Blake Bender
>Assignee: Blake Bender
>Priority: Major
> Fix For: 1.14.0
>
>
> The CLI DataInput object has two ctors, one of which copies the passed-in 
> buffer parameter via new[] and one of which doesn't.  In the event that the 
> former is called, the buffer is leaked when the object is deleted/Disposed.  
> Here's the current code for CLI `DataInput::~DataInput`:
> ```
> ~DataInput( ) { Cleanup(); }
> ```
> And the code for `DataInput::Cleanup`:
> ```
>       void DataInput::Cleanup()
>       {
>         //TODO:
>         //GF_SAFE_DELETE_ARRAY(m_buffer);
>       }
> ```
> So apparently this bug has been known for some time (?!?), but has never been 
> fixed.
>  
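
A minimal stand-alone C++ sketch of the ownership-flag pattern that fixes this kind of leak (class and method names are illustrative, not the actual geode-native CLI types): the copying ctor records that it allocated the buffer with new[], and the dtor releases it only in that case — which is exactly what the commented-out Cleanup() above failed to do.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative sketch only -- not the real geode-native CLI DataInput.
class OwningDataInput {
 public:
  // copyBuffer == false: wrap the caller's buffer (caller keeps ownership).
  // copyBuffer == true:  take a private copy via new[]; we own the copy.
  OwningDataInput(const uint8_t* buffer, size_t len, bool copyBuffer)
      : m_len(len), m_owns(copyBuffer) {
    if (copyBuffer) {
      uint8_t* copy = new uint8_t[len];
      std::memcpy(copy, buffer, len);
      m_buffer = copy;
    } else {
      m_buffer = buffer;
    }
  }

  // Keep ownership semantics simple for the sketch: no copies allowed.
  OwningDataInput(const OwningDataInput&) = delete;
  OwningDataInput& operator=(const OwningDataInput&) = delete;

  ~OwningDataInput() {
    if (m_owns) delete[] m_buffer;  // release only the buffer we allocated
  }

  uint8_t byteAt(size_t i) const {
    assert(i < m_len);
    return m_buffer[i];
  }

 private:
  const uint8_t* m_buffer;
  size_t m_len;
  bool m_owns;
};

// Demonstrates that the copying ctor really detaches from the caller's buffer.
uint8_t firstByteAfterCallerWrite() {
  uint8_t raw[4] = {42, 1, 2, 3};
  OwningDataInput in(raw, sizeof(raw), /*copyBuffer=*/true);
  raw[0] = 0;            // later caller writes must not affect the copy
  return in.byteAt(0);   // reads from the private copy
}
```

The real CLI class manages its buffer through Dispose/finalizer semantics rather than a plain dtor; the sketch only shows the ownership bookkeeping the leak fix requires.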





[jira] [Resolved] (GEODE-8674) CLI DataInput object leaks internal buffer when allocating ctor is called

2020-11-04 Thread Blake Bender (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Bender resolved GEODE-8674.
-
Fix Version/s: 1.14.0
   Resolution: Fixed

> CLI DataInput object leaks internal buffer when allocating ctor is called
> -
>
> Key: GEODE-8674
> URL: https://issues.apache.org/jira/browse/GEODE-8674
> Project: Geode
>  Issue Type: Improvement
>  Components: native client
>Reporter: Blake Bender
>Assignee: Blake Bender
>Priority: Major
> Fix For: 1.14.0
>
>
> The CLI DataInput object has two ctors, one of which copies the passed-in 
> buffer parameter via new[] and one of which doesn't.  In the event that the 
> former is called, the buffer is leaked when the object is deleted/Disposed.  
> Here's the current code for CLI `DataInput::~DataInput`:
> ```
> ~DataInput( ) { Cleanup(); }
> ```
> And the code for `DataInput::Cleanup`:
> ```
>       void DataInput::Cleanup()
>       {
>         //TODO:
>         //GF_SAFE_DELETE_ARRAY(m_buffer);
>       }
> ```
> So apparently this bug has been known for some time (?!?), but has never been 
> fixed.
>  





[jira] [Commented] (GEODE-8647) Support using multiple DistributedMap Rules in one test

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226380#comment-17226380
 ] 

ASF GitHub Bot commented on GEODE-8647:
---

lgtm-com[bot] commented on pull request #682:
URL: https://github.com/apache/geode-native/pull/682#issuecomment-721966498


   This pull request **introduces 4 alerts** when merging 
0e9463d69e7be5455a351eae23b87cef9b2382ac into 
9279098352e5c6440cade1196b9b99dcf89e90c5 - [view on 
LGTM.com](https://lgtm.com/projects/g/apache/geode-native/rev/pr-a03165f373b94fecca20198e049a9105fc55bcb8)
   
   **new alerts:**
   
   * 2 for Call to GC\.Collect\(\)
   * 2 for Useless assignment to local variable





> Support using multiple DistributedMap Rules in one test
> ---
>
> Key: GEODE-8647
> URL: https://issues.apache.org/jira/browse/GEODE-8647
> Project: Geode
>  Issue Type: Wish
>  Components: tests
>Reporter: Kirk Lund
>Assignee: Kirk Lund
>Priority: Major
>  Labels: pull-request-available
>
> Support using multiple DistributedMap Rules in one test. Right now the Rule 
> only supports having one instance in a test.





[jira] [Commented] (GEODE-8191) MemberMXBeanDistributedTest.testBucketCount fails intermittently

2020-11-04 Thread Sarah Abbey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226383#comment-17226383
 ] 

Sarah Abbey commented on GEODE-8191:


Failed again here: 
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/601

> MemberMXBeanDistributedTest.testBucketCount fails intermittently
> 
>
> Key: GEODE-8191
> URL: https://issues.apache.org/jira/browse/GEODE-8191
> Project: Geode
>  Issue Type: Bug
>  Components: jmx, tests
>Reporter: Kirk Lund
>Assignee: Mario Ivanac
>Priority: Major
>  Labels: flaky, pull-request-available
> Fix For: 1.14.0
>
>
> This appears to be a flaky test related to GEODE-7963, which was resolved by 
> Mario Ivanac, so I've assigned the ticket to him.
> {noformat}
> org.apache.geode.management.MemberMXBeanDistributedTest > testBucketCount 
> FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.management.MemberMXBeanDistributedTest Expected bucket count 
> is 4000, and actual count is 3750 expected:<3750> but was:<4000> within 5 
> minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679)
> at 
> org.apache.geode.management.MemberMXBeanDistributedTest.testBucketCount(MemberMXBeanDistributedTest.java:102)
> Caused by:
> java.lang.AssertionError: Expected bucket count is 4000, and actual 
> count is 3750 expected:<3750> but was:<4000>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:834)
> at org.junit.Assert.assertEquals(Assert.java:645)
> at 
> org.apache.geode.management.MemberMXBeanDistributedTest.lambda$testBucketCount$1(MemberMXBeanDistributedTest.java:107)
> {noformat}
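
The timeout above comes from Awaitility's untilAsserted, which retries the assertion until it passes or the clock runs out. A dependency-free C++ sketch of the same poll-until-true idea (the bucket numbers, timeout, and names are illustrative, not Geode internals):

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Illustrative only: retry a condition until it holds or a deadline passes,
// instead of asserting a single snapshot that can race with ongoing work.
bool awaitUntil(const std::function<bool()>& condition,
                std::chrono::milliseconds timeout,
                std::chrono::milliseconds pollInterval) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (!condition()) {
    if (std::chrono::steady_clock::now() > deadline) {
      return false;  // Awaitility would throw ConditionTimeoutException here
    }
    std::this_thread::sleep_for(pollInterval);
  }
  return true;
}

// Simulate buckets still being created while the test polls: the count starts
// below the expected 4000 (as in the failure above) and converges.
bool demoConverges() {
  int buckets = 3750;
  return awaitUntil(
      [&buckets] {
        buckets += 50;        // stands in for asynchronous bucket creation
        return buckets >= 4000;
      },
      std::chrono::seconds(1), std::chrono::milliseconds(1));
}
```

The real test uses await().atMost(5, MINUTES).untilAsserted(...); the sketch only shows why polling tolerates a transient 3750 count where a one-shot assertion fails.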





[jira] [Commented] (GEODE-8191) MemberMXBeanDistributedTest.testBucketCount fails intermittently

2020-11-04 Thread Geode Integration (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226384#comment-17226384
 ] 

Geode Integration commented on GEODE-8191:
--

Seen in [DistributedTestOpenJDK8 
#601|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK8/builds/601]
 ... see [test 
results|http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0465/test-results/distributedTest/1604520164/]
 or download 
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-main/1.14.0-build.0465/test-artifacts/1604520164/distributedtestfiles-OpenJDK8-1.14.0-build.0465.tgz].

> MemberMXBeanDistributedTest.testBucketCount fails intermittently
> 
>
> Key: GEODE-8191
> URL: https://issues.apache.org/jira/browse/GEODE-8191
> Project: Geode
>  Issue Type: Bug
>  Components: jmx, tests
>Reporter: Kirk Lund
>Assignee: Mario Ivanac
>Priority: Major
>  Labels: flaky, pull-request-available
> Fix For: 1.14.0
>
>
> This appears to be a flaky test related to GEODE-7963, which was resolved by 
> Mario Ivanac, so I've assigned the ticket to him.
> {noformat}
> org.apache.geode.management.MemberMXBeanDistributedTest > testBucketCount 
> FAILED
> org.awaitility.core.ConditionTimeoutException: Assertion condition 
> defined as a lambda expression in 
> org.apache.geode.management.MemberMXBeanDistributedTest Expected bucket count 
> is 4000, and actual count is 3750 expected:<3750> but was:<4000> within 5 
> minutes.
> at 
> org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:165)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:119)
> at 
> org.awaitility.core.AssertionCondition.await(AssertionCondition.java:31)
> at 
> org.awaitility.core.ConditionFactory.until(ConditionFactory.java:895)
> at 
> org.awaitility.core.ConditionFactory.untilAsserted(ConditionFactory.java:679)
> at 
> org.apache.geode.management.MemberMXBeanDistributedTest.testBucketCount(MemberMXBeanDistributedTest.java:102)
> Caused by:
> java.lang.AssertionError: Expected bucket count is 4000, and actual 
> count is 3750 expected:<3750> but was:<4000>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:834)
> at org.junit.Assert.assertEquals(Assert.java:645)
> at 
> org.apache.geode.management.MemberMXBeanDistributedTest.lambda$testBucketCount$1(MemberMXBeanDistributedTest.java:107)
> {noformat}





[jira] [Commented] (GEODE-8672) Concurrent transactional destroy with GII could cause an entry to be removed and version information to be lost

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226390#comment-17226390
 ] 

ASF GitHub Bot commented on GEODE-8672:
---

pivotal-eshu merged pull request #5702:
URL: https://github.com/apache/geode/pull/5702


   





> Concurrent transactional destroy with GII could cause an entry to be removed 
> and version information to be lost
> ---
>
> Key: GEODE-8672
> URL: https://issues.apache.org/jira/browse/GEODE-8672
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.1.0
>Reporter: Eric Shu
>Assignee: Eric Shu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> In a newly rebalanced bucket, while GII is in progress, a transactional 
> destroy is applied to the cache. Logic there assumes the region should be in 
> token mode and leaves the entry as a Destroyed token, even though the version 
> tag of the entry indicates that it has the correct version.
> However, at the end of the GII, the cleanUpDestroyedTokensAndMarkGIIComplete 
> method removes all the destroyed entries – this wipes out the entry version 
> tag information and causes subsequent creates to start fresh with new version 
> tags.
> This can lead to client/server data inconsistency: the newly created entries 
> will be ignored by the clients, because they carry lower version numbers than 
> the ones the clients already hold.





[jira] [Commented] (GEODE-8672) Concurrent transactional destroy with GII could cause an entry to be removed and version information to be lost

2020-11-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226392#comment-17226392
 ] 

ASF subversion and git services commented on GEODE-8672:


Commit 7367d17e3817fc41666d471c5eb4d0df0d33c18b in geode's branch 
refs/heads/develop from Eric Shu
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=7367d17 ]

Revert "GEODE-8672: No need in token mode if concurrencyChecksEnabled (#5691)" 
(#5702)

This reverts commit e695938dff4b39f1755c707e81e1eb7e2e143fe0.

> Concurrent transactional destroy with GII could cause an entry to be removed 
> and version information to be lost
> ---
>
> Key: GEODE-8672
> URL: https://issues.apache.org/jira/browse/GEODE-8672
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.1.0
>Reporter: Eric Shu
>Assignee: Eric Shu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> In a newly rebalanced bucket, while GII is in progress, a transactional 
> destroy is applied to the cache. Logic there assumes the region should be in 
> token mode and leaves the entry as a Destroyed token, even though the version 
> tag of the entry indicates that it has the correct version.
> However, at the end of the GII, the cleanUpDestroyedTokensAndMarkGIIComplete 
> method removes all the destroyed entries – this wipes out the entry version 
> tag information and causes subsequent creates to start fresh with new version 
> tags.
> This can lead to client/server data inconsistency: the newly created entries 
> will be ignored by the clients, because they carry lower version numbers than 
> the ones the clients already hold.





[jira] [Commented] (GEODE-8672) Concurrent transactional destroy with GII could cause an entry to be removed and version information to be lost

2020-11-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226391#comment-17226391
 ] 

ASF subversion and git services commented on GEODE-8672:


Commit 7367d17e3817fc41666d471c5eb4d0df0d33c18b in geode's branch 
refs/heads/develop from Eric Shu
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=7367d17 ]

Revert "GEODE-8672: No need in token mode if concurrencyChecksEnabled (#5691)" 
(#5702)

This reverts commit e695938dff4b39f1755c707e81e1eb7e2e143fe0.

> Concurrent transactional destroy with GII could cause an entry to be removed 
> and version information to be lost
> ---
>
> Key: GEODE-8672
> URL: https://issues.apache.org/jira/browse/GEODE-8672
> Project: Geode
>  Issue Type: Bug
>  Components: regions
>Affects Versions: 1.1.0
>Reporter: Eric Shu
>Assignee: Eric Shu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> In a newly rebalanced bucket, while GII is in progress, a transactional 
> destroy is applied to the cache. Logic there assumes the region should be in 
> token mode and leaves the entry as a Destroyed token, even though the version 
> tag of the entry indicates that it has the correct version.
> However, at the end of the GII, the cleanUpDestroyedTokensAndMarkGIIComplete 
> method removes all the destroyed entries – this wipes out the entry version 
> tag information and causes subsequent creates to start fresh with new version 
> tags.
> This can lead to client/server data inconsistency: the newly created entries 
> will be ignored by the clients, because they carry lower version numbers than 
> the ones the clients already hold.





[jira] [Commented] (GEODE-8676) Update bookbindery to latest

2020-11-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226404#comment-17226404
 ] 

ASF subversion and git services commented on GEODE-8676:


Commit 07910325960691ab6774bedbf6e1a96f693e85d1 in geode-native's branch 
refs/heads/develop from Dave Barnes
[ https://gitbox.apache.org/repos/asf?p=geode-native.git;h=0791032 ]

GEODE-8676: Update Bookbindery (#685)



> Update bookbindery to latest
> 
>
> Key: GEODE-8676
> URL: https://issues.apache.org/jira/browse/GEODE-8676
> Project: Geode
>  Issue Type: Improvement
>  Components: docs, native client
>Reporter: Michael Oleske
>Priority: Major
>  Labels: pull-request-available
>
> [Bookbinder|https://github.com/pivotal-cf/bookbinder/releases] has a new 
> release and we should keep the tools we use to build our docs up to date





[jira] [Commented] (GEODE-8676) Update bookbindery to latest

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226405#comment-17226405
 ] 

ASF GitHub Bot commented on GEODE-8676:
---

moleske merged pull request #685:
URL: https://github.com/apache/geode-native/pull/685


   





> Update bookbindery to latest
> 
>
> Key: GEODE-8676
> URL: https://issues.apache.org/jira/browse/GEODE-8676
> Project: Geode
>  Issue Type: Improvement
>  Components: docs, native client
>Reporter: Michael Oleske
>Priority: Major
>  Labels: pull-request-available
>
> [Bookbinder|https://github.com/pivotal-cf/bookbinder/releases] has a new 
> release and we should keep the tools we use to build our docs up to date





[jira] [Commented] (GEODE-8626) Omitting field-mapping tag of cache.xml when using Simple JDBC Connector

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226421#comment-17226421
 ] 

ASF GitHub Bot commented on GEODE-8626:
---

jchen21 commented on pull request #5637:
URL: https://github.com/apache/geode/pull/5637#issuecomment-722021982


   Reviewing the code.





> Omitting field-mapping tag of cache.xml when using Simple JDBC Connector
> 
>
> Key: GEODE-8626
> URL: https://issues.apache.org/jira/browse/GEODE-8626
> Project: Geode
>  Issue Type: Improvement
>  Components: jdbc
>Reporter: Masaki Yamakawa
>Priority: Minor
>  Labels: pull-request-available
>
> When configuring Simple JDBC Connector with gfsh, I don't need to create 
> field-mapping, the default field-mapping will be created from pdx and table 
> meta data.
> On the other hand, when using cache.xml(cluster.xml), pdx and table meta data 
> cannot be used, and field-mapping must be described in cache.xml.
> I would like to create field-mapping defaults based on pdx and table meta 
> data when using cache.xml.
> If field-mapping is specified in cache.xml, the xml setting has priority, and 
> only if there are no field-mapping tags.
> cache.xml will be as follows:
> {code:xml}
> <jdbc:mapping
> data-source="TestDataSource"
> table="employees"
> pdx-name="org.apache.geode.connectors.jdbc.Employee"
> ids="id">
> </jdbc:mapping>
> {code}





[jira] [Updated] (GEODE-8686) Tombstone removal optimization during GII could cause deadlock

2020-11-04 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans updated GEODE-8686:
---
Description: 
Similar to the issue described in GEODE-6526, if the condition in the below if 
statement in {{AbstractRegionMap.initialImagePut()}} evaluates to true, a call 
to {{AbstractRegionMap.removeTombstone()}} will be triggered that could lead to 
deadlock between the calling thread and a Tombstone GC thread calling 
{{TombstoneService.gcTombstones()}}. 
{code:java}
if (owner.getServerProxy() == null && 
owner.getVersionVector().isTombstoneTooOld( entryVersion.getMemberID(), 
entryVersion.getRegionVersion())) { 
  // the received tombstone has already been reaped, so don't retain it 
  if (owner.getIndexManager() != null) { 
owner.getIndexManager().updateIndexes(oldRe, IndexManager.REMOVE_ENTRY, 
IndexProtocol.REMOVE_DUE_TO_GII_TOMBSTONE_CLEANUP); 
  } 
  removeTombstone(oldRe, entryVersion, false, false); 
  return false; 
} else { 
  owner.scheduleTombstone(oldRe, entryVersion); 
  lruEntryDestroy(oldRe); 
}
{code}
The proposed change is to remove this if statement and allow the old tombstone 
to be collected later by calling {{scheduleTombstone()}} in all cases. The call 
to {{AbstractRegionMap.removeTombstone()}} in 
{{AbstractRegionMap.initialImagePut()}} is intended to be an optimization to 
allow immediate removal of tombstones that we know have already been collected 
on other members, but since the conditions to trigger it are rare (the old 
entry must be a tombstone, the new entry received during GII must be a 
tombstone with a newer version, and we must have already collected a tombstone 
with a newer version than the new entry) and the overhead of scheduling a 
tombstone to be collected is comparatively low, the performance impact of 
removing this optimization in favour of simply scheduling the tombstone to be 
collected in all cases should be insignificant.

The solution to the deadlock observed in GEODE-6526 was also to remove the call 
to {{AbstractRegionMap.removeTombstone()}} and allow the tombstone to be 
collected later and did not result in any unwanted behaviour, so the proposed 
fix should be similarly low-impact.

Also of note is that with this proposed change, there will be no calls to 
{{AbstractRegionMap.removeTombstone()}} outside of the {{TombstoneService}} 
class, which should ensure that other deadlocks involving this method are not 
possible.

  was:
Similar to the issue described in GEODE-6526, if the condition in the below if 
statement in {{AbstractRegionMap.initialImagePut()}} evaluates to true, a call 
to {{AbstractRegionMap.removeTombstone()}} will be triggered that could lead to 
deadlock between the calling thread and a Tombstone GC thread calling 
{{TombstoneService.gcTombstones()}}. 
{code:java}
if (owner.getServerProxy() == null && 
owner.getVersionVector().isTombstoneTooOld( entryVersion.getMemberID(), 
entryVersion.getRegionVersion())) { 
  // the received tombstone has already been reaped, so don't retain it 
  if (owner.getIndexManager() != null) { 
owner.getIndexManager().updateIndexes(oldRe, IndexManager.REMOVE_ENTRY, 
IndexProtocol.REMOVE_DUE_TO_GII_TOMBSTONE_CLEANUP); 
  } 
  removeTombstone(oldRe, entryVersion, false, false); 
  return false; 
} else { 
  owner.scheduleTombstone(oldRe, entryVersion); 
  lruEntryDestroy(oldRe); 
}
{code}
The proposed change is to remove this if statement and allow the old tombstone 
to be collected later by calling {{scheduleTombstone()}} in all cases{{.}} The 
call to {{AbstractRegionMap.removeTombstone()}} in 
{{AbstractRegionMap.initialImagePut()}} is intended to be an optimization to 
allow immediate removal of tombstones that we know have already been collected 
on other members, but since the conditions to trigger it are rare (the old 
entry must be a tombstone, the new entry received during GII must be a 
tombstone with a newer version, and we must have already collected a tombstone 
with a newer version than the new entry) and the overhead of scheduling a 
tombstone to be collected is comparatively low, the performance impact of 
removing this optimization in favour of simply scheduling the tombstone to be 
collected in all cases should be insignificant.

The solution to the deadlock observed in GEODE-6526 was also to remove the call 
to {{AbstractRegionMap.removeTombstone()}} and allow the tombstone to be 
collected later and did not result in any unwanted behaviour, so the proposed 
fix should be similarly low-impact.

Also of note is that with this proposed change, there will be no calls to 
{{AbstractRegionMap.removeTombstone()}} outside of the {{TombstoneService}} 
class, which should ensure that other deadlocks involving this method are not 
possible.


> Tombstone removal optimization during GII could cause deadlock
> --
>
> 

[jira] [Created] (GEODE-8686) Tombstone removal optimization during GII could cause deadlock

2020-11-04 Thread Donal Evans (Jira)
Donal Evans created GEODE-8686:
--

 Summary: Tombstone removal optimization during GII could cause 
deadlock
 Key: GEODE-8686
 URL: https://issues.apache.org/jira/browse/GEODE-8686
 Project: Geode
  Issue Type: Improvement
Affects Versions: 1.13.0, 1.12.0, 1.11.0, 1.10.0, 1.14.0
Reporter: Donal Evans


Similar to the issue described in GEODE-6526, if the condition in the below if 
statement in {{AbstractRegionMap.initialImagePut()}} evaluates to true, a call 
to {{AbstractRegionMap.removeTombstone()}} will be triggered that could lead to 
deadlock between the calling thread and a Tombstone GC thread calling 
{{TombstoneService.gcTombstones()}}. 
{code:java}
if (owner.getServerProxy() == null && 
owner.getVersionVector().isTombstoneTooOld( entryVersion.getMemberID(), 
entryVersion.getRegionVersion())) { 
  // the received tombstone has already been reaped, so don't retain it 
  if (owner.getIndexManager() != null) { 
owner.getIndexManager().updateIndexes(oldRe, IndexManager.REMOVE_ENTRY, 
IndexProtocol.REMOVE_DUE_TO_GII_TOMBSTONE_CLEANUP); 
  } 
  removeTombstone(oldRe, entryVersion, false, false); 
  return false; 
} else { 
  owner.scheduleTombstone(oldRe, entryVersion); 
  lruEntryDestroy(oldRe); 
}
{code}
The proposed change is to remove this if statement and allow the old tombstone 
to be collected later by calling {{scheduleTombstone()}} in all cases{{.}} The 
call to {{AbstractRegionMap.removeTombstone()}} in 
{{AbstractRegionMap.initialImagePut()}} is intended to be an optimization to 
allow immediate removal of tombstones that we know have already been collected 
on other members, but since the conditions to trigger it are rare (the old 
entry must be a tombstone, the new entry received during GII must be a 
tombstone with a newer version, and we must have already collected a tombstone 
with a newer version than the new entry) and the overhead of scheduling a 
tombstone to be collected is comparatively low, the performance impact of 
removing this optimization in favour of simply scheduling the tombstone to be 
collected in all cases should be insignificant.

The solution to the deadlock observed in GEODE-6526 was also to remove the call 
to {{AbstractRegionMap.removeTombstone()}} and allow the tombstone to be 
collected later and did not result in any unwanted behaviour, so the proposed 
fix should be similarly low-impact.

Also of note is that with this proposed change, there will be no calls to 
{{AbstractRegionMap.removeTombstone()}} outside of the {{TombstoneService}} 
class, which should ensure that other deadlocks involving this method are not 
possible.





[jira] [Assigned] (GEODE-8686) Tombstone removal optimization during GII could cause deadlock

2020-11-04 Thread Donal Evans (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Donal Evans reassigned GEODE-8686:
--

Assignee: Donal Evans

> Tombstone removal optimization during GII could cause deadlock
> --
>
> Key: GEODE-8686
> URL: https://issues.apache.org/jira/browse/GEODE-8686
> Project: Geode
>  Issue Type: Improvement
>Affects Versions: 1.10.0, 1.11.0, 1.12.0, 1.13.0, 1.14.0
>Reporter: Donal Evans
>Assignee: Donal Evans
>Priority: Major
>
> Similar to the issue described in GEODE-6526, if the condition in the below 
> if statement in {{AbstractRegionMap.initialImagePut()}} evaluates to true, a 
> call to {{AbstractRegionMap.removeTombstone()}} will be triggered that could 
> lead to deadlock between the calling thread and a Tombstone GC thread calling 
> {{TombstoneService.gcTombstones()}}. 
> {code:java}
> if (owner.getServerProxy() == null && 
> owner.getVersionVector().isTombstoneTooOld( entryVersion.getMemberID(), 
> entryVersion.getRegionVersion())) { 
>   // the received tombstone has already been reaped, so don't retain it 
>   if (owner.getIndexManager() != null) { 
> owner.getIndexManager().updateIndexes(oldRe, IndexManager.REMOVE_ENTRY, 
> IndexProtocol.REMOVE_DUE_TO_GII_TOMBSTONE_CLEANUP); 
>   } 
>   removeTombstone(oldRe, entryVersion, false, false); 
>   return false; 
> } else { 
>   owner.scheduleTombstone(oldRe, entryVersion); 
>   lruEntryDestroy(oldRe); 
> }
> {code}
> The proposed change is to remove this if statement and allow the old 
> tombstone to be collected later by calling {{scheduleTombstone()}} in all 
> cases. The call to {{AbstractRegionMap.removeTombstone()}} in 
> {{AbstractRegionMap.initialImagePut()}} is intended to be an optimization to 
> allow immediate removal of tombstones that we know have already been 
> collected on other members, but since the conditions to trigger it are rare 
> (the old entry must be a tombstone, the new entry received during GII must be 
> a tombstone with a newer version, and we must have already collected a 
> tombstone with a newer version than the new entry) and the overhead of 
> scheduling a tombstone to be collected is comparatively low, the performance 
> impact of removing this optimization in favour of simply scheduling the 
> tombstone to be collected in all cases should be insignificant.
> The solution to the deadlock observed in GEODE-6526 was also to remove the 
> call to {{AbstractRegionMap.removeTombstone()}} and allow the tombstone to be 
> collected later and did not result in any unwanted behaviour, so the proposed 
> fix should be similarly low-impact.
> Also of note is that with this proposed change, there will be no calls to 
> {{AbstractRegionMap.removeTombstone()}} outside of the {{TombstoneService}} 
> class, which should ensure that other deadlocks involving this method are not 
> possible.





[jira] [Commented] (GEODE-8466) Create a ClassLoaderService to abstract away dealing with the default ClassLoader directly

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226452#comment-17226452
 ] 

ASF GitHub Bot commented on GEODE-8466:
---

lgtm-com[bot] commented on pull request #5658:
URL: https://github.com/apache/geode/pull/5658#issuecomment-722075908


   This pull request **introduces 3 alerts** and **fixes 1** when merging 
19b1313d9d31dc3320e5659555649133e991db13 into 
7367d17e3817fc41666d471c5eb4d0df0d33c18b - [view on 
LGTM.com](https://lgtm.com/projects/g/apache/geode/rev/pr-1bf6e8475bff02a5fe107ea9ef2ef16a37961649)
   
   **new alerts:**
   
   * 2 for Potential input resource leak
   * 1 for Use of a broken or risky cryptographic algorithm
   
   **fixed alerts:**
   
   * 1 for Use of a broken or risky cryptographic algorithm



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create a ClassLoaderService to abstract away dealing with the default 
> ClassLoader directly
> --
>
> Key: GEODE-8466
> URL: https://issues.apache.org/jira/browse/GEODE-8466
> Project: Geode
>  Issue Type: New Feature
>  Components: core
>Reporter: Udo Kohlmeyer
>Assignee: Udo Kohlmeyer
>Priority: Major
>  Labels: pull-request-available
>
> With the addition of ClassLoader isolation using JBoss Modules (GEODE-8067), 
> the manner in which we interact with the ClassLoader needs to change.
> An abstraction is required around default functions such as 
> `findResourceAsStream`, `loadClass`, and `loadService`.
> Because these functions behave differently across ClassLoader 
> implementations, it is best to have a single service that exposes this 
> functionality in a transparent manner.
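A minimal sketch of what such a service could look like, with method signatures modelled on the names in the description. These signatures are assumptions; the actual Geode interface may differ. The default implementation simply delegates to a plain ClassLoader, while a JBoss Modules implementation could delegate elsewhere.

```java
import java.io.InputStream;
import java.util.ServiceLoader;

// Hypothetical abstraction over ClassLoader interaction; hides whether a
// flat classpath or a module system backs it.
interface ClassLoaderService {
    InputStream findResourceAsStream(String resourceName);
    Class<?> loadClass(String className) throws ClassNotFoundException;
    <T> Iterable<T> loadService(Class<T> serviceType);
}

// Default implementation delegating to the context (or system) ClassLoader.
class DefaultClassLoaderService implements ClassLoaderService {
    private final ClassLoader loader;

    DefaultClassLoaderService() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        this.loader = (cl != null) ? cl : ClassLoader.getSystemClassLoader();
    }

    @Override
    public InputStream findResourceAsStream(String resourceName) {
        return loader.getResourceAsStream(resourceName);
    }

    @Override
    public Class<?> loadClass(String className) throws ClassNotFoundException {
        return loader.loadClass(className);
    }

    @Override
    public <T> Iterable<T> loadService(Class<T> serviceType) {
        return ServiceLoader.load(serviceType, loader);
    }
}

public class ClassLoaderServiceSketch {
    public static void main(String[] args) throws Exception {
        ClassLoaderService service = new DefaultClassLoaderService();
        // Callers never touch a ClassLoader directly.
        System.out.println(service.loadClass("java.lang.String").getSimpleName()); // prints String
    }
}
```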





[jira] [Commented] (GEODE-8626) Omitting field-mapping tag of cache.xml when using Simple JDBC Connector

2020-11-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226453#comment-17226453
 ] 

ASF GitHub Bot commented on GEODE-8626:
---

jchen21 commented on a change in pull request #5637:
URL: https://github.com/apache/geode/pull/5637#discussion_r517719407



##
File path: 
geode-connectors/src/distributedTest/java/org/apache/geode/connectors/jdbc/internal/cli/CreateMappingCommandDUnitTest.java
##
@@ -1142,7 +1142,7 @@ public void createMappingWithExistingQueueFails() {
 + " must not already exist.");
   }
 
-  private static class Employee implements PdxSerializable {
+  public static class Employee implements PdxSerializable {

Review comment:
   Why does this class have to be `public`?

##
File path: 
geode-connectors/src/acceptanceTest/java/org/apache/geode/connectors/jdbc/CacheXmlJdbcMappingIntegrationTest.java
##
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more 
contributor license
+ * agreements. See the NOTICE file distributed with this work for additional 
information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache 
License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the 
License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software 
distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 
KIND, either express
+ * or implied. See the License for the specific language governing permissions 
and limitations under
+ * the License.
+ */
+package org.apache.geode.connectors.jdbc;
+
+import static 
org.apache.geode.test.util.ResourceUtils.createTempFileFromResource;
+
+import org.junit.Rule;
+import org.junit.contrib.java.lang.system.RestoreSystemProperties;
+
+import org.apache.geode.cache.CacheFactory;
+import org.apache.geode.internal.cache.InternalCache;
+
+public class CacheXmlJdbcMappingIntegrationTest extends 
JdbcMappingIntegrationTest {
+
+  @Rule
+  public RestoreSystemProperties restoreSystemProperties = new 
RestoreSystemProperties();
+
+  @Override
+  protected InternalCache createCacheAndCreateJdbcMapping(String 
cacheXmlTestName)
+  throws Exception {
+String url = dbRule.getConnectionUrl().replaceAll("&", "&amp;");

Review comment:
   Is this replacement of `&` necessary?
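For context on the reviewer's question: a bare `&` in an XML attribute value is not well-formed, so a JDBC connection URL containing query parameters has to be escaped before it is embedded in cache.xml. A minimal, self-contained check; the element name and URL below are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

public class XmlAmpDemo {
    // Returns true if the given string parses as well-formed XML.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String url = "jdbc:postgresql://host/db?user=app&password=secret"; // illustrative URL
        String raw = "<jdbc-mapping url=\"" + url + "\"/>";
        String escaped = "<jdbc-mapping url=\"" + url.replaceAll("&", "&amp;") + "\"/>";
        System.out.println(parses(raw));     // false: bare '&' is not well-formed XML
        System.out.println(parses(escaped)); // true
    }
}
```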

##
File path: 
geode-connectors/src/main/java/org/apache/geode/connectors/jdbc/internal/JdbcConnectorServiceImpl.java
##
@@ -210,4 +224,152 @@ private TableMetaDataView 
getTableMetaDataView(RegionMapping regionMapping,
   + regionMapping.getDataSourceName() + "\": ", ex);
 }
   }
+
+  @Override
+  public TableMetaDataView getTableMetaDataView(RegionMapping regionMapping) {
+DataSource dataSource = getDataSource(regionMapping.getDataSourceName());
+if (dataSource == null) {
+  throw new JdbcConnectorException("No datasource \"" + 
regionMapping.getDataSourceName()
+  + "\" found when getting table meta data \"" + 
regionMapping.getRegionName() + "\"");
+}
+return getTableMetaDataView(regionMapping, dataSource);
+  }
+
+  @Override
+  public List<FieldMapping> createDefaultFieldMapping(RegionMapping 
regionMapping,
+  PdxType pdxType) {
+DataSource dataSource = getDataSource(regionMapping.getDataSourceName());
+if (dataSource == null) {
+  throw new JdbcConnectorException("No datasource \"" + 
regionMapping.getDataSourceName()
+  + "\" found when creating mapping \"" + 
regionMapping.getRegionName() + "\"");

Review comment:
   The data source has nothing to do with table metadata or the region name. I 
recommend removing this part of the error message.

##
File path: 
geode-connectors/src/test/java/org/apache/geode/connectors/jdbc/internal/cli/CreateMappingPreconditionCheckFunctionTest.java
##
@@ -172,16 +168,6 @@ public void 
executeFunctionThrowsIfDataSourceDoesNotExist() {
 + DATA_SOURCE_NAME + "'.");
   }
 
-  @Test

Review comment:
   Why is this test removed?
   

##
File path: 
geode-connectors/src/acceptanceTest/java/org/apache/geode/connectors/jdbc/GfshJdbcMappingIntegrationTest.java
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more 
contributor license
+ * agreements. See the NOTICE file distributed with this work for additional 
information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache 
License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the 
License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, so

[jira] [Created] (GEODE-8687) Durable client is continuously re-registering CQs on all servers when event de-serialization fails causing resource exhaustion on servers

2020-11-04 Thread Jakov Varenina (Jira)
Jakov Varenina created GEODE-8687:
-

 Summary: Durable client is continuously re-registering CQs on all 
servers when event de-serialization fails causing resource exhaustion on 
servers 
 Key: GEODE-8687
 URL: https://issues.apache.org/jira/browse/GEODE-8687
 Project: Geode
  Issue Type: Bug
  Components: client/server
Affects Versions: 1.13.0
Reporter: Jakov Varenina


When ReflectionBasedAutoSerializer is configured incorrectly (or not at all), a 
serialization exception occurs on the client when CQ events are received. The 
serialization exception isn't logged, which is misleading and makes it hard to 
discover that ReflectionBasedAutoSerializer isn't set correctly. The only 
visible log is that the client/server subscription connections are closed due 
to EOF: the client destroys the subscription connections intentionally, but 
doesn't log the reason (PdxSerializationException) that led to this. It would 
be good if serialization exceptions were logged at error or warn level.

Another problem arises because the client destroys the subscription connection 
and performs server fail-over whenever a serialization issue occurs. 
Additionally, when the subscription connection to a particular server fails 
multiple times, that server is put in a deny list for 10 seconds (configurable 
with {{ping-interval}}). After the 10 seconds expire, the server is removed 
from the list and becomes available for a subscription connection, which fails 
again. This repeats indefinitely (if there are many events that cannot be 
de-serialized), and approximately every 10 seconds the client subscribes to 
each server at least once. Because of the serialization issue, events aren't 
delivered to the client and remain in the subscription queues.

Whenever the connection fails due to a serialization issue and the client is 
not durable, the subscription queue is closed and events are lost.

The biggest problem arises when the client is durable, because the 
subscription queue then remains on the server for a configurable period of 
time (e.g. 300 s) waiting for the client to reconnect. When the client fails 
over to another server, it creates a new subscription queue from an initial 
image of the old queue, which is currently paused. All events from the old 
queue are therefore transferred to the new subscription queue hosted by the 
current primary server. This happens on all servers, and all of them end up 
with a copy of the queue. The problem is that the client periodically (every 
10 seconds in this case) establishes a connection to each server, so the 
configured timeout (e.g. 300 s) never expires; it is renewed each time the 
client registers. This can cause many problems, since memory and disk usage 
(if overflow on the queue is configured) will increase on all servers.

The attached logs cover the problematic case with the durable client:

vm0        -> locator
vm1, vm2   -> servers
vm3        -> durable client with subscription enabled, handling CQ events
vm4        -> client generating traffic that should trigger the registered CQ
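The deny-list cycle described above (connection fails, server denied for the {{ping-interval}} window, server eligible again, connection fails again) can be modelled with a minimal sketch. The class and method names are illustrative, not Geode's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the deny-list behaviour: a failed server is denied
// for a fixed window, then becomes eligible again, which is what lets the
// reconnect loop repeat indefinitely.
class DenyList {
    private final long denyMillis;
    private final Map<String, Long> deniedUntil = new HashMap<>();

    DenyList(long denyMillis) {
        this.denyMillis = denyMillis;
    }

    // Record a failed subscription connection to this server.
    void deny(String server, long nowMillis) {
        deniedUntil.put(server, nowMillis + denyMillis);
    }

    // A server is only skipped while its deny window is still open.
    boolean isDenied(String server, long nowMillis) {
        Long until = deniedUntil.get(server);
        return until != null && nowMillis < until;
    }
}

public class DenyListSketch {
    public static void main(String[] args) {
        DenyList list = new DenyList(10_000); // 10 s window, as in the ticket
        list.deny("server1", 0);
        System.out.println(list.isDenied("server1", 5_000));  // true: still denied
        System.out.println(list.isDenied("server1", 11_000)); // false: eligible again
    }
}
```

Because every expiry makes the server eligible again, the loop never terminates while de-serialization keeps failing, and for a durable client each reconnect renews the subscription-queue timeout on the server.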
 





[jira] [Updated] (GEODE-8687) Durable client is continuously re-registering CQs on all servers when event de-serialization fails causing resource exhaustion on servers

2020-11-04 Thread Jakov Varenina (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakov Varenina updated GEODE-8687:
--
Attachment: deserialzationFault.log

> Durable client is continuously re-registering CQs on all servers when event 
> de-serialization fails causing resource exhaustion on servers 
> --
>
> Key: GEODE-8687
> URL: https://issues.apache.org/jira/browse/GEODE-8687
> Project: Geode
>  Issue Type: Bug
>  Components: client/server
>Affects Versions: 1.13.0
>Reporter: Jakov Varenina
>Priority: Major
> Attachments: deserialzationFault.log
>
>





[jira] [Updated] (GEODE-8687) Durable client is continuously re-registering CQs on all servers when event de-serialization fails causing resource exhaustion on servers

2020-11-04 Thread Jakov Varenina (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakov Varenina updated GEODE-8687:
--
Description: 
When ReflectionBasedAutoSerializer is configured incorrectly (or not at all), a 
serialization exception occurs on the client when CQ events are received. The 
serialization exception isn't logged, which is misleading and makes it hard to 
discover that ReflectionBasedAutoSerializer isn't set correctly. The only 
visible log is that the client/server subscription connections are closed due 
to EOF: the client destroys the subscription connections intentionally, but 
doesn't log the reason (PdxSerializationException) that led to this. It would 
be good if serialization exceptions were logged at error or warn level.

The client destroys the subscription connection and performs server fail-over 
whenever a serialization issue occurs. Additionally, when the subscription 
connection to a particular server fails multiple times, that server is put in 
a deny list for 10 seconds (configurable with {{ping-interval}}). After the 
10 seconds expire, the server is removed from the list and becomes available 
for a subscription connection, which fails again. This repeats indefinitely, 
and approximately every 10 seconds the client subscribes to each server at 
least once. Because of the serialization issue, events aren't delivered to the 
client and remain in the subscription queues.

Whenever the connection fails due to a serialization issue and the client is 
not durable, the subscription queue is closed and events are lost.

The biggest problem arises when the client is durable, because the 
subscription queue then remains on the server for a configurable period of 
time (e.g. 300 s) waiting for the client to reconnect. When the client fails 
over to another server, it creates a new subscription queue from an initial 
image of the old queue, which is currently paused. All events from the old 
queue are therefore transferred to the new subscription queue hosted by the 
current primary server. This happens on all servers, and all of them end up 
with a copy of the queue even though redundancy isn't configured. The problem 
is that the client periodically (every 10 seconds in this case) establishes a 
connection to each server, so the configured timeout (e.g. 300 s) never 
expires; it is renewed each time the client registers. This can cause many 
problems, since memory and disk usage (if overflow on the queue is configured) 
will increase on all servers.

The attached logs cover the problematic case with the durable client:

vm0        -> locator
vm1, vm2   -> servers
vm3        -> durable client with subscription enabled, handling CQ events
vm4        -> client generating traffic that should trigger the registered CQ
 


[jira] [Updated] (GEODE-8614) Provide a specific client-side exception for server LowMemoryException

2020-11-04 Thread Mario Salazar de Torres (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mario Salazar de Torres updated GEODE-8614:
---
Description: 
*AS A* native client contributor
 *I WANT* a client-side exception for LowMemoryException
 *SO THAT* I can notify accordingly from the client side upon server 
memory depletion.

—

*Additional information*
 This is the callstack of the LowMemoryException:
{noformat}
[error 2020/10/13 09:54:14.401405 UTC 140522117220352] Region::put: An exception (org.apache.geode.cache.LowMemoryException: PartitionedRegion: /part_a cannot process operation on key foo|0 because members [192.168.240.14(dms-server-1:1):41000] are running low on memory
	at org.apache.geode.internal.cache.partitioned.RegionAdvisor.checkIfBucketSick(RegionAdvisor.java:482)
	at org.apache.geode.internal.cache.PartitionedRegion.checkIfAboveThreshold(PartitionedRegion.java:2278)
	at org.apache.geode.internal.cache.PartitionedRegion.putInBucket(PartitionedRegion.java:2982)
	at org.apache.geode.internal.cache.PartitionedRegion.virtualPut(PartitionedRegion.java:2212)
	at org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:170)
	at org.apache.geode.internal.cache.LocalRegion.basicUpdate(LocalRegion.java:5573)
	at org.apache.geode.internal.cache.LocalRegion.basicUpdate(LocalRegion.java:5533)
	at org.apache.geode.internal.cache.LocalRegion.basicBridgePut(LocalRegion.java:5212)
	at org.apache.geode.internal.cache.tier.sockets.command.Put65.cmdExecute(Put65.java:411)
	at org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:183)
	at org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:848)
	at org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:72)
	at org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1212)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:676)
	at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
	at java.base/java.lang.Thread.run(Thread.java:834) ) happened at remote server.
{noformat}
The idea would be to modify *ThinClientRegion::handleServerException* so that it 
returns a new error code, and later map that code to a newly created exception.

*Suggestions*
 The new exception could be called:
 * CacheServerLowMemoryException
 * ...
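For comparison, on the Java client a server-side LowMemoryException currently arrives wrapped in a generic ServerOperationException; the sketch below (the region type, key, and helper name are illustrative assumptions) shows the detection pattern that a dedicated client-side exception would replace:

```java
import org.apache.geode.cache.LowMemoryException;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ServerOperationException;

public class LowMemoryHandling {
  // Sketch: returns true if the put failed because the server was low on memory.
  static boolean putDetectingLowMemory(Region<String, String> region,
                                       String key, String value) {
    try {
      region.put(key, value);
      return false;
    } catch (ServerOperationException e) {
      if (e.getCause() instanceof LowMemoryException) {
        // Here a dedicated exception (e.g. the suggested
        // CacheServerLowMemoryException) could be thrown instead of
        // forcing callers to inspect the wrapped cause.
        return true;
      }
      throw e;
    }
  }
}
```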


[jira] [Updated] (GEODE-8687) Durable client is continuously re-registering CQs on all servers when event de-serialization fails causing resource exhaustion on servers

2020-11-04 Thread Jakov Varenina (Jira)


 [ 
https://issues.apache.org/jira/browse/GEODE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakov Varenina updated GEODE-8687:
--
Description: 
When ReflectionBasedAutoSerializer is wrongly set or not set, it results in a 
serialization exception on the client at the reception of CQ events. The 
serialization exception isn't logged, which is misleading and makes it hard to 
find out that ReflectionBasedAutoSerializer isn't set correctly. The only log 
that can be seen is that client/server subscription connections are closed due 
to EOF. This is because the client destroys subscription connections 
intentionally, but doesn't log the reason (PdxSerializationException) that led 
to this. It would be good if serialization exceptions were logged at error or 
warn level.

The client destroys the subscription connection and performs server fail-over 
whenever a serialization issue occurs. Additionally, when the subscription 
connection to a particular server fails multiple times, that server is put in a 
deny list for 10 seconds (configurable with {{ping-interval}}). After the 10s 
expire, the server is removed from the list and becomes available for a 
subscription connection, which will be destroyed again due to the serialization 
issue. This goes on indefinitely, and approximately every 10s the client 
subscribes to each server at least once. Due to the serialization issue, events 
aren't sent to the client and remain in the subscription queues.

Whenever the connection fails due to a serialization issue and the client is 
not durable, the subscription queue is closed and events are lost.

The biggest problem arises when the client is durable. This is because the 
subscription queue remains on the server for a configurable period of time 
(e.g. 300s) waiting for the client to reconnect. When the client performs 
fail-over to another server, it creates a new subscription queue using an 
initial image of the old queue, which is currently paused. This means that all 
events from the old queue are transferred to the new subscription queue hosted 
by the current primary server. This happens on all servers, so all of them end 
up with a copy of the queue even though subscription redundancy isn't 
configured. The problem is that the client periodically (every 10s in this 
case) establishes a connection to each server, so the configured timeout 
(e.g. 300s) never expires; it is renewed each time the client registers. This 
can cause a lot of problems, since memory and disk usage (if overflow on the 
queue is configured) will increase on all servers.

You can find the problematic case with a durable client in the attached logs:

vm0 -> locator
vm1, vm2 -> servers
vm3 -> durable client with subscription enabled, handling CQ events
vm4 -> client generating traffic that should trigger the registered CQ
 
