[ 
https://issues.apache.org/jira/browse/GEODE-8652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17223253#comment-17223253
 ] 

ASF GitHub Bot commented on GEODE-8652:
---------------------------------------

echobravopapa commented on a change in pull request #5666:
URL: https://github.com/apache/geode/pull/5666#discussion_r514599161



##########
File path: 
geode-core/src/main/java/org/apache/geode/internal/net/NioSslEngine.java
##########
@@ -40,48 +40,48 @@
 import org.apache.logging.log4j.Logger;
 
 import org.apache.geode.GemFireIOException;
-import org.apache.geode.annotations.internal.MakeImmutable;
+import org.apache.geode.annotations.VisibleForTesting;
 import org.apache.geode.internal.net.BufferPool.BufferType;
+import org.apache.geode.internal.net.ByteBufferSharingImpl.OpenAttemptTimedOut;
 import org.apache.geode.logging.internal.log4j.api.LogService;
 
 
 /**
- * NioSslEngine uses an SSLEngine to bind SSL logic to a data source. This 
class is not thread
- * safe. Its use should be confined to one thread or should be protected by 
external
- * synchronization.
+ * NioSslEngine uses an SSLEngine to bind SSL logic to a data source. This 
class is not thread safe.
+ * Its use should be confined to one thread or should be protected by external 
synchronization.
  */
 public class NioSslEngine implements NioFilter {
   private static final Logger logger = LogService.getLogger();
 
-  // this variable requires the MakeImmutable annotation but the buffer is 
empty and
-  // not really modifiable
-  @MakeImmutable
-  private static final ByteBuffer EMPTY_BUFFER = ByteBuffer.allocate(0);
-
   private final BufferPool bufferPool;
 
   private boolean closed;
 
   SSLEngine engine;
 
   /**
-   * myNetData holds bytes wrapped by the SSLEngine
+   * holds bytes wrapped by the SSLEngine; a.k.a. myNetData

Review comment:
       i appreciate the buffer naming with their respective directions
   maybe in the future this decoration can be extended to the uses of 
`peerAppData` and 'myNetData`  just to reduce cognitive load and increase 
readability

##########
File path: 
geode-core/src/main/java/org/apache/geode/internal/net/NioSslEngine.java
##########
@@ -368,17 +365,15 @@ public void doneReadingDirectAck(ByteBuffer 
unwrappedBuffer) {
     // read-operations
   }
 
-  @Override
-  public synchronized boolean isClosed() {
-    return closed;
-  }
-
   @Override
   public synchronized void close(SocketChannel socketChannel) {
     if (closed) {
       return;
     }
-    try {
+    closed = true;
+    inputSharing.destruct();
+    try (final ByteBufferSharing outputSharing = shareOutputBuffer(1, 
TimeUnit.MINUTES)) {

Review comment:
       this probably warrants a comment for posterity...

##########
File path: 
geode-core/src/main/java/org/apache/geode/internal/net/NioSslEngine.java
##########
@@ -405,14 +400,13 @@ public synchronized void close(SocketChannel 
socketChannel) {
       // we can't send a close message if the channel is closed
     } catch (IOException e) {
       throw new GemFireIOException("exception closing SSL session", e);
+    } catch (final OpenAttemptTimedOut _unused) {
+      logger.info(String.format("Couldn't get output lock in time, eliding TLS 
close message"));
+      if (!engine.isOutboundDone()) {
+        engine.closeOutbound();
+      }
     } finally {
-      ByteBuffer netData = myNetData;
-      ByteBuffer appData = peerAppData;
-      myNetData = null;
-      peerAppData = EMPTY_BUFFER;
-      bufferPool.releaseBuffer(TRACKED_SENDER, netData);
-      bufferPool.releaseBuffer(TRACKED_RECEIVER, appData);
-      this.closed = true;
+      outputSharing.destruct();

Review comment:
       much cleaner, like it!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> member hung in Connection.notifyHandshakeWaiter() during disconnect waiting 
> for a lock held by another thread in Connection.readAck() 
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-8652
>                 URL: https://issues.apache.org/jira/browse/GEODE-8652
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, messaging
>    Affects Versions: 1.14.0
>            Reporter: Bill Burcham
>            Assignee: Bill Burcham
>            Priority: Major
>              Labels: pull-request-available
>
> An application encountered the following hang in a TLS-enabled cluster.
> Let's call the cluster members ds3 -> ds1. 
> ds3 sends a {{PutAllPRMessage}} to ds1 and is stuck in 
> {{SocketChannel.read()}} waiting for the acknowledgement.
> {{ClusterDistributionManager.uncleanShutdown()}} is invoked on ds3 to shut 
> the member down. That thread blocks trying to acquire a lock on the 
> {{NioSslEngine}} held by the first thread (the one doing waiting for the ack 
> to the put-all.)
> Somehow the shutdown thread must be allowed to proceed.
> Here's the hung thread in ds3 (a.k.a. vm_3_thr_34_dataStore3_host1_8592) 
> trying to shut down the member but it's stuck waiting for the monitor on the 
> {{NioSslEngine}}:
> {noformat}
> "vm_3_thr_34_dataStore3_host1_8592" #795 daemon prio=5 os_prio=0 
> tid=0x00007fdb9c011000 nid=0x2fcc waiting for monitor entry 
> [0x00007fdb6f4b7000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>       at 
> org.apache.geode.internal.tcp.Connection.notifyHandshakeWaiter(Connection.java:804)
>       - waiting to lock <0x00000000f2635b28> (a 
> org.apache.geode.internal.net.NioSslEngine)
>       at org.apache.geode.internal.tcp.Connection.close(Connection.java:1350)
>       at 
> org.apache.geode.internal.tcp.Connection.closePartialConnect(Connection.java:1278)
>       at 
> org.apache.geode.internal.tcp.ConnectionTable.closeCon(ConnectionTable.java:612)
>       at 
> org.apache.geode.internal.tcp.ConnectionTable.closeCon(ConnectionTable.java:604)
>       at 
> org.apache.geode.internal.tcp.ConnectionTable.close(ConnectionTable.java:661)
>       - locked <0x00000000f2678cf8> (a java.util.ArrayList)
>       - locked <0x00000000f1187348> (a java.util.concurrent.ConcurrentHashMap)
>       at org.apache.geode.internal.tcp.TCPConduit.stop(TCPConduit.java:487)
>       at 
> org.apache.geode.distributed.internal.direct.DirectChannel.disconnect(DirectChannel.java:644)
>       - locked <0x00000000f11867a8> (a 
> org.apache.geode.distributed.internal.direct.DirectChannel)
>       at 
> org.apache.geode.distributed.internal.DistributionImpl.disconnectDirectChannel(DistributionImpl.java:631)
>       at 
> org.apache.geode.distributed.internal.DistributionImpl.access$200(DistributionImpl.java:82)
>       at 
> org.apache.geode.distributed.internal.DistributionImpl$LifecycleListenerImpl.disconnect(DistributionImpl.java:904)
>       at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.stop(GMSMembership.java:1908)
>       at 
> org.apache.geode.distributed.internal.membership.gms.Services.stop(Services.java:302)
>       at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.shutdown(GMSMembership.java:1262)
>       at 
> org.apache.geode.distributed.internal.membership.gms.GMSMembership.disconnect(GMSMembership.java:1847)
>       at 
> org.apache.geode.distributed.internal.DistributionImpl.disconnect(DistributionImpl.java:501)
>       at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.uncleanShutdown(ClusterDistributionManager.java:1291)
> {noformat}
> That thread is waiting on a lock held by this thread (in ds3) which is 
> waiting on an acknowledgement to a PutAllPRMessage sent to ds1.
> {noformat}
> "vm_3_thr_37_dataStore3_host1_8592" #857 daemon prio=5 os_prio=0 
> tid=0x00007fdb9c030800 nid=0x30d1 runnable [0x00007fdb732f0000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>         at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:378)
>         - locked <0x00000000f2643380> (a java.lang.Object)
>         at 
> org.apache.geode.internal.net.NioSslEngine.readAtLeast(NioSslEngine.java:330)
>         at 
> org.apache.geode.internal.tcp.MsgReader.readAtLeast(MsgReader.java:129)
>         at 
> org.apache.geode.internal.tcp.MsgReader.readHeader(MsgReader.java:58)
> ==>     - locked <0x00000000f2635b28> (a 
> org.apache.geode.internal.net.NioSslEngine)
>         at 
> org.apache.geode.internal.tcp.Connection.readAck(Connection.java:2652)
>         at 
> org.apache.geode.distributed.internal.direct.DirectChannel.readAcks(DirectChannel.java:392)
>         at 
> org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:342)
>         at 
> org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:182)
>         at 
> org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:511)
>         at 
> org.apache.geode.distributed.internal.DistributionImpl.directChannelSend(DistributionImpl.java:346)
>         at 
> org.apache.geode.distributed.internal.DistributionImpl.send(DistributionImpl.java:291)
>         at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2053)
>         at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:1981)
>         at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2018)
>         at 
> org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1083)
>         at 
> org.apache.geode.internal.cache.partitioned.PutAllPRMessage.send(PutAllPRMessage.java:201)
>         at 
> org.apache.geode.internal.cache.PartitionedRegion.tryToSendOnePutAllMessage(PartitionedRegion.java:2839)
>         at 
> org.apache.geode.internal.cache.PartitionedRegion.sendMsgByBucket(PartitionedRegion.java:2621)
>         at 
> org.apache.geode.internal.cache.PartitionedRegion.postPutAllSend(PartitionedRegion.java:2392)
>         at 
> org.apache.geode.internal.cache.LocalRegionDataView.postPutAll(LocalRegionDataView.java:361)
>         at 
> org.apache.geode.internal.cache.LocalRegion.basicPutAll(LocalRegion.java:9154)
>         at 
> org.apache.geode.internal.cache.LocalRegion.putAll(LocalRegion.java:8903)
> {noformat}
> What we see is that the {{MsgReader}} in the second thread is not letting the 
> first thread close the socket. Until the socket is closed, the second thread 
> will be stuck in {{SocketChannel.read()}}.
> *But why is the second thread stuck in {{SocketChannelImpl.read}}? That may 
> be due to GEODE-8651!*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to