[ 
https://issues.apache.org/jira/browse/GEODE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Burcham updated GEODE-10122:
---------------------------------
    Description: 
TLSv1.3 introduced [1] the ability to set per-algorithm limits on symmetric key 
usage lifetimes. Once a certain number of bytes have been encrypted, a 
KeyUpdate post-handshake message [2] is sent.

With default settings, on Liberica JDK 11, Geode's P2P framework will negotiate 
TLSv1.3 with the TLS_AES_256_GCM_SHA384 cipher suite. Geode P2P messaging will 
eventually fail, with a "Tag mismatch!" IOException in shared ordered 
receivers, after a session has been in heavy use for days.

We have not seen this failure on TLSv1.2.

The implementation of TLSv1.3 in the Java runtime provides a security property 
[3] to configure the encrypted data limit. The attached patch to 
P2PMessagingConcurrencyDUnitTest configures the limit large enough that the 
test makes it through the (P2P) TLS handshake but small enough so that the "Tag 
mismatch!" exception is encountered less than a minute later.
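
For illustration, here is a minimal sketch of how the encrypted data limit can be lowered through the JSSE security property jdk.tls.keyLimits. The value 2^17 and the class and method names are assumptions chosen for the example; they are not taken from the attached patch. The property is read when the JSSE classes initialize, so it most likely needs to be set before the first TLS connection is created in the JVM.

```java
import java.security.Security;

// Sketch only: lowers the AES/GCM key usage limit so that a KeyUpdate
// is triggered after far less traffic than with the JDK default (2^37).
class KeyLimitConfig {
    static final String KEY_LIMITS_PROPERTY = "jdk.tls.keyLimits";

    // Builds a value in the documented "<algorithm> KeyUpdate 2^<n>" form.
    static String keyLimit(String algorithm, int exponent) {
        return algorithm + " KeyUpdate 2^" + exponent;
    }

    // Must run before any JSSE class reads the property.
    static void apply(String value) {
        Security.setProperty(KEY_LIMITS_PROPERTY, value);
    }

    public static void main(String[] args) {
        // 2^17 bytes is an assumed test value, not the one in the patch.
        apply(keyLimit("AES/GCM/NoPadding", 17));
        System.out.println(Security.getProperty(KEY_LIMITS_PROPERTY));
    }
}
```

The limit has to stay large enough for the handshake itself to complete, which is why a test would pick a value well above a few kilobytes.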

The bug is caused by Geode’s NioSslEngine class ignoring the “rehandshaking” 
phase of the TLS protocol [4]:

    Creation - ready to be configured.

    Initial handshaking - perform authentication and negotiate communication 
    parameters.

    Application data - ready for application exchange.

    *Rehandshaking* - renegotiate communications parameters/authentication; 
    handshaking data may be mixed with application data.

    Closure - ready to shut down connection.

Geode's tcp.Connection and NioSslEngine classes (particularly wrap() and 
unwrap()), as they are currently implemented, fail to fully attend to the 
handshake status from javax.net.ssl.SSLEngine. As a result these Geode classes 
fail to respond to the KeyUpdate message, resulting in the "Tag mismatch!" 
IOException.
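
As a rough sketch of what attending to the handshake status after unwrap() could look like (this is not the Geode fix; the class name, the NetworkWriter callback, and the buffer handling are hypothetical), a receiver might inspect SSLEngineResult.getHandshakeStatus() and run delegated tasks or emit pending handshake records before continuing:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLEngineResult.HandshakeStatus;

// Sketch only: an unwrap helper that reacts to post-handshake activity
// (such as a KeyUpdate) instead of ignoring the reported handshake status.
class PostHandshakeAwareUnwrap {

    /** Hypothetical callback that writes TLS records to the socket. */
    interface NetworkWriter {
        void write(ByteBuffer tlsRecords) throws IOException;
    }

    // True when the engine is asking the caller to do handshake work.
    static boolean needsAttention(HandshakeStatus status) {
        return status == HandshakeStatus.NEED_TASK
            || status == HandshakeStatus.NEED_WRAP;
    }

    static SSLEngineResult unwrap(SSLEngine engine, ByteBuffer netIn,
                                  ByteBuffer appOut, NetworkWriter writer)
            throws IOException {
        SSLEngineResult result = engine.unwrap(netIn, appOut);
        HandshakeStatus status = result.getHandshakeStatus();
        while (needsAttention(status)) {
            if (status == HandshakeStatus.NEED_TASK) {
                // Run any delegated tasks the engine has queued.
                Runnable task;
                while ((task = engine.getDelegatedTask()) != null) {
                    task.run();
                }
            } else {
                // NEED_WRAP: let the engine emit its handshake record.
                // A real implementation must serialize this with other
                // writers on the same connection.
                ByteBuffer netOut = ByteBuffer.allocate(
                    engine.getSession().getPacketBufferSize());
                SSLEngineResult wrapResult =
                    engine.wrap(ByteBuffer.allocate(0), netOut);
                netOut.flip();
                writer.write(netOut);
                if (wrapResult.getStatus() != SSLEngineResult.Status.OK) {
                    break; // closed or buffer problem; caller decides
                }
            }
            status = engine.getHandshakeStatus();
        }
        return result;
    }
}
```

The caller still has to handle the Status of the returned result (BUFFER_UNDERFLOW, BUFFER_OVERFLOW, CLOSED) as it would for any unwrap() call.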

When that exception is encountered, the Connection is destroyed and a new one 
created in its place. But users of the old Connection, waiting for 
acknowledgements, will never receive them. This can result in cluster-wide 
hangs.

[1] [https://datatracker.ietf.org/doc/html/rfc8446#section-5.5]

[2] [https://www.ibm.com/docs/en/sdk-java-technology/8?topic=handshake-post-messages]

[3] [https://docs.oracle.com/en/java/javase/11/security/java-secure-socket-extension-jsse-reference-guide.html#GUID-B970ADD6-1E9F-4C18-A26E-0679B50CC946]

[4] [https://www.ibm.com/docs/en/sdk-java-technology/7.1?topic=sslengine-]

  was:
TLSv1.3 introduced [1] the ability to set per-algorithm limits on symmetric key 
usage lifetimes. Once a certain number of bytes have been encrypted, a 
KeyUpdate post-handshake message is sent.

With default settings, on Liberica JDK 11, Geode's P2P framework will negotiate 
TLSv1.3 with the TLS_AES_256_GCM_SHA384 cipher suite. Geode P2P messaging will 
eventually fail, with a "Tag mismatch!" IOException in shared ordered 
receivers, after a session has been in heavy use for days.

We have not seen this failure on TLSv1.2.

The implementation of TLSv1.3 in the Java runtime provides a security property 
[2] to configure the encrypted data limit. The attached patch to 
P2PMessagingConcurrencyDUnitTest configures the limit large enough that the 
test makes it through the (P2P) TLS handshake but small enough so that the "Tag 
mismatch!" exception is encountered less than a minute later.

The bug is caused by Geode’s NioSslEngine class ignoring the “rehandshaking” 
phase of the TLS protocol [3]:

    Creation - ready to be configured.

    Initial handshaking - perform authentication and negotiate communication 
    parameters.

    Application data - ready for application exchange.

    *Rehandshaking* - renegotiate communications parameters/authentication; 
    handshaking data may be mixed with application data.

    Closure - ready to shut down connection.

Geode's tcp.Connection and NioSslEngine classes (particularly wrap() and 
unwrap()), as they are currently implemented, fail to fully attend to the 
handshake status from javax.net.ssl.SSLEngine. As a result these Geode classes 
fail to respond to the KeyUpdate message, resulting in the "Tag mismatch!" 
IOException.

When that exception is encountered, the Connection is destroyed and a new one 
created in its place. But users of the old Connection, waiting for 
acknowledgements, will never receive them. This can result in cluster-wide 
hangs.

[1] [https://datatracker.ietf.org/doc/html/rfc8446#section-5.5]

[2] [https://docs.oracle.com/en/java/javase/11/security/java-secure-socket-extension-jsse-reference-guide.html#GUID-B970ADD6-1E9F-4C18-A26E-0679B50CC946]

[3] [https://www.ibm.com/docs/en/sdk-java-technology/7.1?topic=sslengine-]


> With TLSv1.3 and GCM-based cipher (the default), P2P Messaging Fails When 
> Encrypted Data Limit is Reached
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-10122
>                 URL: https://issues.apache.org/jira/browse/GEODE-10122
>             Project: Geode
>          Issue Type: Bug
>          Components: messaging
>    Affects Versions: 1.13.7, 1.14.3, 1.15.0, 1.16.0
>            Reporter: Bill Burcham
>            Assignee: Bill Burcham
>            Priority: Major
>         Attachments: patch-P2PMessagingConcurrencyDUnitTest.txt
>
>
> TLSv1.3 introduced [1] the ability to set per-algorithm limits on symmetric 
> key usage lifetimes. Once a certain number of bytes have been encrypted, a 
> KeyUpdate post-handshake message [2] is sent.
> With default settings, on Liberica JDK 11, Geode's P2P framework will 
> negotiate TLSv1.3 with the TLS_AES_256_GCM_SHA384 cipher suite. Geode P2P 
> messaging will eventually fail, with a "Tag mismatch!" IOException in shared 
> ordered receivers, after a session has been in heavy use for days.
> We have not seen this failure on TLSv1.2.
> The implementation of TLSv1.3 in the Java runtime provides a security 
> property [3] to configure the encrypted data limit. The attached patch to 
> P2PMessagingConcurrencyDUnitTest configures the limit large enough that the 
> test makes it through the (P2P) TLS handshake but small enough so that the 
> "Tag mismatch!" exception is encountered less than a minute later.
> The bug is caused by Geode’s NioSslEngine class ignoring the 
> “rehandshaking” phase of the TLS protocol [4]:
>     Creation - ready to be configured.
>     Initial handshaking - perform authentication and negotiate communication 
>     parameters.
>     Application data - ready for application exchange.
>     *Rehandshaking* - renegotiate communications parameters/authentication; 
>     handshaking data may be mixed with application data.
>     Closure - ready to shut down connection.
> Geode's tcp.Connection and NioSslEngine classes (particularly wrap() and 
> unwrap()), as they are currently implemented, fail to fully attend to the 
> handshake status from javax.net.ssl.SSLEngine. As a result these Geode 
> classes fail to respond to the KeyUpdate message, resulting in the "Tag 
> mismatch!" IOException.
> When that exception is encountered, the Connection is destroyed and a new one 
> created in its place. But users of the old Connection, waiting for 
> acknowledgements, will never receive them. This can result in cluster-wide 
> hangs.
> [1] [https://datatracker.ietf.org/doc/html/rfc8446#section-5.5]
> [2] [https://www.ibm.com/docs/en/sdk-java-technology/8?topic=handshake-post-messages]
> [3] [https://docs.oracle.com/en/java/javase/11/security/java-secure-socket-extension-jsse-reference-guide.html#GUID-B970ADD6-1E9F-4C18-A26E-0679B50CC946]
> [4] [https://www.ibm.com/docs/en/sdk-java-technology/7.1?topic=sslengine-]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
