https://bz.apache.org/bugzilla/show_bug.cgi?id=67938
Bug ID: 67938
Summary: Tomcat mishandles large client hello messages
Product: Tomcat 10
Version: 10.1.15
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Connectors
Assignee: dev@tomcat.apache.org
Reporter: aogb...@redhat.com
Target Milestone: ------

A Java client application that previously ran on Java 11 began seeing handshake failures against Tomcat 10.1 after moving to Java 17. OpenJDK engineers reviewed the issue and, based on the evidence gathered so far and a static code analysis, we think there is a problem in how Apache Tomcat handles TLS handshakes containing large Client Hello packets. We know that versions 10.1.9 to 10.1.15 are affected, but have not looked into other major releases.

What follows is a high-level overview, in our understanding, of the events that occur when the failure manifests:

1) The TLS client sends a Client Hello packet to resume a TLS 1.3 session. The packet is so large (26,660 bytes) that it has to be split into 2 TLS record messages. This splitting occurs at the TLS level, above any possible TCP fragmentation. The first TLS record has a length of 16,372 bytes and the second a length of 10,298 bytes (5 bytes of each TLS record are the header; the rest accounts for the Client Hello payload).

2) The method org.apache.tomcat.util.net.SecureNioChannel::handshake handles the incoming connection on the TLS server side [1]. In particular, org.apache.tomcat.util.net.SecureNioChannel::processSNI is called first to peek at the incoming data and check, for example, whether the SNI TLS extension is present [2].

3) The most relevant outcomes of the org.apache.tomcat.util.net.SecureNioChannel::processSNI call are:

3.1) The SNI TLS extension is determined not to be present. This was probably decided here [3] because the Client Hello did not fit into a single TLS record; SNI was in fact not present anyway.

3.2) A new SSLEngine instance is created for the incoming connection.

3.3) The netInBuffer ByteBuffer is filled with bytes from the first TLS record sent by the client, and may include some, but not all, of the bytes from the second TLS record. This is because netInBuffer is initialized to a default size of 16,921 bytes, while both TLS records total 26,670 bytes. netInBuffer is expanded to sslEngine.getSession().getPacketBufferSize() after a read from the network [4], but in practice, because no data has been passed to the SSLEngine yet, this is probably 16,709 bytes (the maximum record size, taken from SSLRecord.maxRecordSize). Expanding to a smaller length has no effect. As a result, netInBuffer likely has a size of 16,921 bytes and is completely full of data.

3.4) netInBuffer is assumed to be in a write-ready state at this point, meaning that position is set to the end of the filled data, limit is set to capacity, and more bytes can be appended. However, if it is completely full as assumed in #3.3, position equals limit (which is, in turn, equal to capacity) and no more bytes can be appended.

4) When returning from org.apache.tomcat.util.net.SecureNioChannel::processSNI to org.apache.tomcat.util.net.SecureNioChannel::handshake, the field sniComplete is set to true, reflecting that no further calls to ::processSNI are needed for this connection. Execution moves to org.apache.tomcat.util.net.SecureNioChannel::handshakeUnwrap because the initial state of an SSLEngine is NEED_UNWRAP [5].
5) Once in org.apache.tomcat.util.net.SecureNioChannel::handshakeUnwrap, the "netInBuffer.position() == netInBuffer.limit()" condition evaluates to true [6] and ByteBuffer::clear is called on netInBuffer. Position is set to 0 and limit to capacity. As a result, any write to netInBuffer will overwrite unprocessed data: the first TLS record and part of the second TLS record, depending on how much is written.

6) More bytes are read into netInBuffer here [7]. The bytes read are probably the remainder of the second TLS record (we know it is past the TLS record header and at least 5 bytes long), and the overwrite anticipated in #5 occurs. The data in netInBuffer is now corrupt.

7) netInBuffer is flipped to a read-ready state [8]. Thus, limit is set to the last position after the overwrite and position is set to 0.

8) netInBuffer is passed to the SSLEngine for unwrapping. The SSLEngine finds data at the beginning of the buffer that does not correspond to the beginning of a TLS record and fails, throwing the exception shown in the server log. (A minimal sketch of this buffer sequence is appended at the end of this report.)

We think that this error may not show up consistently due to network/OS timing conditions. Different JDK releases, server configurations and TLS protocol versions may also affect the length of the Client Hello message and have an impact on reproducibility.

The reason Client Hello messages for resumption are large in the analyzed client application with OpenJDK 17 is that a large resumption ticket is included, but large messages (spanning multiple TLS records) are compliant with the standard and should be handled appropriately. A backport for OpenJDK 17 is also being pursued to reduce the message size in this case.

Workarounds in this particular case have included keeping the Java client application on Java 11, limiting the client to TLSv1.2, or setting "jdk.tls.client.enableSessionTicketExtension=false" on the client (see the client-side sketch appended below). Nonetheless, this looks like a flaw to address in Tomcat for large Client Hello messages, whether they arise from a circumstance like the one above or from something else.
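Appendix A: an illustrative sketch (not Tomcat code; class and variable names are ours) of the buffer sequence described in steps 3-7, using a plain java.nio.ByteBuffer. The 16,921-byte capacity matches the assumed netInBuffer default above, while the 5,000-byte "tail" is an arbitrary stand-in for the remainder of the second TLS record:

import java.nio.ByteBuffer;
import java.util.Arrays;

public class NetInBufferOverwriteSketch {

    public static void main(String[] args) {
        int capacity = 16_921;                        // assumed default netInBuffer size
        ByteBuffer netInBuffer = ByteBuffer.allocate(capacity);

        // Step 3.3: the buffer is filled to capacity. The first byte is the
        // TLS handshake content type (0x16) of the first record header.
        netInBuffer.put((byte) 0x16);
        while (netInBuffer.hasRemaining()) {
            netInBuffer.put((byte) 0x01);             // rest of record 1 + start of record 2
        }

        // Steps 3.4 / 5: write-ready but completely full, so position == limit.
        System.out.println("full? " + (netInBuffer.position() == netInBuffer.limit()));

        // Step 5: clear() rewinds position to 0 and sets limit to capacity;
        // the 16,921 unprocessed bytes are no longer protected from overwrite.
        netInBuffer.clear();

        // Step 6: the remainder of the second TLS record arrives and
        // overwrites the start of the first record.
        byte[] tailOfRecord2 = new byte[5_000];
        Arrays.fill(tailOfRecord2, (byte) 0x02);
        netInBuffer.put(tailOfRecord2);

        // Step 7: flip to read-ready. The first readable byte is now 0x02,
        // which is not a valid TLS record header, so an SSLEngine.unwrap()
        // of this buffer would fail as described in step 8.
        netInBuffer.flip();
        System.out.printf("first byte = 0x%02x, readable bytes = %d%n",
                netInBuffer.get(0), netInBuffer.remaining());
    }
}

Running the sketch prints "full? true" and a first byte of 0x02, i.e. the header of the first record at offset 0 has been destroyed, which is consistent with the "not a TLS record" style failure seen in the server log.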
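Appendix B: a minimal sketch (illustration only; the class name is hypothetical) of the client-side mitigations mentioned in the workarounds above. Both rely on standard JDK system properties and must be set before the first TLS connection is created, or passed as -D flags on the command line:

import javax.net.ssl.SSLContext;

public class ClientWorkaroundSketch {

    public static void main(String[] args) throws Exception {
        // Option 1: stop sending the session ticket extension, so the
        // resumption Client Hello does not carry a large ticket
        // (equivalent to -Djdk.tls.client.enableSessionTicketExtension=false).
        System.setProperty("jdk.tls.client.enableSessionTicketExtension", "false");

        // Option 2: restrict the client to TLSv1.2
        // (equivalent to -Djdk.tls.client.protocols=TLSv1.2).
        // System.setProperty("jdk.tls.client.protocols", "TLSv1.2");

        SSLContext context = SSLContext.getDefault();
        System.out.println("Default SSLContext protocol: " + context.getProtocol());
    }
}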