[Bug 65661] New: 'OutOfMemoryError: Direct buffer memory' in DiskFileItem.get()

2021-10-28 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=65661

            Bug ID: 65661
           Summary: 'OutOfMemoryError: Direct buffer memory' in DiskFileItem.get()
           Product: Tomcat 9
           Version: 9.0.52
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Util
          Assignee: dev@tomcat.apache.org
          Reporter: peter.kov...@swisscom.com
  Target Milestone: -

DiskFileItem.get() loads a whole multipart part into memory. Since the commit
6650205974619771f9ffe19d1b7a5490ce468e9d it uses Files.newInputStream(...) to
read the file contents. This creates a ChannelInputStream behind the scenes
which might use direct memory instead of heap.

This works correctly, except that the direct memory limit is usually set much
lower than the regular heap size. It may therefore result in
'OutOfMemoryError: Direct buffer memory' for large or multiple concurrent
multipart uploads when -XX:MaxDirectMemorySize is not set high enough (in
our case it was 128M).

I am not sure how to fix this while keeping the performance the same, but I
think it should at least be documented somewhere that applications need at
least (multipart size limit * max. concurrent uploads) bytes of direct memory.
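
For illustration only, here is a minimal sketch of how a caller could bound the
direct memory used per read. It assumes the temporary direct buffer is sized to
the length requested by each read() call, as the Util.getTemporaryDirectBuffer
frame in the stack trace below suggests. The class and method names are
hypothetical; this is not the Tomcat code and not a proposed patch.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads 'length' bytes, but never asks the stream for
// more than 'chunkSize' bytes in a single read() call.
final class ChunkedRead {

    static byte[] readInChunks(InputStream in, int length, int chunkSize) {
        if (length < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and chunkSize > 0");
        }
        byte[] result = new byte[length];
        int offset = 0;
        try {
            while (offset < length) {
                // Cap each request so any per-call temporary buffer stays small.
                int toRead = Math.min(chunkSize, length - offset);
                int n = in.read(result, offset, toRead);
                if (n < 0) {
                    throw new EOFException("stream ended after " + offset + " bytes");
                }
                offset += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("upload", ".tmp");
        try {
            Files.write(tmp, new byte[100_000]);
            try (InputStream in = Files.newInputStream(tmp)) {
                byte[] data = readInChunks(in, (int) Files.size(tmp), 8192);
                System.out.println("read " + data.length + " bytes");
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The result still ends up fully on the heap, so this only limits the direct
memory side of the problem, not the total memory per upload.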



Suspected change:
https://github.com/apache/tomcat/commit/6650205974619771f9ffe19d1b7a5490ce468e9d
Location:
https://github.com/apache/tomcat/blame/main/java/org/apache/tomcat/util/http/fileupload/disk/DiskFileItem.java#L304


JVM: openjdk 11.0.13 2021-10-19 LTS


Ok with 9.0.48, fails on 9.0.54

Stacktrace:
java.lang.OutOfMemoryError: Direct buffer memory
    at java.base/java.nio.Bits.reserveMemory(Unknown Source)
    at java.base/java.nio.DirectByteBuffer.<init>(Unknown Source)
    at java.base/java.nio.ByteBuffer.allocateDirect(Unknown Source)
    at java.base/sun.nio.ch.Util.getTemporaryDirectBuffer(Unknown Source)
    at java.base/sun.nio.ch.IOUtil.read(Unknown Source)
    at java.base/sun.nio.ch.FileChannelImpl.read(Unknown Source)
    at java.base/sun.nio.ch.ChannelInputStream.read(Unknown Source)
    at java.base/sun.nio.ch.ChannelInputStream.read(Unknown Source)
    at java.base/sun.nio.ch.ChannelInputStream.read(Unknown Source)
    at org.apache.tomcat.util.http.fileupload.IOUtils.read(IOUtils.java:199)
    at org.apache.tomcat.util.http.fileupload.IOUtils.readFully(IOUtils.java:226)
    at org.apache.tomcat.util.http.fileupload.IOUtils.readFully(IOUtils.java:247)
    at org.apache.tomcat.util.http.fileupload.disk.DiskFileItem.get(DiskFileItem.java:305)
    at org.apache.tomcat.util.http.fileupload.disk.DiskFileItem.getString(DiskFileItem.java:327)
    at org.apache.catalina.core.ApplicationPart.getString(ApplicationPart.java:127)
    at org.apache.catalina.connector.Request.parseParts(Request.java:2948)
    at org.apache.catalina.connector.Request.getParts(Request.java:2823)

-- 
You are receiving this mail because:
You are the assignee for the bug.
-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



[Bug 65661] 'OutOfMemoryError: Direct buffer memory' in DiskFileItem.get()

2021-10-28 Thread bugzilla
https://bz.apache.org/bugzilla/show_bug.cgi?id=65661

Peter Kovacs  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|9.0.52                      |9.0.54




Re: Panama and tomcat-native

2021-10-28 Thread Christopher Schultz

Rémy,

On 10/26/21 07:46, Rémy Maucherat wrote:
> During the past weeks, I examined the state of the Panama project and
> what it could do. I know Mark had a look at it three years ago, and it
> was not ready yet. This does not appear to be the case anymore and I
> could produce a wrapper for OpenSSL and a fully functional
> implementation of the OpenSSLContext/OpenSSLEngine that does not use
> tomcat-native.

Cool. I've only read the README at this point, but can I ask some questions?

0. If this is in Java 17, why can't we use a stock Java 17 for this
purpose instead of using the forked Java 18 development build?

1. This (currently lengthy) process produces a JAR file, 100% Java code?

2. It's the JAR file (well, really bytecode) that is huge when
supporting the entire OpenSSL API?

3. What "problems" are caused by the large size of that library?

> I think this could be integrated in Tomcat as a module like
> "modules/jdbc-pool". Here, likely "modules/openssl-panama".

4. Would modules/openssl-panama essentially be a (potentially
complicated) build script without any code?


-chris




Re: Panama and tomcat-native

2021-10-28 Thread Rémy Maucherat
On Thu, Oct 28, 2021 at 4:06 PM Christopher Schultz
 wrote:
>
> Rémy,
>
> On 10/26/21 07:46, Rémy Maucherat wrote:
> > During the past weeks, I examined the state of the Panama project and
> > what it could do. I know Mark had a look at it three years ago, and it
> > was not ready yet. This does not appear to be the case anymore and I
> > could produce a wrapper for OpenSSL and a fully functional
> > implementation of the OpenSSLContext/OpenSSLEngine that does not use
> > tomcat-native.
>
> Cool. I've only read the README at this point, but can I ask some questions?
>
> 0. If this is in Java 17, why can't we use a stock Java 17 for this
> purpose instead of using the forked Java 18 development build?

The API has already changed significantly since Java 17, and will
change more. So for now I prefer targeting the upstream API to
benefit from fixes and improvements.
Also, jextract is not included in the JDK, so I would have to find a
version of it that is compatible with the Java 17 API.

Once things are stable, I will likely attempt a backport to the Java 17 API.

> 1. This (currently lengthy) process produces a JAR file, 100% Java code?

Yes !

> 2. It's the JAR file (well, really bytecode) that is huge when
> supporting the entire OpenSSL API?

Yes, if you need the whole OpenSSL API, then it's a bit over 3MB.
Thankfully, I have now verified that it can be trimmed down without
causing problems (new calls can be added later as needed), so the
current size with everything that is needed is 133kB. This is great
considering tomcat-native can be dropped.

> 3. What "problems" are caused by the large size of that library?

Well, there was the problem of the number of classes and the raw size,
but more significantly the main class can be huge, so loading that
probably takes some cycles.

> > I think this could be integrated in Tomcat as a module like
> > "modules/jdbc-pool". Here, likely "modules/openssl-panama".
>
> 4. Would modules/openssl-panama essentially be a (potentially
> complicated) build script without any code?

Ok for the wrapper generated by jextract. But a new
OpenSSLContext/Engine that uses it is also needed, so that goes in the
module.

Right now, things look good as far as functionality goes. Everything
except OCSP is implemented. I do get crashes under handshake load, so
I likely messed something up somewhere, though. Performance is
equivalent to JNI/tomcat-native, so that's a huge plus if the end goal
is to drop tomcat-native.

The main downside of the API is that it is detyped. So you write the
equivalent of C code with only void* pointers, to give you an idea,
which you cast or use wherever you like without any warnings. So
whatever native bits remain are even less safe than before. I don't
quite get why MemoryAddress is not a generic type, like
MemoryAddress<SSL> (jextract does generate an empty SSL type, along
with the others, if you let it), which would make things nicer. Of
course, it is likely because it's not doable :D

Quick example: it took me hours to get ALPN working. Why? Because the
OpenSSL API is "smart", and looks like:
    int SSL_callback_alpn_select_proto(SSL* ssl, const unsigned char **out,
            unsigned char *outlen, const unsigned char *in,
            unsigned int inlen, void *arg)
So you see a few pointers and all is well. However, it was not working
for me, because I was using the length of the array, which is an int.
The outlen pointer points to a single byte, however. Oops ;)
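
To make the width mismatch concrete, here is a small sketch in plain Java. It
does not use Panama or OpenSSL; a ByteBuffer stands in for native memory, and
the class and method names are hypothetical. It assumes the failure mode was a
4-byte int being stored where the C side reads a single unsigned char, which
may not be exactly what happened in the real code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo: a 1-byte "outlen" slot written with the wrong width.
final class OutLenWidthDemo {

    // Stores len as a 4-byte int at offset 0, then reads back only byte 0,
    // the way C code dereferencing an 'unsigned char *' would.
    static int writeAsIntReadAsByte(int len, ByteOrder order) {
        ByteBuffer mem = ByteBuffer.allocate(4).order(order);
        mem.putInt(0, len);
        return mem.get(0) & 0xFF;
    }

    // Stores len as a single byte, matching the C declaration.
    static int writeAsByteReadAsByte(int len) {
        ByteBuffer mem = ByteBuffer.allocate(4);
        mem.put(0, (byte) len);
        return mem.get(0) & 0xFF;
    }

    public static void main(String[] args) {
        // Big-endian: the low-order byte lands at offset 3, so byte 0 is 0.
        System.out.println(writeAsIntReadAsByte(8, ByteOrder.BIG_ENDIAN));
        // Little-endian: byte 0 happens to hold 8, but three bytes past the
        // 1-byte slot were also written.
        System.out.println(writeAsIntReadAsByte(8, ByteOrder.LITTLE_ENDIAN));
        // Correct width: exactly one byte is written.
        System.out.println(writeAsByteReadAsByte(8));
    }
}
```

With a typed pointer the compiler would flag this kind of mistake; with a
detyped address nothing does.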

As a more positive example, tomcat-native sometimes has large amounts
of complex native code, in particular for OCSP support. This can now
be mostly rewritten in Java (besides the initial extraction of the
certificate information).

Rémy


>
> -chris




Re: Panama and tomcat-native

2021-10-28 Thread Christopher Schultz

Rémy,

On 10/28/21 10:51, Rémy Maucherat wrote:
> On Thu, Oct 28, 2021 at 4:06 PM Christopher Schultz wrote:
>>
>> Rémy,
>>
>> On 10/26/21 07:46, Rémy Maucherat wrote:
>>> During the past weeks, I examined the state of the Panama project and
>>> what it could do. I know Mark had a look at it three years ago, and it
>>> was not ready yet. This does not appear to be the case anymore and I
>>> could produce a wrapper for OpenSSL and a fully functional
>>> implementation of the OpenSSLContext/OpenSSLEngine that does not use
>>> tomcat-native.
>>
>> Cool. I've only read the README at this point, but can I ask some questions?
>>
>> 0. If this is in Java 17, why can't we use a stock Java 17 for this
>> purpose instead of using the forked Java 18 development build?
>
> The API has changed significantly already from Java 17, and will
> change more. So for now I prefer targeting the upstream API and
> benefit from fixes and improvements.
> Also, jextract is not available in the JDK, so I would have to find a
> version of it that is compatible with the Java 17 API.
>
> Once things are stable, I will likely attempt a backport to the Java 17 API.

Okay, thanks for that explanation.

>> 1. This (currently lengthy) process produces a JAR file, 100% Java code?
>
> Yes !

!!

>> 2. It's the JAR file (well, really bytecode) that is huge when
>> supporting the entire OpenSSL API?
>
> Yes, if you need the whole OpenSSL API, then it's a bit over 3MB.
> Thankfully, I have now verified it can be trimmed down without causing
> problems (and add new calls as needed later), so the current size with
> everything is 133kB. This is great considering tomcat-native can be
> dropped.

The latest version of tcnative that I built was 1.2MiB so even trading
that for a 3MiB JAR file would be okay IMO. Getting it down to a few
dozen KiB is great, too.

When trimming, do we just specify the individual native-C calls we need?
Or do we need to understand transitive calls within the C library we are
calling? I assume that we only have to cross the native barrier once per
call, so those transitive calls are not relevant. We could even grep or
@Annotate our Java code to specify which native calls we are making in
order to auto-generate the list.

>> 3. What "problems" are caused by the large size of that library?
>
> Well, there was the problem of the amount of classes and raw size, but
> more significantly the main class can be huge, so loading that
> probably takes some cycles.

Which main class? Just from your test-driver? Or is there some big-init
method you have to call to get OpenSSLJavaWrapper.class to initialize
itself before you can make native calls?

>>> I think this could be integrated in Tomcat as a module like
>>> "modules/jdbc-pool". Here, likely "modules/openssl-panama".
>>
>> 4. Would modules/openssl-panama essentially be a (potentially
>> complicated) build script without any code?
>
> Ok for the wrapper generated by jextract. But a new
> OpenSSLContext/Engine that uses it is also needed, so that goes in the
> module.

Aha, okay, so the JSSE module itself would be in here, and delegate all
its crypto to OpenSSL.

> Right now, things look good as far as functionality goes. Everything
> except OCSP is implemented. I do get crashes under handshake load, so
> I likely messed something up somewhere, though. Performance is
> equivalent to JNI/tomcat-native, so that's a huge plus if the end goal
> is to drop tomcat-native.
>
> The main downside of the API is that it is detyped. So you write C
> code equivalent with only void* pointers, to give you an idea, which
> you cast or use wherever you like without any warnings. So whatever
> native bits remain are even less safe than before. I don't quite get
> why MemoryAddress is not a generic type, like MemoryAddress<SSL>
> (jextract does generate an empty SSL type, along with the others, if
> you let him) and that would make things nicer. Of course, it is likely
> because it's not doable :D
>
> Quick example: it took me hours to get ALPN working. Why ? Because the
> OpenSSL API is "smart", and looks like:
>     int SSL_callback_alpn_select_proto(SSL* ssl, const unsigned char **out,
>             unsigned char *outlen, const unsigned char *in,
>             unsigned int inlen, void *arg)
> So you see a few pointers and all is well. However, it was not working
> for me, because I was using the length of the array, which is an int.
> The pointer is byte size, however. Oops ;)

:)

Hopefully it won't get to the point where our Java code needs to know
the local (sizeof int).

> On a more positive example, tomcat-native sometimes has large amounts
> of complex native code, in particular for OCSP support. This can now
> be mostly rewritten in Java (besides the initial extraction of the
> certificate information).

That would indeed be an improvement. The less native code we are
responsible for, the better.

-chris




Re: Panama and tomcat-native

2021-10-28 Thread Rémy Maucherat
On Thu, Oct 28, 2021 at 6:13 PM Christopher Schultz
 wrote:
>
> Rémy,
>
> On 10/28/21 10:51, Rémy Maucherat wrote:
> > On Thu, Oct 28, 2021 at 4:06 PM Christopher Schultz
> >  wrote:
> >>
> >> Rémy,
> >>
> >> On 10/26/21 07:46, Rémy Maucherat wrote:
> >>> During the past weeks, I examined the state of the Panama project and
> >>> what it could do. I know Mark had a look at it three years ago, and it
> >>> was not ready yet. This does not appear to be the case anymore and I
> >>> could produce a wrapper for OpenSSL and a fully functional
> >>> implementation of the OpenSSLContext/OpenSSLEngine that does not use
> >>> tomcat-native.
> >>
> >> Cool. I've only read the README at this point, but can I ask some 
> >> questions?
> >>
> >> 0. If this is in Java 17, why can't we use a stock Java 17 for this
> >> purpose instead of using the forked Java 18 development build?
> >
> > The API has changed significantly already from Java 17, and will
> > change more. So for now I prefer targeting the upstream API and
> > benefit from fixes and improvements.
> > Also, jextract is not available in the JDK, so I would have to find a
> > version of it that is compatible with the Java 17 API.
> >
> > Once things are stable, I will likely attempt a backport to the Java 17 API.
>
> Okay, thanks for that explanation.
>
> >> 1. This (currently lengthy) process produces a JAR file, 100% Java code?
> >
> > Yes !
>
> !!
>
> >> 2. It's the JAR file (well, really bytecode) that is huge when
> >> supporting the entire OpenSSL API?
> >
> > Yes, if you need the whole OpenSSL API, then it's a bit over 3MB.
> > Thankfully, I have now verified it can be trimmed down without causing
> > problems (and add new calls as needed later), so the current size with
> > everything is 133kB. This is great considering tomcat-native can be
> > dropped.
>
> The latest version of tcnative that I built was 1.2MiB so even trading
> that for a 3MiB JAR file would be okay IMO. Getting it down to a few
> dozen KiB is great, too.
>
> When trimming, do we just specify the individual native-C calls we need?
> Or do we need to understand transitive calls within the C library we are
> calling? I assume that we only have to cross the native barrier once per
> call, so those transitive calls are not relevant. We could even grep or
> @Annotate our Java code to specify which native calls we are making in
> order to auto-generate the list.

It looks pretty much like reflection: you look up only what you're
using. The "config file" is actually made of command line arguments
for jextract. It works but it's not very nice ;)
The calls to native are pretty boring, but the cool thing is the
callbacks. They also use MethodHandle, and you can use bindTo.
So the native code just calls back your method *in your instance*; the
callbacks don't need to be static. So no more tricks to get the state
or pass it around (a large part of the tomcat-native hacks go away
right there).
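
To give an idea of the bindTo part, here is a minimal sketch using only
java.lang.invoke, without any Panama or OpenSSL code. The class and method
names are made up for the example. In the real thing the bound handle would be
turned into an upcall stub for the native side; that step is left out here
because that API is still changing.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical example: an instance method used as a callback via bindTo.
final class BoundCallbackDemo {

    private final String name;
    private int calls;

    BoundCallbackDemo(String name) {
        this.name = name;
    }

    // The "callback": it uses per-instance state, so it cannot be static.
    int onEvent(int code) {
        calls++;
        return code + name.length();
    }

    int calls() {
        return calls;
    }

    // Looks up onEvent as a virtual method, then binds the receiver so the
    // resulting handle has type (int)int and always targets 'target'.
    static MethodHandle callbackFor(BoundCallbackDemo target) {
        try {
            MethodHandle virtual = MethodHandles.lookup().findVirtual(
                    BoundCallbackDemo.class, "onEvent",
                    MethodType.methodType(int.class, int.class));
            return virtual.bindTo(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stands in for the native caller, which only ever sees the handle.
    static int invoke(MethodHandle callback, int code) {
        try {
            return (int) callback.invokeExact(code);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        BoundCallbackDemo engine = new BoundCallbackDemo("engine-1");
        MethodHandle callback = callbackFor(engine);
        System.out.println(invoke(callback, 5));
        System.out.println(engine.calls());
    }
}
```

The caller never needs a reference to the instance; the state travels inside
the handle.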

> >> 3. What "problems" are caused by the large size of that library?
> >
> > Well, there was the problem of the amount of classes and raw size, but
> > more significantly the main class can be huge, so loading that
> > probably takes some cycles.
>
> Which main class? Just from your test-driver? Or is there some big-init
> method you have to call to get OpenSSLJavaWrapper.class to initialize
> itself before you can make native calls?

jextract generates a huge class (well, technically, it's a chain of
classes that extend each other) and its name is (by default), the name
of your header file. So here it is "openssl_h". The coding style for
Panama then uses "var" (the typing is rather boring) and static
imports.
For example, to create a new SSL context, you just do: "var sslCtx =
SSL_CTX_new(TLS_server_method());" which is exactly like the OpenSSL
API looks in C code.

> >>> I think this could be integrated in Tomcat as a module like
> >>> "modules/jdbc-pool". Here, likely "modules/openssl-panama".
> >>
> >> 4. Would modules/openssl-panama essentially be a (potentially
> >> complicated) build script without any code?
> >
> > Ok for the wrapper generated by jextract. But a new
> > OpenSSLContext/Engine that uses it is also needed, so that goes in the
> > module.
>
> Aha, okay, so the JSSE module itself would be in here, and delegate all
> its crypto to OpenSSL.
>
> > Right now, things look good as far as functionality goes. Everything
> > except OCSP is implemented. I do get crashes under handshake load, so
> > I likely messed something up somewhere, though. Performance is
> > equivalent to JNI/tomcat-native, so that's a huge plus if the end goal is
> > to drop tomcat-native.
> >
> > The main downside of the API is that it is detyped. So you write C
> > code equivalent with only void* pointers, to give you an idea, which
> > you cast or use wherever you like without any warnings. So whatever
> > native bits remain are even less safe than before. I don't quite get
> > why MemoryAddress is not a generic type, lik