Classloader usage

2022-03-18 Thread Piotr P. Karwasz
Hello,

I am trying to understand the way the context classloader and
`ServiceLoader` are used in Log4j 2.x[1].

The `log4j-api` uses `LoaderUtil.getClassLoaders` in several places to
lookup services from the context classloader and the `log4j-api`
classloader.

This procedure has IMHO some problems:
1. The services retrieved this way are assigned to static fields
(except the call in `WatchEventService`): this causes a memory leak if
`log4j-api` is in the server's classloader and the retrieved service
in the web application classloader,

2. There is the small problem that if any service found by
`ServiceLoader.load(Class, ClassLoader)` fails the `instanceof` check,
all the others are thrown away. I solved it in `ServiceLoaderUtil`,

3. Probably this is not a problem, but I fail to understand why we
look up services separately for a classloader and each of its
ancestors.
  If the classloader implements the "delegate to parent" convention,
we'll end up several instances of the same class.
   If the classloader is a webapp classloader and the service
interface is in the webapp classloader, all services in the
classloader ancestors will fail the 'instanceof' test.
   The only situation where this makes sense is if the service
interface is in the server classloader (e.g. `javax.sql.DataSource`),
then we can have two different services with the same class name. Is
this the reason?

4. The following condition in `ThreadContextDataInjector` is pure magic for me:
if (providers.stream().noneMatch(p ->
p.getClass().isAssignableFrom(provider.getClass( {
providers.add(provider);
}
   As I understand it, if we have a base class, we don't want its
derived classes?

I have don't some refactoring of the ServiceLoader code in:
https://github.com/apache/logging-log4j2/pull/804


[1] https://issues.apache.org/jira/browse/LOG4J2-3427


Re: Classloader usage

2022-03-18 Thread Matt Sicker
It sounds like we could use some cleanup around here. Number 3 is
particularly interesting.

As for number 4, that seems to be a form of uniqueness checking (no
two providers of the same class).

On Fri, Mar 18, 2022 at 4:50 AM Piotr P. Karwasz
 wrote:
>
> Hello,
>
> I am trying to understand the way the context classloader and
> `ServiceLoader` are used in Log4j 2.x[1].
>
> The `log4j-api` uses `LoaderUtil.getClassLoaders` in several places to
> lookup services from the context classloader and the `log4j-api`
> classloader.
>
> This procedure has IMHO some problems:
> 1. The services retrieved this way are assigned to static fields
> (except the call in `WatchEventService`): this causes a memory leak if
> `log4j-api` is in the server's classloader and the retrieved service
> in the web application classloader,
>
> 2. There is the small problem that if any service found by
> `ServiceLoader.load(Class, ClassLoader)` fails the `instanceof` check,
> all the others are thrown away. I solved it in `ServiceLoaderUtil`,
>
> 3. Probably this is not a problem, but I fail to understand why we
> look up services separately for a classloader and each of its
> ancestors.
>   If the classloader implements the "delegate to parent" convention,
> we'll end up several instances of the same class.
>If the classloader is a webapp classloader and the service
> interface is in the webapp classloader, all services in the
> classloader ancestors will fail the 'instanceof' test.
>The only situation where this makes sense is if the service
> interface is in the server classloader (e.g. `javax.sql.DataSource`),
> then we can have two different services with the same class name. Is
> this the reason?
>
> 4. The following condition in `ThreadContextDataInjector` is pure magic for 
> me:
> if (providers.stream().noneMatch(p ->
> p.getClass().isAssignableFrom(provider.getClass( {
> providers.add(provider);
> }
>As I understand it, if we have a base class, we don't want its
> derived classes?
>
> I have don't some refactoring of the ServiceLoader code in:
> https://github.com/apache/logging-log4j2/pull/804
>
>
> [1] https://issues.apache.org/jira/browse/LOG4J2-3427


Re: Classloader usage

2022-03-18 Thread Piotr P. Karwasz
On Fri, 18 Mar 2022 at 10:50, Piotr P. Karwasz  wrote:
> I am trying to understand the way the context classloader and
> `ServiceLoader` are used in Log4j 2.x[1].

Since I accidentally clicked "Send" on the previous email before
finishing it, I should probably add some context to the questions
above.

At work we are still using JCL with Log4j 1.x. The nice feature of JCL
is that even if we need to distribute `commons-logging` in the WAR
files, we don't need to attach `log4j`: a server admin can simply add
it to Tomcat's classloader. Log4j 2.x API (and SLF4J) can not work
like this unless:
1. The admin changes the delegation model of the web application
classloader (to look up in the parent's classloader first) and adds
Log4j Core to the server's classloader. Honestly I don't see anything
wrong in doing this on a clean Tomcat, but it is not the way webapps
were designed to work,
   OR
2. The admin adds Log4j Core to each application, either by physically
adding the file to `/WEB-INF/lib` or using Tomcat's virtual resources.
I don't like the former, because it modifies the contents of the WAR
file, while the latter requires some knowledge of Tomcat
administration. Our clients often don't have professional admins or at
least admins with some expertise in Tomcat's administration and my
colleagues in the Call Center really don't like modified installations
and they often forward those problems to us.

Therefore my ideal solution would be to only include `log4j-api` (and
`log4j-web) in our applications (99% of our clients just log to the
console), but with a simple way to use the full version of Log4j 2.x
Core. This could be done:
* either by writing a Log4j API implementation that delegates to
Tomcat JULI (Tomcat does not allow applications to override
`org.apache.juli` classes),
* or by writing an implementation that delegates each call to the
Log4j API copy in the server's classloader.

Piotr


Re: Classloader usage

2022-03-18 Thread Ralph Goers
This is exactly why we love getting new committers on board!

When I wrote this code my thinking was that I have no idea what ClassLoader 
scheme the app is running in. Heck, JBoss had at least 3 of them in the time 
I worked with it years ago.  So the idea here is to simply find implementations 
wherever they can be found and use the ones that work. I know it isn’t pretty 
but I couldn’t figure out a better way. This is one of the major benefits of 
JPMS 
is that it definitely cleans this up.

I have no idea what is going on with item 4.

Ralph

> On Mar 18, 2022, at 2:50 AM, Piotr P. Karwasz  wrote:
> 
> Hello,
> 
> I am trying to understand the way the context classloader and
> `ServiceLoader` are used in Log4j 2.x[1].
> 
> The `log4j-api` uses `LoaderUtil.getClassLoaders` in several places to
> lookup services from the context classloader and the `log4j-api`
> classloader.
> 
> This procedure has IMHO some problems:
> 1. The services retrieved this way are assigned to static fields
> (except the call in `WatchEventService`): this causes a memory leak if
> `log4j-api` is in the server's classloader and the retrieved service
> in the web application classloader,
> 
> 2. There is the small problem that if any service found by
> `ServiceLoader.load(Class, ClassLoader)` fails the `instanceof` check,
> all the others are thrown away. I solved it in `ServiceLoaderUtil`,
> 
> 3. Probably this is not a problem, but I fail to understand why we
> look up services separately for a classloader and each of its
> ancestors.
>  If the classloader implements the "delegate to parent" convention,
> we'll end up several instances of the same class.
>   If the classloader is a webapp classloader and the service
> interface is in the webapp classloader, all services in the
> classloader ancestors will fail the 'instanceof' test.
>   The only situation where this makes sense is if the service
> interface is in the server classloader (e.g. `javax.sql.DataSource`),
> then we can have two different services with the same class name. Is
> this the reason?
> 
> 4. The following condition in `ThreadContextDataInjector` is pure magic for 
> me:
>if (providers.stream().noneMatch(p ->
> p.getClass().isAssignableFrom(provider.getClass( {
>providers.add(provider);
> }
>   As I understand it, if we have a base class, we don't want its
> derived classes?
> 
> I have don't some refactoring of the ServiceLoader code in:
> https://github.com/apache/logging-log4j2/pull/804
> 
> 
> [1] https://issues.apache.org/jira/browse/LOG4J2-3427