Change semantics of `Throwable` parameters

2023-11-21 Thread Piotr P. Karwasz
Hi,

Profiting from release 3.x we could slightly change the interpretation
of throwable parameters used in logging methods.

IIRC in one of our online meetings, we raised the issue of logging
statements like this:

logger.info("Error nr {}: {}", nr, throwable);

which is not considered best practice and is probably not what the
user intended:

logger.info("Error nr {}: {}", nr, throwable.getMessage(), throwable);

A similar problem was reported in PR #1050[2].

Regarding this kind of statements, the OSGi Log[1] Service provides a
very nice alternative interpretation on how to deal with throwable
parameters:

"If the last argument is a Throwable or a ServiceReference, it is
added to the generated LogEntry and then, if the next to last argument
is a ServiceReference or Throwable and not the same type as the last
argument, it is also added to the generated LogEntry. These arguments
will not be used as message arguments."

Should we adopt a similar interpretation in 3.x? The advantages I see are:

1. We fix some inconsistencies in the API:
 * we have a pair `info(String, Throwable)/info(String, Object)`
methods that prevents users from mistakenly pass a `Throwable` as a
log message argument,
 * we only have `info(Object)`,
 * we only have `info(String, Object, Object)`.
2. We simplify our logic: right now we need logic to deal with
throwables in `Message`. On the other hand throwables are not
considered part of a message by any layout and are also included
separately in LogEvent. If we don't need to chase throwables in
parameterized messages, we don't need to parse the format string until
it is required.
3. There might be some cases when including throwables in a log
message leads to information disclosure (e.g. when a user sees a log
message, but not the attached throwable). This way we'll have an API
which is easy to use and hard to misuse.

Piotr

[1] 
https://docs.osgi.org/javadoc/osgi.core/8.0.0/org/osgi/service/log/Logger.html
[2] https://github.com/apache/logging-log4j2/pull/1050


Re: Change semantics of `Throwable` parameters

2023-11-21 Thread Matt Sicker
This sounds like it might be a good basis for figuring out a parallel v3 API 
for a “hard to mis-use” style API. However, once you go that route, you start 
to wonder how useful templated log messages are when you can capture a lambda 
instead. Parameterized log messages might work better as structured log 
messages, something that is awkward to use in the API at the moment.
—
Matt Sicker

> On Nov 21, 2023, at 08:16, Piotr P. Karwasz  wrote:
> 
> Hi,
> 
> Profiting from release 3.x we could slightly change the interpretation
> of throwable parameters used in logging methods.
> 
> IIRC in one of our online meetings, we raised the issue of logging
> statements like this:
> 
> logger.info("Error nr {}: {}", nr, throwable);
> 
> which is not considered best practice and is probably not what the
> user intended:
> 
> logger.info("Error nr {}: {}", nr, throwable.getMessage(), throwable);
> 
> A similar problem was reported in PR #1050[2].
> 
> Regarding this kind of statements, the OSGi Log[1] Service provides a
> very nice alternative interpretation on how to deal with throwable
> parameters:
> 
> "If the last argument is a Throwable or a ServiceReference, it is
> added to the generated LogEntry and then, if the next to last argument
> is a ServiceReference or Throwable and not the same type as the last
> argument, it is also added to the generated LogEntry. These arguments
> will not be used as message arguments."
> 
> Should we adopt a similar interpretation in 3.x? The advantages I see are:
> 
> 1. We fix some inconsistencies in the API:
> * we have a pair `info(String, Throwable)/info(String, Object)`
> methods that prevents users from mistakenly pass a `Throwable` as a
> log message argument,
> * we only have `info(Object)`,
> * we only have `info(String, Object, Object)`.
> 2. We simplify our logic: right now we need logic to deal with
> throwables in `Message`. On the other hand throwables are not
> considered part of a message by any layout and are also included
> separately in LogEvent. If we don't need to chase throwables in
> parameterized messages, we don't need to parse the format string until
> it is required.
> 3. There might be some cases when including throwables in a log
> message leads to information disclosure (e.g. when a user sees a log
> message, but not the attached throwable). This way we'll have an API
> which is easy to use and hard to misuse.
> 
> Piotr
> 
> [1] 
> https://docs.osgi.org/javadoc/osgi.core/8.0.0/org/osgi/service/log/Logger.html
> [2] https://github.com/apache/logging-log4j2/pull/1050



Re: [log4j] Unstable tests on Windows

2023-11-21 Thread Matt Sicker
The GH CI build also fails randomly compared to the green builds that I push in 
the first place. Occasionally, this might be a platform-specific issue (which 
is what I’d ideally be discovering from CI), but the failing tests are 
generally fragile ones in the first place. While I have ideas on how to make 
those areas of the codebase more testable, some of them can’t be more readily 
tested without drastic measures. Specifically, our tests around rolling file 
appender things and anything else relying on scheduled actions. 
Besides using an actual file system in the tests (these could run faster if 
using a virtual file system instead like [jimfs][0]), the largest limitation 
here is that we rely on a ScheduledThreadPoolExecutor for scheduling actions 
like rollovers, and this class relies directly on System::nanoTime for 
scheduling decisions, so the only way to mock the clock here is to either run 
the test in a modified runtime where the system calls behind nanoTime are 
mocked or to find a similar OSS implementation (or adapt from the JDK here if 
the classes are public domain like the Doug Lea ones usually are) to use 
instead which allows for controlling the clock. The latter option is something 
I considered before, but it’s not something to be taken lightly since we 
already expect the platform version of ScheduledThreadPoolExecutor to be stable 
and correct (or hopefully correct; this implementation is more likely to be 
thoroughly reviewed compared to something custom). In fact, this alternative 
path is basically porting or reimplementing the core of Quartz, so there is 
some reference code to adapt there come to think of it.

On the file system side, this would require either more direct adoption of the 
NIO2 API rather than the original IO API, or we need a similar abstraction to 
make mocking a file system easier to accomplish. Then we can have separate 
integration tests that use a real filesystem that could be more predictable if 
necessary.

Suffice to say, there’s usually a reason why I get into refactoring in this 
project, and most of the time, it involves making it easier to test and extend 
things both in the project and as a downstream user. The rolling appender area 
is something I’d like to get to sometime, but we’ve got about ten years worth 
of development to untangle (which we’ve largely accomplished in the main branch 
already with the modules there, but we can still break out more modules that 
involve more complicated refactoring).

[0]: https://github.com/google/jimfs
—
Matt Sicker

> On Nov 20, 2023, at 08:40, Gary Gregory  wrote:
> 
> The GH CI builds on every push (as opposed to commi) IIRC.
> 
> Gary
> 
> On Mon, Nov 20, 2023, 9:34 AM Ralph Goers 
> wrote:
> 
>> Gary uses Windows as his development OS. He is probably the only one of us
>> who does. So he inevitably finds these issues before the rest of us.
>> 
>> I don’t know if the CI has a build that runs on Windows for every commit.
>> 
>> Ralph
>> 
>>> On Nov 20, 2023, at 6:12 AM, Robert Middleton 
>> wrote:
>>> 
>>> Are the tests run on Windows through Github workflows?  It doesn't
>>> look like it to me.
>>> 
>>> If you need access to a Windows machine, you can download a
>>> development VM straight from Microsoft:
>>> 
>> https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/
>>> 
>>> -Robert Middleton
>>> 
>>> On Mon, Nov 20, 2023 at 7:40 AM Apache 
>> wrote:
 
 In my experience they never get fixed. To be honest, when I was doing
>> the releases I would have these failures investigated to determine if it
>> was a trait problem vs a problem in the code being released. If it was the
>> latter I would cancel the vote. The only time tests should be disabled is
>> if we know it is a problem in the test but can’t figure out how to fix it.
 
 I also don’t ever recall Gary ever having more than one or two tests
>> fail in a run.
 
 Ralph
 
> On Nov 20, 2023, at 5:00 AM, Volkan Yazıcı  wrote:
> 
> I am not asking to disable Windows tests. I am asking to disable tests
> and only those tests that have a failure rate on Windows higher than,
> say, 30%. To be precise, I think there are 2-3 of them dealing with
> network sockets and rolling file appenders. I am not talking about
> dozens or such.
> 
> After disabling them, we can create a ticket referencing them. So that
> interested parties can fix them.
> 
>> On Mon, Nov 20, 2023 at 12:25 PM Piotr P. Karwasz
>>  wrote:
>> 
>> Hi Volkan,
>> 
>>> On Mon, 20 Nov 2023 at 09:36, Volkan Yazıcı  wrote:
>>> 
>>> As Gary (the only Windows user among the active Log4j maintainers,
>>> AFAIK) has noticed several times, Log4j tests on Windows are pretty
>>> unstable. It not only fails on Gary's laptop, but Piotr and I need to
>>> give Windows tests in CI a kick on a regular basis. Approximately one
>>> out of three CI runs fails on Windows. Piotr