Re: Signaling configuration errors

2025-03-03 Thread Jeff Thomas
Hi Piotr/Gary,

I don't know if this answers the "transaction" question, but here is a little 
input... as the Germans here say, "adding my mustard." 🙂

For my recent changes (adding builders to the Filters to replace the old 
factory methods), I have been doing the following.

  1. The single private constructor of a configuration class takes its own 
Builder as its single argument.

The constructor throws unchecked exceptions on misconfiguration (e.g. 
IllegalArgumentException).  These should already have been validated in the 
builder, but the constructor keeps safety checks - for example, around 
compiling a Pattern from a regex string.

But this constructor is only called by the Builder#build() method (or a 
subclass constructor via super).

This also has the advantage of being able to cleanly call 'super(builder)' - 
for example, in the Filter implementations to pass the builder up to 
AbstractFilter to pick up the onMatch/onMismatch attributes.

There are currently also multiple locations where configuration problems are 
not guarded (e.g. parsing an Integer/Long from a String, where the 
NumberFormatException is thrown unchecked). For example, the "shutdownTimeout" 
attribute in the XmlConfiguration constructor.
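To make the unguarded-parsing point concrete, here is a minimal sketch of how such an attribute could be parsed defensively. This is illustrative only - the class and method names are hypothetical, not the actual XmlConfiguration code:

```java
// Hypothetical helper showing guarded parsing of a numeric attribute string.
// Without the try/catch, Long.parseLong would throw NumberFormatException
// unchecked out of configuration loading.
final class AttributeParser {

    // Parse a numeric attribute, falling back to a default instead of
    // letting NumberFormatException propagate.
    static long parseTimeout(String value, long defaultMillis) {
        if (value == null) {
            return defaultMillis;
        }
        try {
            return Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            // Stand-in for logging to the StatusLogger.
            System.err.println("ERROR: invalid timeout '" + value + "', using default");
            return defaultMillis;
        }
    }
}
```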

  2. The Builder's build() method validates its state - logging errors to the 
StatusLogger and returning null on misconfiguration.

The call to the constructor is wrapped in try/catch so that any unchecked 
exceptions from #1 are caught and logged - then null is returned.
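Points 1 and 2 above can be sketched roughly as follows. All names here are illustrative, not the actual Log4j classes, and System.err stands in for the StatusLogger:

```java
import java.util.regex.Pattern;

// Sketch of the pattern described above: a single private constructor that
// takes the Builder, and a build() method that validates, logs, and returns
// null instead of propagating unchecked exceptions.
final class SampleRegexFilter {

    private final Pattern pattern;

    // 1. The private constructor takes its own Builder as its single argument.
    //    It may still throw unchecked exceptions (e.g. PatternSyntaxException)
    //    as a safety net, even though the builder should have validated already.
    private SampleRegexFilter(Builder builder) {
        this.pattern = Pattern.compile(builder.regex);
    }

    Pattern getPattern() {
        return pattern;
    }

    static final class Builder {
        private String regex;

        Builder setRegex(String regex) {
            this.regex = regex;
            return this;
        }

        // 2. build() validates its state, logs errors, and returns null on
        //    misconfiguration; the constructor call is wrapped in try/catch
        //    so unchecked exceptions are caught and logged too.
        SampleRegexFilter build() {
            if (regex == null) {
                System.err.println("ERROR: 'regex' attribute must be provided");
                return null;
            }
            try {
                return new SampleRegexFilter(this);
            } catch (RuntimeException e) { // e.g. PatternSyntaxException
                System.err.println("ERROR: invalid configuration: " + e.getMessage());
                return null;
            }
        }
    }
}
```

A subclass could likewise call 'super(builder)' from its own private constructor to pass shared attributes up the hierarchy.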

I have also been incrementally adding JSpecify nullability annotations (per 
Piotr).  If you don't perform the explicit checks, the code can light up like a 
Christmas tree in the IDE.  This happens quite often because the Builder 
'@PluginAttribute's are '@Nullable' by design, but they should sometimes be 
@NonNull by the time they reach the constructor. 😕 For this reason, the 
recently implemented Builder 'isValid()' method doesn't work very well with the 
nullability annotations - there is no inferred contract.
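The nullability mismatch can be sketched like this. The @Nullable annotation below is a minimal local stand-in defined only to keep the snippet self-contained (the real annotations live in org.jspecify.annotations), and the filter class is hypothetical:

```java
import java.util.Objects;

// Local stand-in for the JSpecify @Nullable annotation, so this sketch
// compiles without the external dependency.
@interface Nullable {}

final class SampleThresholdFilter {

    private final String level;

    // By the time the constructor runs, 'level' must be non-null. The explicit
    // requireNonNull documents that contract in a way a standalone isValid()
    // method cannot: nullability checkers infer nothing from isValid().
    private SampleThresholdFilter(Builder builder) {
        this.level = Objects.requireNonNull(builder.level, "level");
    }

    String getLevel() {
        return level;
    }

    static final class Builder {
        // Plugin attributes are nullable by design: the configuration file
        // may simply omit them.
        private @Nullable String level;

        Builder setLevel(@Nullable String level) {
            this.level = level;
            return this;
        }

        SampleThresholdFilter build() {
            if (level == null) { // explicit check narrows the type for the checker
                return null;     // (a real builder would log to the StatusLogger first)
            }
            return new SampleThresholdFilter(this);
        }
    }
}
```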

One thing to consider (if I haven't misinterpreted it): I believe the current 
implementation of CompositeConfiguration discards a configuration that fails 
during reconfiguration, including its configuration source.  So, for example, 
if an XML configuration fails under an all-or-nothing approach, or is 
temporarily unavailable due to a file lock or a network issue, the 
configuration is discarded - a subsequent reload has no more information about 
the failed configuration.  It is simply gone.

Cheers, Jeff


From: Gary Gregory 
Sent: Monday, March 3, 2025 2:39 PM
To: Apache Logging Developers List 
Subject: Re: Signaling configuration errors

I like the idea of a transactional configuration (all or nothing).
Internally I think we should use a custom unchecked runtime exception which
would be logged by us from our top level API.

Gary
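Gary's suggestion of a single custom unchecked exception, caught only at the top-level API, could look roughly like the sketch below. All names are hypothetical, not the actual Log4j API:

```java
// Internal code throws one custom unchecked exception type on any
// misconfiguration...
final class ConfigException extends RuntimeException {
    ConfigException(String message) {
        super(message);
    }
}

final class Reconfigurator {
    private String activeConfig = "default";

    // ...internal parsing/building simply throws, simplifying flow control...
    private String parse(String source) {
        if (source == null || source.isEmpty()) {
            throw new ConfigException("empty configuration source");
        }
        return source;
    }

    // ...and the top-level entry point is the single place that catches and
    // logs it. The swap is all-or-nothing: a failed reload leaves the old
    // configuration active.
    String reconfigure(String source) {
        try {
            activeConfig = parse(source);
        } catch (ConfigException e) {
            // Stand-in for logging via the StatusLogger.
            System.err.println("ERROR: reconfiguration failed, keeping previous: "
                    + e.getMessage());
        }
        return activeConfig;
    }
}
```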

On Mon, Mar 3, 2025, 06:54 Piotr P. Karwasz wrote:

> Hi all,
>
> Configuration problems in our builders are handled in multiple ways:
>
> 1. Sometimes we call `LOGGER.error` and return `null`.
>
> 2. Sometimes we throw an explicit exception. The types of exceptions
> vary: NPE, IllegalArgumentException, IllegalStateException, etc. There
> is a `o.a.l.l.core.config.ConfigurationException`, but it doesn't seem
> to be used often.
>
> 3. Sometimes exceptions, usually NPEs, are thrown by the constructor of
> a plugin. Most of the time I consider these bugs.
>
> Could we agree on a preferred way to signal configuration errors? I
> don't plan to modify the behavior of old code, but we should apply a
> consistent strategy on new builders.
>
> I do like throwing exceptions, which often simplifies flow control, but
> when I started working on Log4j, I was told 1. was the way to go.
>
> On a related matter, Ralph said on Slack that:
>
>  > Failure during configuration should be caught and cause the previous
> configuration to remain intact.
>
> While I agree with this "all-or-nothing" strategy, this is not the way
> it currently works. Regardless of the strategy chosen to communicate the
> problem, a misconfigured component will simply be omitted. This can
> cause problems down the line, like:
>
> 2025-03-03T11:49:26.081922076Z main ERROR RollingFileAppender 'ROLLING':
> When no file name is provided a DirectFileRolloverStrategy must be
> configured
> 2025-03-03T11:49:26.082046726Z main ERROR Null object returned for
> RollingFile in Appenders.
> 2025-03-03T11:49:26.083912635Z main ERROR Unable to locate appender
> "ROLLING" for logger config "root"
>
> However, these problems are not fatal and the configuration replaces the
> previous one.
>
> What do you think?
>
> Piotr
>
>


Question: "Garbage-Free" AbstractFilter#filter(...) methods.

2025-03-01 Thread Jeff Thomas
Hello Dev Team,

I am a totally fresh "committer" so forgive me if I ask a silly question.

I got into a discussion with Piotr about the Log4j Filters and the overloaded 
filter methods (14 in total) and asked whether the extra API is necessary when 
there is a varargs "filter(..., final Object... params)" method.

He explained that this is the "garbage-free" approach: it avoids the creation 
of a temporary varargs object array.
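The trade-off can be illustrated with a stripped-down sketch (purely illustrative; the real AbstractFilter overloads have more parameters and return a filter Result):

```java
// Fixed-arity overloads vs. the varargs fallback: the Java compiler prefers
// an exact-arity method over varargs, so calls with few parameters never
// allocate the temporary Object[]. The counter below tracks how many calls
// fell through to the array-allocating varargs method.
final class SketchFilter {

    static int varargsCalls = 0; // each such call allocates a fresh Object[]

    // Fixed-arity overloads: no array allocated at the call site.
    String filter(String msg, Object p0) {
        return msg + ":" + p0;
    }

    String filter(String msg, Object p0, Object p1) {
        return msg + ":" + p0 + ":" + p1;
    }

    // Varargs fallback, used only when no fixed-arity overload matches.
    String filter(String msg, Object... params) {
        varargsCalls++;
        return msg + ":" + params.length + " params";
    }
}
```

Unrolling this up to ten parameters, across every filter method signature, is what produces the 14 overloads and the boilerplate in every concrete Filter.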

I noticed, however, that all of the MessageFactory2 implementations except one 
(ReusableMessageFactory) ultimately call some varargs method anyway at the 
tail end of the call chain.

I asked whether the situation has changed, since Log4j 3.x targets JDK 17 and 
there have been major improvements in garbage collection.  Ultimately this is 
*a lot* of boilerplate code, since all 14 methods are implemented in every 
concrete Filter implementation.

Piotr suggested I ask on this mailing list what the general opinion is.

A few comments from our discussion:
* "Garbage-free semantics were important for the financial industry 10 years 
ago. I am not sure if they still are."
* "the size of those methods considerably bloats log4j-api"
* "We recently revamped the performance page. Taking the opportunity, we added 
warnings that 'async' and 'garbage-free' don't necessarily mean better 
performance."
* "'Garbage-free' does not mean less memory consumption. Users ask: 'I enabled 
garbage-free mode, how come I have 200 MiB more memory usage?'"
* "A performance specialist at Amazon told us that memory allocation is 
painfully slow on AWS Graviton CPUs, so garbage-free is nice there." (I found 
this 2022 article about it: 
https://aws.amazon.com/blogs/big-data/understanding-the-jvmmemorypressure-metric-changes-in-amazon-opensearch-service/)
* "We all have the same question."

Thoughts?

Best regards,
Jeff Thomas

Note: I also had a discussion with my AI tool ***which never makes mistakes*** 
🤐and it argued the following:

Historical Importance of Garbage-Free Semantics:
* 10+ years ago, the financial industry and other performance-critical fields 
(e.g., gaming, telemetry, high-frequency trading) placed significant emphasis 
on garbage-free semantics because older garbage collectors (like CMS) struggled 
to deal with short-lived allocations efficiently.
* For example, creating Object[] allocations for each log event could overwhelm 
the GC in high-throughput applications, leading to unpredictable pauses.
* The focus back then was mitigating GC pressure by avoiding temporary garbage 
altogether in performance-critical paths.

Today:
* Modern GCs (e.g., G1GC, ZGC, Shenandoah) have evolved significantly to handle 
short-lived garbage extremely efficiently, as it resides in the "Young 
Generation" and is quickly reclaimed.
* Garbage-free semantics usually don't improve latency or throughput 
significantly anymore unless the application is running in a very specialized 
environment (e.g., AWS Graviton CPUs with slower allocation mechanics) or has 
custom performance constraints.

Bottom Line:
* The importance of garbage-free semantics has diminished in most applications, 
especially for typical logging use cases, but specific environments (e.g., AWS 
Graviton or highly latency-sensitive systems) may still care deeply.