Re: [log4j] Question about the initial configuration created in LoggerContext

2022-12-28 Thread Piotr P. Karwasz
Hi Matt,

On Sun, 18 Dec 2022 at 20:30, Matt Sicker  wrote:
> During this bootstrapping, if the configuration location is available (such 
> as for a unit test), should LoggerContext set up the configuration provided? 
> Or is there some sort of cyclic dependency here preventing us from loading 
> ConfigurationFactory right away? So far, the only cyclic dependencies I’ve 
> found are related to plugins created in the DefaultConfiguration (or the 
> NullConfiguration in some cases), but those are already commented as such 
> (like in PatternLayout).

I think we should rely more on our `LifeCycle` abstraction:
`Configuration` starts in the "initializing" state and does not have
any subcomponents (especially those that require a `LoggerContext` to
be present) until `initialize()` is called.

We can do the same thing with `LoggerContext`: after the constructor
exits it does not have a configuration and is in the "initializing"
state. The configuration will be created on an `initialize()` call
(maybe chained from `start()`). By then we can safely pass a reference
to `LoggerContext` to the `AbstractConfiguration` constructor.

Logging does not occur until `start()` is called.

Piotr


Re: [log4j] Question about the initial configuration created in LoggerContext

2022-12-28 Thread Ralph Goers



Ralph

> On Dec 28, 2022, at 7:01 AM, Piotr P. Karwasz  wrote:
> 
> Hi Matt,
> 
> On Sun, 18 Dec 2022 at 20:30, Matt Sicker  wrote:
>> During this bootstrapping, if the configuration location is available (such 
>> as for a unit test), should LoggerContext set up the configuration provided? 
>> Or is there some sort of cyclic dependency here preventing us from loading 
>> ConfigurationFactory right away? So far, the only cyclic dependencies I’ve 
>> found are related to plugins created in the DefaultConfiguration (or the 
>> NullConfiguration in some cases), but those are already commented as such 
>> (like in PatternLayout).
> 
> I think we should rely more on our `LifeCycle` abstraction:
> `Configuration` starts in the "initializing" state and does not have
> any subcomponents (especially those that require a `LoggerContext` to
> be present) until `initialize()` is called.

Actually, a LoggerContext is hard-wired with a DefaultConfiguration. So as soon 
as it exists it has a Configuration. Once the Configuration identified from the 
configLocation is created the DefaultConfiguration is replaced.  This is the 
discussion Matt was having in his previous email and didn’t understand why it 
is required to work that way. Note that once the LoggerContext is constructed 
it will be “wired” by the ContextSelector and will be usable by any thread.

> 
> We can do the same thing with `LoggerContext`: after the constructor
> exits it does not have a configuration and is in the "initializing"
> state.

After the Constructor exists is has a DefaultConfiguration.

> The configuration will be created on an `initialize()` call
> (maybe chained from `start()`). By then we can safely pass a reference
> to `LoggerContext` to the `AbstractConfiguration` constructor.
> 
> Logging does not occur until `start()` is called.

I don’t believe this is true. Once the ContextSelector registers it any calls 
to log will use it.

Ralph

Re: [log4j] Question about the initial configuration created in LoggerContext

2022-12-28 Thread Ralph Goers


> On Dec 28, 2022, at 10:37 AM, Ralph Goers  wrote:
> 
> 
> 
> Ralph
> 
>> On Dec 28, 2022, at 7:01 AM, Piotr P. Karwasz  
>> wrote:
>> 
>> Hi Matt,
>> 
>> On Sun, 18 Dec 2022 at 20:30, Matt Sicker  wrote:
>>> During this bootstrapping, if the configuration location is available (such 
>>> as for a unit test), should LoggerContext set up the configuration 
>>> provided? Or is there some sort of cyclic dependency here preventing us 
>>> from loading ConfigurationFactory right away? So far, the only cyclic 
>>> dependencies I’ve found are related to plugins created in the 
>>> DefaultConfiguration (or the NullConfiguration in some cases), but those 
>>> are already commented as such (like in PatternLayout).
>> 
>> I think we should rely more on our `LifeCycle` abstraction:
>> `Configuration` starts in the "initializing" state and does not have
>> any subcomponents (especially those that require a `LoggerContext` to
>> be present) until `initialize()` is called.
> 
> Actually, a LoggerContext is hard-wired with a DefaultConfiguration. So as 
> soon as it exists it has a Configuration. Once the Configuration identified 
> from the configLocation is created the DefaultConfiguration is replaced.  
> This is the discussion Matt was having in his previous email and didn’t 
> understand why it is required to work that way. Note that once the 
> LoggerContext is constructed it will be “wired” by the ContextSelector and 
> will be usable by any thread.
> 
>> 
>> We can do the same thing with `LoggerContext`: after the constructor
>> exits it does not have a configuration and is in the "initializing"
>> state.
> 
> After the Constructor exists is has a DefaultConfiguration.
> 
>> The configuration will be created on an `initialize()` call
>> (maybe chained from `start()`). By then we can safely pass a reference
>> to `LoggerContext` to the `AbstractConfiguration` constructor.
>> 
>> Logging does not occur until `start()` is called.
> 
> I don’t believe this is true. Once the ContextSelector registers it any calls 
> to log will use it.
> 
> Ralph

Note that LoggerContext.start() uses tryLock. If initialization is in progress 
then start() will simply return. So other threads will log while initialization 
is occurring. This is intentional.

Ralph

Re: [log4j] Question about the initial configuration created in LoggerContext

2022-12-28 Thread Piotr P. Karwasz
Hi Ralph

On Wed, 28 Dec 2022 at 18:38, Ralph Goers  wrote:
> > On Dec 28, 2022, at 7:01 AM, Piotr P. Karwasz  
> > wrote:
> > I think we should rely more on our `LifeCycle` abstraction:
> > `Configuration` starts in the "initializing" state and does not have
> > any subcomponents (especially those that require a `LoggerContext` to
> > be present) until `initialize()` is called.
>
> Actually, a LoggerContext is hard-wired with a DefaultConfiguration. So as 
> soon as it exists it has a Configuration. Once the Configuration identified 
> from the configLocation is created the DefaultConfiguration is replaced.  
> This is the discussion Matt was having in his previous email and didn’t 
> understand why it is required to work that way. Note that once the 
> LoggerContext is constructed it will be “wired” by the ContextSelector and 
> will be usable by any thread.

I was aware of that, but I always thought of the DefaultConfiguration
as an overly-expensive null checkguard.

> > Logging does not occur until `start()` is called.
>
> I don’t believe this is true. Once the ContextSelector registers it any calls 
> to log will use it.

You are right: the DefaultConfiguration is ready to log as soon as the
LoggerContext constructor exits. Thank you for the clarification.

Piotr


[log4j] The grand infra revamp

2022-12-28 Thread Volkan Yazıcı
Hello,

I want to share some updates from my side on what I have been working on
for the last couple of months. In particular, I expect certain changes to
have an ASF-wide impact! Sounds interesting? Continue reading.

*What is up with `maven-changes-plugin`?*

The current `site` phase takes ages to complete. This significantly hinders
releases and more frequent website updates. The main culprit is `site`
phase plugins need to be executed for 50+ modules. As my investigation
turned out, we can actually either drop or replace all such plugins, except
one: `maven-changes-plugin`, that is used to generate the changelog from
`changes.xml`. `maven-changes-plugin` is also a major source of
merge-conflicts in between feature branches and PRs, since they all need to
touch to the same file: `changes.xml`.

*Is it only about overhauling the `site` phase?*

No. As we agreed on, we want to migrate the issue tracking from JIRA to
GitHub Issues. `maven-changes-plugin` blocks us there too, since it only
supports a single issue tracking system.

*Enter `log4j-changelog`*

I have written simple Java command line applications to perform the
following tasks:

   1. Populate
   `src/site/changelog//_.xml` files
   from `changes.xml`
   2. Generate AsciiDoc-formatted changelog files from the populated
   `src/site/changelog`
   3. Make `src/site/changelog` ready before a release

These technical tools will not only help us to replace
`maven-changes-plugin`, support multiple issue trackers, and enable great
simplification of the `site` phase, due to its one-changelog-file-per-issue
convention, they will make changelog merge conflicts a thing of the past
too!

Okay, great! Since the `maven-changes-plugin` successor is there, what are
we exactly waiting for?

*Enter `log4j-tools`*

I have implemented `log4j-changelog` as a not-deployed Log4j module, though
Ralph insisted on not having this within Log4j to make it available for
other Log4j projects and since it needs to be shared by (and hence, sync'ed
in between) `release-2.x` and `master` branches. He proposed that we host
this in `log4j-tools` .
Since `log4j-tools` repository never had a release before, I let my
imagination go wild there:

   - Adopted the Maven-recommended way of setting up a BOM:
   `logging-parent` parents `log4j-bom`, which parents `log4j-parent`, which
   parents all other modules. There the effective POM of `log4j-bom` is
   stripped-down from its parent and all unnecessary weight via
   `flatten-maven-plugin` voodoo.
   - Switched to CI-friendly Maven versioning
   
   - Configured GitHub Actions to make releases!

(As you might guess, if experiment succeeds, I will port all these fancy
stuff to Log4j too.)

*Releasing... via GitHub Actions!?!*

ASF project artifacts are required to be signed and up until now that has
**always** been performed manually by release maintainers. Releasing **and
signing** artifacts in a fully-automated fashion in CI... "Surely You're
Joking, Mr. Feynman!" But I am not! I have shared a technical
implementation proposal in `members@`, got approval from ASF Security Team
head Mark J. Cox
, and now
I am waiting for INFRA-23996
 to be implemented!

What does this effectively mean? To the best of my knowledge, *`log4j-tools`
will be the first ASF project ever that is released via CI!* This will
enable various release automation enhancements, not to mention the software
supply chain traceability it will bring as well.

*Where are we now?*

Once INFRA-23996  is
implemented and `log4j-tools` (containing `log4j-changelog`) is released,
the following PR train will be merged:

   - #1145  – replace
   `maven-changes-plugin` with `log4j-changelog`
   - #1166  – simplify
   `site` goal (`./mvnw site` takes ~30 seconds!)
   - #1172  – update
   sources, docs, links, etc. to reflect the migration from JIRA to GitHub
   Issues

*What are we waiting for?*

We *_can_* release `log4j-tools` without INFRA-23996
, the old school way,
i.e., via running `./mvnw deploy` from a personal laptop. That said, I
believe releasing `log4j-tools` via CI will have a way bigger impact beyond
the scope of Log4j, and `log4j-tools` constitutes a great candidate for
such an experiment. We are all set: `log4j-tools` GitHub Actions workflows

are ready to make an automated release. Infra team has indicated in
INFRA-23996  that they
will implement it in a couple of days

Re: [log4j] Question about the initial configuration created in LoggerContext

2022-12-28 Thread Matt Sicker
Alright, let me clarify then:

Based on how things initialize in Log4j (at least in master, though probably 
similarly in 2.x), the creation of a LoggerContext is effectively an atomic 
operation, though it may be constructed more than once before returning a 
common singleton instance (per ClassLoader or whatever your strategy is). This 
is more obvious when checking a stack trace of a test that relies on LogManager 
instead of using a JUnit fixture, though it looks something like this:

LogManager::getLogger -> LogManager::getContext -> 
Log4jContextFactory::getContext -> ContextSelector::getContext -> [assuming 
ClassLoaderContextSelector: logic to look for existing LoggerContext; if none 
found] -> LoggerContext::new -> DefaultConfiguration::new

So while it’s not a strict blocking singleton (works more like Lazy::relaxed), 
only one LoggerContext instance is returned from the ContextSelector before 
it’s potentially started. In either case, by the time the LoggerContext has 
returned and we unwind from the call stack, we should actually have a started 
configuration by the time any log messages could possibly be processed. 
Usually, this is a DefaultConfiguration instance, but it seems like it _could_ 
be a proper Configuration instance, too, assuming we can load 
ConfigurationFactory plugins at this point.

I can’t really think of a way to violate this and cause log messages to be sent 
to the default configuration when a custom configuration is started right away. 
There may be a way to abuse reflection to set things up in an inconsistent 
state, but there seems to be appropriate happens-before relationships during 
initialization such that startup is not necessarily “blocking” in the I/O 
sense, but it’s blocking in the data dependencies in the program. As it is now, 
the DefaultConfiguration instance is created in the LoggerContext constructor; 
this same place could potentially load a ConfigurationFactory if a config 
location is specified (or just in general). I’m refactoring this area to use 
builder classes, too, so it doesn’t even have to physically be in the 
constructor.

—
Matt Sicker

> On Dec 28, 2022, at 01:52, Ralph Goers  wrote:
> 
> Applications can be multi-threaded. I believe we load the 
> DefaultConfiguration so other threads aren’t blocked by Log4j initialization. 
> The statement that “no logging can possibly occur until after at least said 
> construction” isn’t true.
> 
> Ralph
> 
>> On Dec 27, 2022, at 12:12 PM, Matt Sicker  wrote:
>> 
>> In this particular scenario, a file URI has been provided to the 
>> LoggerContext constructor. Given the fact that a Logger instance itself is 
>> not constructed until _after_ the LoggerContext has been constructed, no 
>> logging can possibly occur until at least after said construction. In our 
>> current implementation, we always load a DefaultConfiguration in 
>> LoggerContext before reconfiguring to use the real config.
>> 
>> This applies to any possible ContextSelector, by the way, since those 
>> control the creation of LoggerContext instances in the first place (unless 
>> you’re using something lower level to start things up like one of the 
>> LoggerContext constructors I’m referring to, but that’s always handled by 
>> the underlying ContextSelector in Core). The Java memory model guarantees 
>> that constructing a class is an atomic operation; the result of the 
>> constructor is safely published and available for any and all threads to 
>> consume. This is the main reason why using the `final` keyword is useful in 
>> Java: final fields in classes are the safely published state (assuming the 
>> underlying reference is an immutable or thread-safe structure of course), 
>> and volatile fields require memory fencing before doing any reads or writes 
>> to them in order for threads to all see the same state. In theory, we may be 
>> able to load a configuration during the constructor which would help avoid 
>> all those weird hacks in place by Spring Boot et al. for disabling logging 
>> until they can bootstrap themselves first.
>> —
>> Matt Sicker
>> 
>>> On Dec 19, 2022, at 08:11, Ralph Goers  wrote:
>>> 
>>> I am not really sure what the question here is. When you pass in a 
>>> configuration location you are going to end up with a LoggerContext that 
>>> references a Configuration derived from the information at that location. 
>>> Yes, it may create a DefaultConfiguration along the way but that is 
>>> primarily because logging may need to occur while Log4j is configuring. 
>>> 
>>> As for the external context, that was initially created to support binding 
>>> with things like Spring. However, the way Spring Boot ended up using it 
>>> caused some issues for adding extended Spring support so the LoggerContext 
>>> now contains a Map so that multiple “external” values can be captured. The 
>>> ExternalContext is just stored in a specific key. Given how the injector 
>>> worked the last time I looked at it I don’t really s

Re: [log4j] Question about the initial configuration created in LoggerContext

2022-12-28 Thread Matt Sicker
Using LifeCycle more would be neat, though its semantics aren’t well defined. 
In the Injector API, any no-arg method annotated with @Inject is considered 
equivalent to a @PostConstruct-style method which is similar to 
`LifeCycle::initialize`. The start() and stop() methods are more 
domain-specific so far, and any attempt at automating lifecycles along those 
would be somewhat complex without a complete understanding of the lifecycle of 
various subsystems. Generally speaking, the plugin and DI system have worked 
fairly well so far with allowing concepts to remain extremely modular such that 
working on, say, the rolling appender, has nothing to do with working on async 
logging or pattern conversion or lookups or anything else. LifeCycle is used ad 
hoc by many of these very independent concepts which would require a more 
detailed understanding of everything.

So, uh, maybe one day!
—
Matt Sicker

> On Dec 28, 2022, at 08:01, Piotr P. Karwasz  wrote:
> 
> Hi Matt,
> 
> On Sun, 18 Dec 2022 at 20:30, Matt Sicker  wrote:
>> During this bootstrapping, if the configuration location is available (such 
>> as for a unit test), should LoggerContext set up the configuration provided? 
>> Or is there some sort of cyclic dependency here preventing us from loading 
>> ConfigurationFactory right away? So far, the only cyclic dependencies I’ve 
>> found are related to plugins created in the DefaultConfiguration (or the 
>> NullConfiguration in some cases), but those are already commented as such 
>> (like in PatternLayout).
> 
> I think we should rely more on our `LifeCycle` abstraction:
> `Configuration` starts in the "initializing" state and does not have
> any subcomponents (especially those that require a `LoggerContext` to
> be present) until `initialize()` is called.
> 
> We can do the same thing with `LoggerContext`: after the constructor
> exits it does not have a configuration and is in the "initializing"
> state. The configuration will be created on an `initialize()` call
> (maybe chained from `start()`). By then we can safely pass a reference
> to `LoggerContext` to the `AbstractConfiguration` constructor.
> 
> Logging does not occur until `start()` is called.
> 
> Piotr



Re: [log4j] Question about the initial configuration created in LoggerContext

2022-12-28 Thread Matt Sicker
And with this detail about the possibility for a Logger to be returned while 
LoggerContext::start is being executed is indeed a reason for why 
DefaultConfiguration _normally_ exists. I’m looking at two additional scenarios:

1. What if you provide a configuration location String or URI to the 
LoggerContext constructor? We already have a configuration source in this case 
(or at least we do if the URI corresponds to a local file), but we ignore this 
source until we start the LoggerContext. It is this situation where I thought 
it may be possible to start loading ConfigurationFactory plugins here to load 
and parse the specified configuration as the initial Configuration instance 
rather than an instance of DefaultConfiguration. This wasn’t previously 
possible due to various code dependencies, but given the fact that we have 
access to an initialized Injector instance inside the LoggerContext constructor 
(this may be a key detail I haven’t been too explicit about) means that we can 
start loading whatever subset of plugins we want at this point.

2. If you don’t provide a configuration location, we could still attempt to 
load the ConfigurationFactory plugins to bootstrap here. This is even more 
realistic now that we’ve removed the plugin package scanning feature, so we 
know that plugin service classes will already be available to load at this time.

What I’m thinking here is a fairly innocuous-looking change may allow us to 
load the configuration earlier. As it is now, the LoggerContext is getOrCreated 
by the ContextSelector before Log4jContextSelector checks its state to see if 
it should be started (in which case it will then load ConfigurationFactory 
plugins to start the real configuration). What I think might be possible is 
moving this ConfigurationFactory bit deeper into the call chain by creating it 
when the LoggerContext is being created. This would mean the LoggerContext 
would have the custom configuration instance _before_ LoggerContext::start is 
invoked.

One of the main tradeoffs here would be startup time I suppose. By deferring 
configuration loading until LoggerContext::start is invoked, we do indeed get 
out of the way as soon as possible to start accepting log messages and 
returning control to user code. So I suppose even if it turns out to be 
possible to load the configuration right away, this would by definition cause 
blocking as the configuration is loaded, so this would probably make sense as a 
configurable option.
—
Matt Sicker

> On Dec 28, 2022, at 11:44, Ralph Goers  wrote:
> 
> 
> 
>> On Dec 28, 2022, at 10:37 AM, Ralph Goers  wrote:
>> 
>> 
>> 
>> Ralph
>> 
>>> On Dec 28, 2022, at 7:01 AM, Piotr P. Karwasz  
>>> wrote:
>>> 
>>> Hi Matt,
>>> 
>>> On Sun, 18 Dec 2022 at 20:30, Matt Sicker  wrote:
 During this bootstrapping, if the configuration location is available 
 (such as for a unit test), should LoggerContext set up the configuration 
 provided? Or is there some sort of cyclic dependency here preventing us 
 from loading ConfigurationFactory right away? So far, the only cyclic 
 dependencies I’ve found are related to plugins created in the 
 DefaultConfiguration (or the NullConfiguration in some cases), but those 
 are already commented as such (like in PatternLayout).
>>> 
>>> I think we should rely more on our `LifeCycle` abstraction:
>>> `Configuration` starts in the "initializing" state and does not have
>>> any subcomponents (especially those that require a `LoggerContext` to
>>> be present) until `initialize()` is called.
>> 
>> Actually, a LoggerContext is hard-wired with a DefaultConfiguration. So as 
>> soon as it exists it has a Configuration. Once the Configuration identified 
>> from the configLocation is created the DefaultConfiguration is replaced.  
>> This is the discussion Matt was having in his previous email and didn’t 
>> understand why it is required to work that way. Note that once the 
>> LoggerContext is constructed it will be “wired” by the ContextSelector and 
>> will be usable by any thread.
>> 
>>> 
>>> We can do the same thing with `LoggerContext`: after the constructor
>>> exits it does not have a configuration and is in the "initializing"
>>> state.
>> 
>> After the Constructor exists is has a DefaultConfiguration.
>> 
>>> The configuration will be created on an `initialize()` call
>>> (maybe chained from `start()`). By then we can safely pass a reference
>>> to `LoggerContext` to the `AbstractConfiguration` constructor.
>>> 
>>> Logging does not occur until `start()` is called.
>> 
>> I don’t believe this is true. Once the ContextSelector registers it any 
>> calls to log will use it.
>> 
>> Ralph
> 
> Note that LoggerContext.start() uses tryLock. If initialization is in 
> progress then start() will simply return. So other threads will log while 
> initialization is occurring. This is intentional.
> 
> Ralph




Re: [log4j] Question about the initial configuration created in LoggerContext

2022-12-28 Thread Matt Sicker
It’s NullConfiguration that’s the overly-expensive null guard for preventing 
operations from blowing things up during shutdown. There’s a small period of 
time between the stopping of Log4j and the stopping of the JVM that other code 
could still be trying to send new log messages, so they need to go be processed 
by something basically. I’ve been working on a rewrite of NullConfiguration to 
be much lighter, but it would be fairly difficult to remove entirely without 
affecting log messages at shutdown.
—
Matt Sicker

> On Dec 28, 2022, at 14:02, Piotr P. Karwasz  wrote:
> 
> Hi Ralph
> 
> On Wed, 28 Dec 2022 at 18:38, Ralph Goers  wrote:
>>> On Dec 28, 2022, at 7:01 AM, Piotr P. Karwasz  
>>> wrote:
>>> I think we should rely more on our `LifeCycle` abstraction:
>>> `Configuration` starts in the "initializing" state and does not have
>>> any subcomponents (especially those that require a `LoggerContext` to
>>> be present) until `initialize()` is called.
>> 
>> Actually, a LoggerContext is hard-wired with a DefaultConfiguration. So as 
>> soon as it exists it has a Configuration. Once the Configuration identified 
>> from the configLocation is created the DefaultConfiguration is replaced.  
>> This is the discussion Matt was having in his previous email and didn’t 
>> understand why it is required to work that way. Note that once the 
>> LoggerContext is constructed it will be “wired” by the ContextSelector and 
>> will be usable by any thread.
> 
> I was aware of that, but I always thought of the DefaultConfiguration
> as an overly-expensive null checkguard.
> 
>>> Logging does not occur until `start()` is called.
>> 
>> I don’t believe this is true. Once the ContextSelector registers it any 
>> calls to log will use it.
> 
> You are right: the DefaultConfiguration is ready to log as soon as the
> LoggerContext constructor exits. Thank you for the clarification.
> 
> Piotr



Re: [log4j] The grand infra revamp

2022-12-28 Thread Matt Sicker
This sounds amazing! Thanks for championing this effort! For signing, there’s 
some project out there that lets you use OIDC for signing things and keeping a 
sort of certificate transparency log so that you can verify that not only was 
an artifact signed by a valid signature, but it’s also done by the expected key 
for a particular release. See https://www.sigstore.dev/ for details. I’m not 
sure if it works with GPG keys or Maven or anything, but it could be a useful 
basis for further release automation in the future (even if it’s something more 
custom to the ASF).
—
Matt Sicker

> On Dec 28, 2022, at 14:42, Volkan Yazıcı  wrote:
> 
> Hello,
> 
> I want to share some updates from my side on what I have been working on
> for the last couple of months. In particular, I expect certain changes to
> have an ASF-wide impact! Sounds interesting? Continue reading.
> 
> *What is up with `maven-changes-plugin`?*
> 
> The current `site` phase takes ages to complete. This significantly hinders
> releases and more frequent website updates. The main culprit is `site`
> phase plugins need to be executed for 50+ modules. As my investigation
> turned out, we can actually either drop or replace all such plugins, except
> one: `maven-changes-plugin`, that is used to generate the changelog from
> `changes.xml`. `maven-changes-plugin` is also a major source of
> merge-conflicts in between feature branches and PRs, since they all need to
> touch to the same file: `changes.xml`.
> 
> *Is it only about overhauling the `site` phase?*
> 
> No. As we agreed on, we want to migrate the issue tracking from JIRA to
> GitHub Issues. `maven-changes-plugin` blocks us there too, since it only
> supports a single issue tracking system.
> 
> *Enter `log4j-changelog`*
> 
> I have written simple Java command line applications to perform the
> following tasks:
> 
>   1. Populate
>   `src/site/changelog//_.xml` files
>   from `changes.xml`
>   2. Generate AsciiDoc-formatted changelog files from the populated
>   `src/site/changelog`
>   3. Make `src/site/changelog` ready before a release
> 
> These technical tools will not only help us to replace
> `maven-changes-plugin`, support multiple issue trackers, and enable great
> simplification of the `site` phase, due to its one-changelog-file-per-issue
> convention, they will make changelog merge conflicts a thing of the past
> too!
> 
> Okay, great! Since the `maven-changes-plugin` successor is there, what are
> we exactly waiting for?
> 
> *Enter `log4j-tools`*
> 
> I have implemented `log4j-changelog` as a not-deployed Log4j module, though
> Ralph insisted on not having this within Log4j to make it available for
> other Log4j projects and since it needs to be shared by (and hence, sync'ed
> in between) `release-2.x` and `master` branches. He proposed that we host
> this in `log4j-tools` .
> Since `log4j-tools` repository never had a release before, I let my
> imagination go wild there:
> 
>   - Adopted the Maven-recommended way of setting up a BOM:
>   `logging-parent` parents `log4j-bom`, which parents `log4j-parent`, which
>   parents all other modules. There the effective POM of `log4j-bom` is
>   stripped-down from its parent and all unnecessary weight via
>   `flatten-maven-plugin` voodoo.
>   - Switched to CI-friendly Maven versioning
>   
>   - Configured GitHub Actions to make releases!
> 
> (As you might guess, if experiment succeeds, I will port all these fancy
> stuff to Log4j too.)
> 
> *Releasing... via GitHub Actions!?!*
> 
> ASF project artifacts are required to be signed and up until now that has
> **always** been performed manually by release maintainers. Releasing **and
> signing** artifacts in a fully-automated fashion in CI... "Surely You're
> Joking, Mr. Feynman!" But I am not! I have shared a technical
> implementation proposal in `members@`, got approval from ASF Security Team
> head Mark J. Cox
> , and now
> I am waiting for INFRA-23996
>  to be implemented!
> 
> What does this effectively mean? To the best of my knowledge, *`log4j-tools`
> will be the first ASF project ever that is released via CI!* This will
> enable various release automation enhancements, not to mention the software
> supply chain traceability it will bring as well.
> 
> *Where are we now?*
> 
> Once INFRA-23996  is
> implemented and `log4j-tools` (containing `log4j-changelog`) is released,
> the following PR train will be merged:
> 
>   - #1145  – replace
>   `maven-changes-plugin` with `log4j-changelog`
>   - #1166  – simplify
>   `site` goal (`./mvnw site` takes ~30 seconds!)
>   - #1172 

[log4j] New FlowMessageFactory and LogBuilder methods

2022-12-28 Thread Piotr P. Karwasz
Hi all,

Since a little more than a month I have been working on a bytecode
conversion tool that converts all logging API calls to chains of
`LogBuilder` calls with integrated location information.

The most complex API to convert was obviously Log4j2 API, with almost
400 methods to support. This is practically done[1], except method
references like `list.forEach(logger::info)` and two families of
methods: `traceEntry` and `traceExit`.

While it is possible to deal with them without modification to the
APIs, it would simplify the logic (especially for `traceEntry`) if we
add:

* a method `LogBuilder#logAndGet(Supplier)` that returns the
logged message or null if nothing was logged. This is the semantics of
`traceEntry`.
* a method `Logger#getFlowMessageFactory()` to return the flow message
factory used by the logger,
* move the logic that creates EntryMessage/ExitMessage from
AbstractLogger to FlowMessageFactory (3 new methods).

A concrete implementation of the proposal is in
https://github.com/apache/logging-log4j2/pull/1171. What do you think
about it?

Since the `LogBuilder` equivalent of `traceEntry` and `traceExit` is
much longer, I don't expect  anyone to employ these methods in their
source code.

Piotr

[1] https://github.com/apache/logging-log4j2/pull/1147


Re: [log4j] Question about the initial configuration created in LoggerContext

2022-12-28 Thread Ralph Goers
The reason a DefaultConfiguration is created and NOT the identified 
configuration is that a) we would have to block while the configuration is 
created and b) creating a Configuration isn’t necessarily very fast depending 
on what is in it (some of which we cannot control). We do want logging to be 
available quickly and only logging errors to the console during the 
initialization period seems like a fair compromise.  It is not at all 
inconceivable that an application will have started multiple threads that 
perform logging while logging is initializing and in general, we want Log4j to 
be as lock free as possible.

So your “fairly innocuous” change is not something I would support as I 
explicitly rejected that idea when I implemented it in the first place. I am 
also not in favor of a configuration option that makes the start up logic even 
more complicated and fragile.

Ralph

> On Dec 28, 2022, at 6:23 PM, Matt Sicker  wrote:
> 
> And with this detail about the possibility for a Logger to be returned while 
> LoggerContext::start is being executed is indeed a reason for why 
> DefaultConfiguration _normally_ exists. I’m looking at two additional 
> scenarios:
> 
> 1. What if you provide a configuration location String or URI to the 
> LoggerContext constructor? We already have a configuration source in this 
> case (or at least we do if the URI corresponds to a local file), but we 
> ignore this source until we start the LoggerContext. It is this situation 
> where I thought it may be possible to start loading ConfigurationFactory 
> plugins here to load and parse the specified configuration as the initial 
> Configuration instance rather than an instance of DefaultConfiguration. This 
> wasn’t previously possible due to various code dependencies, but given the 
> fact that we have access to an initialized Injector instance inside the 
> LoggerContext constructor (this may be a key detail I haven’t been too 
> explicit about) means that we can start loading whatever subset of plugins we 
> want at this point.
> 
> 2. If you don’t provide a configuration location, we could still attempt to 
> load the ConfigurationFactory plugins to bootstrap here. This is even more 
> realistic now that we’ve removed the plugin package scanning feature, so we 
> know that plugin service classes will already be available to load at this 
> time.
> 
> What I’m thinking here is a fairly innocuous-looking change may allow us to 
> load the configuration earlier. As it is now, the LoggerContext is 
> getOrCreated by the ContextSelector before Log4jContextSelector checks its 
> state to see if it should be started (in which case it will then load 
> ConfigurationFactory plugins to start the real configuration). What I think 
> might be possible is moving this ConfigurationFactory bit deeper into the 
> call chain by creating it when the LoggerContext is being created. This would 
> mean the LoggerContext would have the custom configuration instance _before_ 
> LoggerContext::start is invoked.
> 
> One of the main tradeoffs here would be startup time I suppose. By deferring 
> configuration loading until LoggerContext::start is invoked, we do indeed get 
> out of the way as soon as possible to start accepting log messages and 
> returning control to user code. So I suppose even if it turns out to be 
> possible to load the configuration right away, this would by definition cause 
> blocking as the configuration is loaded, so this would probably make sense as 
> a configurable option.
> —
> Matt Sicker
> 
>> On Dec 28, 2022, at 11:44, Ralph Goers  wrote:
>> 
>> 
>> 
>>> On Dec 28, 2022, at 10:37 AM, Ralph Goers  
>>> wrote:
>>> 
>>> 
>>> 
>>> Ralph
>>> 
 On Dec 28, 2022, at 7:01 AM, Piotr P. Karwasz  
 wrote:
 
 Hi Matt,
 
 On Sun, 18 Dec 2022 at 20:30, Matt Sicker  wrote:
> During this bootstrapping, if the configuration location is available 
> (such as for a unit test), should LoggerContext set up the configuration 
> provided? Or is there some sort of cyclic dependency here preventing us 
> from loading ConfigurationFactory right away? So far, the only cyclic 
> dependencies I’ve found are related to plugins created in the 
> DefaultConfiguration (or the NullConfiguration in some cases), but those 
> are already commented as such (like in PatternLayout).
 
 I think we should rely more on our `LifeCycle` abstraction:
 `Configuration` starts in the "initializing" state and does not have
 any subcomponents (especially those that require a `LoggerContext` to
 be present) until `initialize()` is called.
>>> 
>>> Actually, a LoggerContext is hard-wired with a DefaultConfiguration. So as 
>>> soon as it exists it has a Configuration. Once the Configuration identified 
>>> from the configLocation is created the DefaultConfiguration is replaced.  
>>> This is the discussion Matt was having in his previous email and didn’t 
>>> understand why it is required t

Re: [log4j] The grand infra revamp

2022-12-28 Thread Ralph Goers
Volkan,

Please do not wait to release log4j-tools.  I am already holding off adding 
more changelog entries so as to avoid creating more work when the PR is merged. 
 As far as I can tell there is no requirement that log4j-tools MUST be released 
via CI.

I really would have liked to have all of this done already as this would have 
been an ideal time to release 2.19.1.

Ralph

> On Dec 28, 2022, at 6:34 PM, Matt Sicker  wrote:
> 
> This sounds amazing! Thanks for championing this effort! For signing, there’s 
> some project out there that lets you use OIDC for signing things and keeping 
> a sort of certificate transparency log so that you can verify that not only 
> was an artifact signed by a valid signature, but it’s also done by the 
> expected key for a particular release. See https://www.sigstore.dev/ for 
> details. I’m not sure if it works with GPG keys or Maven or anything, but it 
> could be a useful basis for further release automation in the future (even if 
> it’s something more custom to the ASF).
> —
> Matt Sicker
> 
>> On Dec 28, 2022, at 14:42, Volkan Yazıcı  wrote:
>> 
>> Hello,
>> 
>> I want to share some updates from my side on what I have been working on
>> for the last couple of months. In particular, I expect certain changes to
>> have an ASF-wide impact! Sounds interesting? Continue reading.
>> 
>> *What is up with `maven-changes-plugin`?*
>> 
>> The current `site` phase takes ages to complete. This significantly hinders
>> releases and more frequent website updates. The main culprit is `site`
>> phase plugins need to be executed for 50+ modules. As my investigation
>> turned out, we can actually either drop or replace all such plugins, except
>> one: `maven-changes-plugin`, that is used to generate the changelog from
>> `changes.xml`. `maven-changes-plugin` is also a major source of
>> merge-conflicts in between feature branches and PRs, since they all need to
>> touch to the same file: `changes.xml`.
>> 
>> *Is it only about overhauling the `site` phase?*
>> 
>> No. As we agreed on, we want to migrate the issue tracking from JIRA to
>> GitHub Issues. `maven-changes-plugin` blocks us there too, since it only
>> supports a single issue tracking system.
>> 
>> *Enter `log4j-changelog`*
>> 
>> I have written simple Java command line applications to perform the
>> following tasks:
>> 
>>  1. Populate
>>  `src/site/changelog//_.xml` files
>>  from `changes.xml`
>>  2. Generate AsciiDoc-formatted changelog files from the populated
>>  `src/site/changelog`
>>  3. Make `src/site/changelog` ready before a release
>> 
>> These technical tools will not only help us to replace
>> `maven-changes-plugin`, support multiple issue trackers, and enable great
>> simplification of the `site` phase, due to its one-changelog-file-per-issue
>> convention, they will make changelog merge conflicts a thing of the past
>> too!
>> 
>> Okay, great! Since the `maven-changes-plugin` successor is there, what are
>> we exactly waiting for?
>> 
>> *Enter `log4j-tools`*
>> 
>> I have implemented `log4j-changelog` as a not-deployed Log4j module, though
>> Ralph insisted on not having this within Log4j to make it available for
>> other Log4j projects and since it needs to be shared by (and hence, sync'ed
>> in between) `release-2.x` and `master` branches. He proposed that we host
>> this in `log4j-tools` .
>> Since `log4j-tools` repository never had a release before, I let my
>> imagination go wild there:
>> 
>>  - Adopted the Maven-recommended way of setting up a BOM:
>>  `logging-parent` parents `log4j-bom`, which parents `log4j-parent`, which
>>  parents all other modules. There the effective POM of `log4j-bom` is
>>  stripped-down from its parent and all unnecessary weight via
>>  `flatten-maven-plugin` voodoo.
>>  - Switched to CI-friendly Maven versioning
>>  
>>  - Configured GitHub Actions to make releases!
>> 
>> (As you might guess, if experiment succeeds, I will port all these fancy
>> stuff to Log4j too.)
>> 
>> *Releasing... via GitHub Actions!?!*
>> 
>> ASF project artifacts are required to be signed and up until now that has
>> **always** been performed manually by release maintainers. Releasing **and
>> signing** artifacts in a fully-automated fashion in CI... "Surely You're
>> Joking, Mr. Feynman!" But I am not! I have shared a technical
>> implementation proposal in `members@`, got approval from ASF Security Team
>> head Mark J. Cox
>> , and now
>> I am waiting for INFRA-23996
>>  to be implemented!
>> 
>> What does this effectively mean? To the best of my knowledge, *`log4j-tools`
>> will be the first ASF project ever that is released via CI!* This will
>> enable various release automation enhancements, not to mention the software
>> supply chain traceability it will bring as well.
>> 
>>