Breaking changes in Log4j API 3.x

2023-10-09 Thread Piotr P. Karwasz
Hi all,

We have often declared that 3.x will **not** constitute a major
version for Log4j API and that everything that used to work with 2.x
will work with 3.x (even provider code).

However that statement does not apply in practice, since some breaking
changes **were** introduced e.g. in the `util` subpackage (cf. [1]
e.g.), which is marked as internal, but:
 * in practice it is often used by plugin providers,
 * has some classes (like MessageSupplier) that are actually used by
consumers of the API.

That is why I would propose to revise the statement about compatibility:
 * the main,`message` and `status` packages should be 100% compatible
with the previous version,
 * the `spi` package should be as much compatible as possible,
 * the `simple` package should be internal and we can do with it
anything we please,
 * the `util` package should keep the types used by other packages and
all other classes should be moved to `util.internal`.

I made an experiment[2] to see how many classes we need to keep in
`util`. It turns out we just need to keep:
 * BiConsumer, IndexedReadOnlyStringMap, MessageSupplier,
MultiFormatStringBuilderFormattable, ReadOnlyStringMap,
StringBuilderFormattable, Supplier and TriConsumer to prevent breaking
changes in the main and `message` packages,
 * StringMap to prevent changes in the `spi` package,
 * we could keep `Constants` to prevent some `2.x` plugins from breaking.

What do you think?

Piotr

[1] https://github.com/apache/logging-log4j2/pull/1586
[2] https://github.com/ppkarwasz/logging-log4j2/tree/clean-break


Re: Breaking changes in Log4j API 3.x

2023-10-09 Thread Gary Gregory
I think it is OK to break compatibility in a major version. I can't believe
we'd ship classes that are duplicates of Java classes like BiConsumer.

Gary

On Mon, Oct 9, 2023, 4:17 AM Piotr P. Karwasz 
wrote:

> Hi all,
>
> We have often declared that 3.x will **not** constitute a major
> version for Log4j API and that everything that used to work with 2.x
> will work with 3.x (even provider code).
>
> However that statement does not apply in practice, since some breaking
> changes **were** introduced e.g. in the `util` subpackage (cf. [1]
> e.g.), which is marked as internal, but:
>  * in practice it is often used by plugin providers,
>  * has some classes (like MessageSupplier) that are actually used by
> consumers of the API.
>
> That is why I would propose to revise the statement about compatibility:
>  * the main,`message` and `status` packages should be 100% compatible
> with the previous version,
>  * the `spi` package should be as much compatible as possible,
>  * the `simple` package should be internal and we can do with it
> anything we please,
>  * the `util` package should keep the types used by other packages and
> all other classes should be moved to `util.internal`.
>
> I made an experiment[2] to see how many classes we need to keep in
> `util`. It turns out we just need to keep:
>  * BiConsumer, IndexedReadOnlyStringMap, MessageSupplier,
> MultiFormatStringBuilderFormattable, ReadOnlyStringMap,
> StringBuilderFormattable, Supplier and TriConsumer to prevent breaking
> changes in the main and `message` packages,
>  * StringMap to prevent changes in the `spi` package,
>  * we could keep `Constants` to prevent some `2.x` plugins from breaking.
>
> What do you think?
>
> Piotr
>
> [1] https://github.com/apache/logging-log4j2/pull/1586
> [2] https://github.com/ppkarwasz/logging-log4j2/tree/clean-break
>


RE: [log4j] Improving log4j security

2023-10-09 Thread Klebanov, Vladimir
Thanks, Piotr. I don't know what happened to your replies (maybe the spam 
filter dropped them), but I am happy that we recovered from that now.

Log injections are definitely security issues, but if you prefer to talk about 
them in the open, I will follow suit.

For context: a log injection occurs when an application logs user-supplied data 
(which is often the case). Attacker can exploit log injection to forge log 
records and impede forensics or exploit potential vulnerabilities in 
log-processing systems. There is a variety of string classes that attackers can 
try to inject, including newlines, ANSI sequences, Unicode direction markers, 
Unicode homographs, JavaScript, PHP, etc.

Ideally, applications defend against log injection attacks by encoding (aka 
escaping) user-supplied data before logging. The specific encoding depends on 
the desired level of protection. URL-encoding, for instance, would protect 
against all of the above-mentioned attack classes, but weaker encodings may be 
sometimes acceptable as well.

A natural place to implement encoding is in the pattern layout configuration. 
Some encoding pattern converters are already available in log4j, but there are 
still gaps that I would like to help fill. I think there are roughly three of 
them:

1. The documentation should more prominently explain the issue. Today, most 
users would probably think that the following layout is HTML-safe, while it's 
not:


2. The HTML encoder is not always sufficient. I would like to see an addition 
of a stricter one, such as a URL-encoder.

3. The current encoders encode all structured data (like the complete exception 
stacktrace) and not just the injection-prone parts (i.e., the exception 
message). This means I cannot replace the insecure layout above with the secure 
layout



without changing how logs are parsed (as the stack frames will not be separated 
by newlines anymore).

I have created a PoC implementation of an improved encoder, but I would 
obviously need help to make it productive. Is anyone here interested in that? 
Questions and comments are welcome as well.

Thanks,
Vladimir


-Original Message-
From: Piotr P. Karwasz  
Sent: Thursday, 5 October 2023 22:06
To: dev@logging.apache.org; Klebanov, Vladimir 
Subject: Re: [log4j] Improving log4j security

[You don't often get email from piotr.karw...@gmail.com. Learn why this is 
important at https://aka.ms/LearnAboutSenderIdentification ]

Hi Vladimir,

On Thu, 5 Oct 2023 at 21:47, Klebanov, Vladimir
 wrote:
> I would like to contribute some code in order to make log4j usage more 
> secure. I have now sent two emails to the log4j security team but did not 
> receive a response. Is anybody here interested? How can we discuss this 
> further?

Both times (10 Aug 2023, 23:19 and 29 Aug 2023, 20:49) we sent an
answer to your address at sap.com.

Anyway the general consensus was that the issue with generating HTML
using PatternLayout does not constitute a security problem and you can
discuss it on this mailing list or file an issue in Github issues.

Piotr


Re: Breaking changes in Log4j API 3.x

2023-10-09 Thread Ralph Goers
I think you are on the right track. We have to think of the main use case where 
a break in compatibility would cause a problem - an application uses libraries 
compiled with Log4j 2.x.

I am much less concerned about custom plugins as presumably the user has some 
control over them. That said, I would hope the vast majority of plugins from 
third parties would continue to work.

We cannot put users in a position where they cannot upgrade until all their 
dependencies do.

Note that Spring Boot builds with Log4j 2.x. It needs access to 
PerformanceSensitive, PropertiesUtil, and PropertySource.

Ralph

> On Oct 9, 2023, at 1:17 AM, Piotr P. Karwasz  wrote:
> 
> Hi all,
> 
> We have often declared that 3.x will **not** constitute a major
> version for Log4j API and that everything that used to work with 2.x
> will work with 3.x (even provider code).
> 
> However that statement does not apply in practice, since some breaking
> changes **were** introduced e.g. in the `util` subpackage (cf. [1]
> e.g.), which is marked as internal, but:
> * in practice it is often used by plugin providers,
> * has some classes (like MessageSupplier) that are actually used by
> consumers of the API.
> 
> That is why I would propose to revise the statement about compatibility:
> * the main,`message` and `status` packages should be 100% compatible
> with the previous version,
> * the `spi` package should be as much compatible as possible,
> * the `simple` package should be internal and we can do with it
> anything we please,
> * the `util` package should keep the types used by other packages and
> all other classes should be moved to `util.internal`.
> 
> I made an experiment[2] to see how many classes we need to keep in
> `util`. It turns out we just need to keep:
> * BiConsumer, IndexedReadOnlyStringMap, MessageSupplier,
> MultiFormatStringBuilderFormattable, ReadOnlyStringMap,
> StringBuilderFormattable, Supplier and TriConsumer to prevent breaking
> changes in the main and `message` packages,
> * StringMap to prevent changes in the `spi` package,
> * we could keep `Constants` to prevent some `2.x` plugins from breaking.
> 
> What do you think?
> 
> Piotr
> 
> [1] https://github.com/apache/logging-log4j2/pull/1586
> [2] https://github.com/ppkarwasz/logging-log4j2/tree/clean-break



[all] Sonarcloud

2023-10-09 Thread Christian Grobmeier
Hi,

I added a fork of Chainsaw to Sonarcloud, which I find very helpful:
https://sonarcloud.io/summary/overall?id=grobmeier_logging-chainsaw

I asked Infra, and they would add projects by request to Sonarcloud. 

If you are interested, I'd like to add that to our repos.
If not, it is fine because I can run this locally and using my fork - nothing 
critical to have.

Let me know :)

Kind regards,
Christian


Re: [chainsaw] What is necessary for a 2.2 release?

2023-10-09 Thread Christian Grobmeier
On Sun, Oct 8, 2023, at 23:19, Scott Deboy wrote:
> I started but haven't had much time this week. The UI updates driven by
> settings changes are most of what I have left.

OK great to hear, in that case I hold myself back a little longer :)

Thanks!

>
> On Sun, Oct 8, 2023, 2:17 PM Christian Grobmeier 
> wrote:
>
>> geson seems to do a good job, like Jackson (same JSR).
>> I'd like to move forward with JSON format for storing properties.
>>
>> I am not sure what the status currently is, if Scott is still working on
>> it :) I could also make some kind of proposal or so for storing properties
>>
>> On Tue, Oct 3, 2023, at 01:20, Scott Deboy wrote:
>> > I think persisting to json format makes sense/would be easy to consume
>> > with tools if needed.
>> >
>> > On 10/2/23, Robert Middleton  wrote:
>> >>> OK. Do you think something based on Jackson would be good? It's JSON
>> >>> though.
>> >>
>> >> We already have a dependency on genson for JSON parsing, so if we
>> >> really want to use JSON that would make the most sense.  Commons
>> >> config can read/write XML; currently I just have it configured to do
>> >> properties files.  I think we can write our own reader/writer as well,
>> >> but ultimately I don't think that there is anything special that we
>> >> need that commons config doesn't provide us out of the box.
>> >>
>> >> -Robert Middleton
>> >>
>> >> On Mon, Oct 2, 2023 at 5:14 PM Matt Sicker  wrote:
>> >>>
>> >>> Jackson makes for a good library here because it also supports several
>> >>> data formats (YAML, XML, HOCON, and all sorts of others, both textual
>> and
>> >>> binary) and is fairly extensible for hooking any custom serialization
>> or
>> >>> deserialization logic you need. Given the desire to avoid Java
>> >>> serialization, it is perfectly reasonable to avoid XStream as that’s
>> >>> basically Java serialization using XML with all the pitfalls that
>> entails
>> >>> (I dealt with this fairly extensively back in the Jenkins project which
>> >>> used an old fork of XStream for managing config files and state).
>> >>>
>> >>> I haven’t used Commons Configuration before, but it seems fairly
>> >>> appropriate for this sort of thing as well.
>> >>>
>> >>> > On Oct 2, 2023, at 1:43 PM, Christian Grobmeier <
>> grobme...@apache.org>
>> >>> > wrote:
>> >>> >
>> >>> > On Mon, Oct 2, 2023, at 16:15, Robert Middleton wrote:
>> >>> >> At least two reasons I can think of:
>> >>> >> 1. Xstream doesn’t work on all java versions as you saw
>> >>> >> 2. Simplify by using the commons config instead of rolling our own.
>> >>> >>
>> >>> >> Each tab should now have just one config file, the idea being that
>> you
>> >>> >> can
>> >>> >> now share config files between people.  Before each tab had at least
>> >>> >> two(maybe three) files.  One was the xml config, one was a java
>> >>> >> serialized
>> >>> >> map.  Removing the java serialization is important.
>> >>> >
>> >>> > OK. Do you think something based on Jackson would be good? It's JSON
>> >>> > though.
>> >>> >
>> >>> > From an example:
>> >>> >
>> >>> > ObjectMapper objectMapper = new ObjectMapper();
>> >>> > Car car = new Car("yellow", "renault");
>> >>> > objectMapper.writeValue(new File("target/car.json"), car);
>> >>> >
>> >>> > We could wrap this in some kind of class and make it accessible per
>> >>> > "tab" (or whatever).
>> >>> >
>> >>> >
>> >>> >>
>> >>> >> -Robert Middleton
>> >>> >>
>> >>> >> On Mon, Oct 2, 2023 at 6:12 AM Christian Grobmeier
>> >>> >> 
>> >>> >> wrote:
>> >>> >>
>> >>> >>>
>> >>> >>> On Mon, Oct 2, 2023, at 02:55, Robert Middleton wrote:
>> >>>  Some(most?) of the settings should be saved:
>> >>> 
>> >>> >>>
>> https://github.com/apache/logging-chainsaw/blob/5ccb3c8e55dffd4361c549c3bcdac3f3675f79e5/src/main/java/org/apache/log4j/chainsaw/prefs/SettingsManager.java#L191
>> >>> 
>> >>>  The stuff that is commented out should just be the old saving code
>> >>>  that
>> >>>  used XStream to save the data out.
>> >>> >>>
>> >>> >>> You made it using this commit (thank you, its easy to track):
>> >>> >>>
>> >>> >>>
>> https://github.com/apache/logging-chainsaw/commit/75bedf98665188eef4d13e4bfbb4b0dae456f65e
>> >>> >>>
>> >>> >>> One question: why did we remove Xstream? it seem there was an
>> update
>> >>> >>> late
>> >>> >>> 2022 to address some security.
>> >>> >>> Should we just re-enable it or was there other reasons too?
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> 
>> >>>  -Robert Middleton
>> >>> 
>> >>>  On Sun, Oct 1, 2023 at 3:39 PM Christian Grobmeier
>> >>>  > >>> 
>> >>>  wrote:
>> >>> 
>> >>> >
>> >>> > On Sun, Oct 1, 2023, at 21:28, Scott Deboy wrote:
>> >>> >> The ability to route events to tabs is a core feature in the
>> code
>> >>> >> -
>> >>> >> that's how Chainsaw log messages end up in a Chainsaw-specific
>> tab
>> >>> >> -
>> >>> >> but the ability to control that routing via a 'routing
>> expression'
>> >>> 

Re: [all] Sonarcloud

2023-10-09 Thread Piotr P. Karwasz
Hi Christian,

On Mon, 9 Oct 2023 at 21:24, Christian Grobmeier  wrote:
> I asked Infra, and they would add projects by request to Sonarcloud.
>
> If you are interested, I'd like to add that to our repos.

Sounds good to me. Right now I am checking the major "bugs" found by
SpotBugs, but using multiple static analysis tools does not hurt.

Piotr


Re: [log4j] Improving log4j security

2023-10-09 Thread Volkan Yazıcı
*[I am sharing my earlier response (almost) verbatim below.]*

I would like to address your both old and the most recent email *myself* –
that is, it only reflects my personal view, and not of the PMC.

>  A HTML-safe layout is only achieved if

> defined akin to:

>

>  `.


> Would Log4j be willing to improve the usability of encoding in pattern
> layouts to make it less likely for users to shoot themselves in the foot?


We provide best in the class support for JSON, HTML, etc. with their
associated dedicated layouts. If users insist on using `PatternLayout` for
those purposes, it feels to me somebody is stubbornly trying to pass SQL
arguments with string concatenation.


Nevertheless, if you have any proposals on _"improving the usability of
encoding in pattern layouts to make it less likely for users to shoot
themselves in the foot"_, you are more than welcome! The entire Log4j crew
would be happy to assist you for such contributions.


> I did go ahead and create a proof-of-concept encoder for

> log4j that securely encodes exceptions without completely

> mangling the stack traces:

>

> https://github.com/vlkl-sap/log4j-encoder

>

> There are two different implementations of the encoder with

> different trade-offs (to be discussed). I also implemented a

> new, more encompassing text encoder, based on URL

> encoding, but this aspect is independent.


Before writing any code, would you mind helping us with the following
questions, please?


   1. Do you have a use case? If so, where does `HtmlLayout` fall short of
   addressing it?
   2. Assuming `HtmlLayout` doesn't address your needs, can we [in a
   backward-compatible manner] improve `HtmlLayout` to make it work for you?
   3. Can we [in a backward-compatible manner] incorporate your
   `PatternLayout` changes?

Kind regards.


On Mon, Oct 9, 2023 at 5:24 PM Klebanov, Vladimir
 wrote:

> Thanks, Piotr. I don't know what happened to your replies (maybe the spam
> filter dropped them), but I am happy that we recovered from that now.
>
> Log injections are definitely security issues, but if you prefer to talk
> about them in the open, I will follow suit.
>
> For context: a log injection occurs when an application logs user-supplied
> data (which is often the case). Attacker can exploit log injection to forge
> log records and impede forensics or exploit potential vulnerabilities in
> log-processing systems. There is a variety of string classes that attackers
> can try to inject, including newlines, ANSI sequences, Unicode direction
> markers, Unicode homographs, JavaScript, PHP, etc.
>
> Ideally, applications defend against log injection attacks by encoding
> (aka escaping) user-supplied data before logging. The specific encoding
> depends on the desired level of protection. URL-encoding, for instance,
> would protect against all of the above-mentioned attack classes, but weaker
> encodings may be sometimes acceptable as well.
>
> A natural place to implement encoding is in the pattern layout
> configuration. Some encoding pattern converters are already available in
> log4j, but there are still gaps that I would like to help fill. I think
> there are roughly three of them:
>
> 1. The documentation should more prominently explain the issue. Today,
> most users would probably think that the following layout is HTML-safe,
> while it's not:
> 
>
> 2. The HTML encoder is not always sufficient. I would like to see an
> addition of a stricter one, such as a URL-encoder.
>
> 3. The current encoders encode all structured data (like the complete
> exception stacktrace) and not just the injection-prone parts (i.e., the
> exception message). This means I cannot replace the insecure layout above
> with the secure layout
>
> 
>
> without changing how logs are parsed (as the stack frames will not be
> separated by newlines anymore).
>
> I have created a PoC implementation of an improved encoder, but I would
> obviously need help to make it productive. Is anyone here interested in
> that? Questions and comments are welcome as well.
>
> Thanks,
> Vladimir
>
>
> -Original Message-
> From: Piotr P. Karwasz 
> Sent: Thursday, 5 October 2023 22:06
> To: dev@logging.apache.org; Klebanov, Vladimir 
> Subject: Re: [log4j] Improving log4j security
>
> [You don't often get email from piotr.karw...@gmail.com. Learn why this
> is important at https://aka.ms/LearnAboutSenderIdentification ]
>
> Hi Vladimir,
>
> On Thu, 5 Oct 2023 at 21:47, Klebanov, Vladimir
>  wrote:
> > I would like to contribute some code in order to make log4j usage more
> secure. I have now sent two emails to the log4j security team but did not
> receive a response. Is anybody here interested? How can we discuss this
> further?
>
> Both times (10 Aug 2023, 23:19 and 29 Aug 2023, 20:49) we sent an
> answer to your address at sap.com.
>
> Anyway the general consensus was that the issue with generating HTML
> using PatternLayout does not constitute a security problem and you can
> discuss 

Re: Breaking changes in Log4j API 3.x

2023-10-09 Thread Volkan Yazıcı
I doubt a majority of the external plugins would work against `3.x`. I
raised this issue on July 25th
, though
I didn't get any reactions.

On Mon, Oct 9, 2023 at 8:57 PM Ralph Goers 
wrote:

> I think you are on the right track. We have to think of the main use case
> where a break in compatibility would cause a problem - an application uses
> libraries compiled with Log4j 2.x.
>
> I am much less concerned about custom plugins as presumably the user has
> some control over them. That said, I would hope the vast majority of
> plugins from third parties would continue to work.
>
> We cannot put users in a position where they cannot upgrade until all
> their dependencies do.
>
> Note that Spring Boot builds with Log4j 2.x. It needs access to
> PerformanceSensitive, PropertiesUtil, and PropertySource.
>
> Ralph
>
> > On Oct 9, 2023, at 1:17 AM, Piotr P. Karwasz 
> wrote:
> >
> > Hi all,
> >
> > We have often declared that 3.x will **not** constitute a major
> > version for Log4j API and that everything that used to work with 2.x
> > will work with 3.x (even provider code).
> >
> > However that statement does not apply in practice, since some breaking
> > changes **were** introduced e.g. in the `util` subpackage (cf. [1]
> > e.g.), which is marked as internal, but:
> > * in practice it is often used by plugin providers,
> > * has some classes (like MessageSupplier) that are actually used by
> > consumers of the API.
> >
> > That is why I would propose to revise the statement about compatibility:
> > * the main,`message` and `status` packages should be 100% compatible
> > with the previous version,
> > * the `spi` package should be as much compatible as possible,
> > * the `simple` package should be internal and we can do with it
> > anything we please,
> > * the `util` package should keep the types used by other packages and
> > all other classes should be moved to `util.internal`.
> >
> > I made an experiment[2] to see how many classes we need to keep in
> > `util`. It turns out we just need to keep:
> > * BiConsumer, IndexedReadOnlyStringMap, MessageSupplier,
> > MultiFormatStringBuilderFormattable, ReadOnlyStringMap,
> > StringBuilderFormattable, Supplier and TriConsumer to prevent breaking
> > changes in the main and `message` packages,
> > * StringMap to prevent changes in the `spi` package,
> > * we could keep `Constants` to prevent some `2.x` plugins from breaking.
> >
> > What do you think?
> >
> > Piotr
> >
> > [1] https://github.com/apache/logging-log4j2/pull/1586
> > [2] https://github.com/ppkarwasz/logging-log4j2/tree/clean-break
>
>


Re: Breaking changes in Log4j API 3.x

2023-10-09 Thread Piotr P. Karwasz
Hi Gary,

On Mon, 9 Oct 2023 at 13:04, Gary Gregory  wrote:
> I think it is OK to break compatibility in a major version. I can't believe
> we'd ship classes that are duplicates of Java classes like BiConsumer.

I would very much like to add breaking changes to Log4j API too, but
we can not reasonably do it. We would push the few users that use
Log4j API towards SLF4J for good.
We need a 100% binary compatibility with version 2.x of the API.

I tried a couple of low level tricks to support the old interfaces and
at the same time prevent users from employing them in new code, but
there is no ideal solution.

The trick that has the best chance to work is to introduce a v3.Logger
superinterface of Logger. LogManager#getLogger would still return
Logger for binary compatibility, but users might opt-in and cast their
Logger to v3.Logger.

Piotr


Re: Breaking changes in Log4j API 3.x

2023-10-09 Thread Piotr P. Karwasz
On Mon, 9 Oct 2023 at 20:57, Ralph Goers  wrote:
> We cannot put users in a position where they cannot upgrade until all their 
> dependencies do.

I agree, at work I still had a lot of libraries that depended on
Commons Lang 2, although

> Note that Spring Boot builds with Log4j 2.x. It needs access to 
> PerformanceSensitive, PropertiesUtil, and PropertySource.

I would really prefer not to export `PropertiesUtil`, but at least we
can migrate many other internal classes.

Piotr


Re: [log4j] Improving log4j security

2023-10-09 Thread Christian Grobmeier
Since Piotrs response went to spam (thanks for confirming) I'd like to make 
sure you reveived Volkans questions as well. Please let me know if you did. 

If you didn't, he sent his response to the mailing list, if you need help 
subscribing, please let me know 

On Mon, Oct 9, 2023, at 22:28, Volkan Yazıcı wrote:
> *[I am sharing my earlier response (almost) verbatim below.]*
>
> I would like to address your both old and the most recent email *myself* –
> that is, it only reflects my personal view, and not of the PMC.
>
>>  A HTML-safe layout is only achieved if
>
>> defined akin to:
>
>>
>
>>  `.
>
>
>> Would Log4j be willing to improve the usability of encoding in pattern
>> layouts to make it less likely for users to shoot themselves in the foot?
>
>
> We provide best in the class support for JSON, HTML, etc. with their
> associated dedicated layouts. If users insist on using `PatternLayout` for
> those purposes, it feels to me somebody is stubbornly trying to pass SQL
> arguments with string concatenation.
>
>
> Nevertheless, if you have any proposals on _"improving the usability of
> encoding in pattern layouts to make it less likely for users to shoot
> themselves in the foot"_, you are more than welcome! The entire Log4j crew
> would be happy to assist you for such contributions.
>
>
>> I did go ahead and create a proof-of-concept encoder for
>
>> log4j that securely encodes exceptions without completely
>
>> mangling the stack traces:
>
>>
>
>> https://github.com/vlkl-sap/log4j-encoder
>
>>
>
>> There are two different implementations of the encoder with
>
>> different trade-offs (to be discussed). I also implemented a
>
>> new, more encompassing text encoder, based on URL
>
>> encoding, but this aspect is independent.
>
>
> Before writing any code, would you mind helping us with the following
> questions, please?
>
>
>1. Do you have a use case? If so, where does `HtmlLayout` fall short of
>addressing it?
>2. Assuming `HtmlLayout` doesn't address your needs, can we [in a
>backward-compatible manner] improve `HtmlLayout` to make it work for you?
>3. Can we [in a backward-compatible manner] incorporate your
>`PatternLayout` changes?
>
> Kind regards.
>
>
> On Mon, Oct 9, 2023 at 5:24 PM Klebanov, Vladimir
>  wrote:
>
>> Thanks, Piotr. I don't know what happened to your replies (maybe the spam
>> filter dropped them), but I am happy that we recovered from that now.
>>
>> Log injections are definitely security issues, but if you prefer to talk
>> about them in the open, I will follow suit.
>>
>> For context: a log injection occurs when an application logs user-supplied
>> data (which is often the case). Attacker can exploit log injection to forge
>> log records and impede forensics or exploit potential vulnerabilities in
>> log-processing systems. There is a variety of string classes that attackers
>> can try to inject, including newlines, ANSI sequences, Unicode direction
>> markers, Unicode homographs, JavaScript, PHP, etc.
>>
>> Ideally, applications defend against log injection attacks by encoding
>> (aka escaping) user-supplied data before logging. The specific encoding
>> depends on the desired level of protection. URL-encoding, for instance,
>> would protect against all of the above-mentioned attack classes, but weaker
>> encodings may be sometimes acceptable as well.
>>
>> A natural place to implement encoding is in the pattern layout
>> configuration. Some encoding pattern converters are already available in
>> log4j, but there are still gaps that I would like to help fill. I think
>> there are roughly three of them:
>>
>> 1. The documentation should more prominently explain the issue. Today,
>> most users would probably think that the following layout is HTML-safe,
>> while it's not:
>> 
>>
>> 2. The HTML encoder is not always sufficient. I would like to see an
>> addition of a stricter one, such as a URL-encoder.
>>
>> 3. The current encoders encode all structured data (like the complete
>> exception stacktrace) and not just the injection-prone parts (i.e., the
>> exception message). This means I cannot replace the insecure layout above
>> with the secure layout
>>
>> 
>>
>> without changing how logs are parsed (as the stack frames will not be
>> separated by newlines anymore).
>>
>> I have created a PoC implementation of an improved encoder, but I would
>> obviously need help to make it productive. Is anyone here interested in
>> that? Questions and comments are welcome as well.
>>
>> Thanks,
>> Vladimir
>>
>>
>> -Original Message-
>> From: Piotr P. Karwasz 
>> Sent: Thursday, 5 October 2023 22:06
>> To: dev@logging.apache.org; Klebanov, Vladimir 
>> Subject: Re: [log4j] Improving log4j security
>>
>> [You don't often get email from piotr.karw...@gmail.com. Learn why this
>> is important at https://aka.ms/LearnAboutSenderIdentification ]
>>
>> Hi Vladimir,
>>
>> On Thu, 5 Oct 2023 at 21:47, Klebanov, Vladimir
>>  wrote:
>> > I would like to contribut

Re: Breaking changes in Log4j API 3.x

2023-10-09 Thread Ralph Goers



> On Oct 9, 2023, at 2:14 PM, Piotr P. Karwasz  wrote:
> 
> On Mon, 9 Oct 2023 at 20:57, Ralph Goers  wrote:
>> We cannot put users in a position where they cannot upgrade until all their 
>> dependencies do.
> 
> I agree, at work I still had a lot of libraries that depended on
> Commons Lang 2, although
> 
>> Note that Spring Boot builds with Log4j 2.x. It needs access to 
>> PerformanceSensitive, PropertiesUtil, and PropertySource.
> 
> I would really prefer not to export `PropertiesUtil`, but at least we
> can migrate many other internal classes.

Unfortunately, the way PropertiesUtil was implemented makes it impossible to 
make private. Lots of things need access to properties.

Ralph



Re: [log4j] Improving log4j security

2023-10-09 Thread Ralph Goers
Also note that you can use lists.apache.org  to view 
the emails of any public ASF list to ensure you didn’t miss any.

Ralph

> On Oct 9, 2023, at 4:14 PM, Christian Grobmeier  wrote:
> 
> Since Piotrs response went to spam (thanks for confirming) I'd like to make 
> sure you reveived Volkans questions as well. Please let me know if you did. 
> 
> If you didn't, he sent his response to the mailing list, if you need help 
> subscribing, please let me know 
> 
> On Mon, Oct 9, 2023, at 22:28, Volkan Yazıcı wrote:
>> *[I am sharing my earlier response (almost) verbatim below.]*
>> 
>> I would like to address your both old and the most recent email *myself* –
>> that is, it only reflects my personal view, and not of the PMC.
>> 
>>> A HTML-safe layout is only achieved if
>> 
>>> defined akin to:
>> 
>>> 
>> 
>>> `.
>> 
>> 
>>> Would Log4j be willing to improve the usability of encoding in pattern
>>> layouts to make it less likely for users to shoot themselves in the foot?
>> 
>> 
>> We provide best in the class support for JSON, HTML, etc. with their
>> associated dedicated layouts. If users insist on using `PatternLayout` for
>> those purposes, it feels to me somebody is stubbornly trying to pass SQL
>> arguments with string concatenation.
>> 
>> 
>> Nevertheless, if you have any proposals on _"improving the usability of
>> encoding in pattern layouts to make it less likely for users to shoot
>> themselves in the foot"_, you are more than welcome! The entire Log4j crew
>> would be happy to assist you for such contributions.
>> 
>> 
>>> I did go ahead and create a proof-of-concept encoder for
>> 
>>> log4j that securely encodes exceptions without completely
>> 
>>> mangling the stack traces:
>> 
>>> 
>> 
>>> https://github.com/vlkl-sap/log4j-encoder
>> 
>>> 
>> 
>>> There are two different implementations of the encoder with
>> 
>>> different trade-offs (to be discussed). I also implemented a
>> 
>>> new, more encompassing text encoder, based on URL
>> 
>>> encoding, but this aspect is independent.
>> 
>> 
>> Before writing any code, would you mind helping us with the following
>> questions, please?
>> 
>> 
>>   1. Do you have a use case? If so, where does `HtmlLayout` fall short of
>>   addressing it?
>>   2. Assuming `HtmlLayout` doesn't address your needs, can we [in a
>>   backward-compatible manner] improve `HtmlLayout` to make it work for you?
>>   3. Can we [in a backward-compatible manner] incorporate your
>>   `PatternLayout` changes?
>> 
>> Kind regards.
>> 
>> 
>> On Mon, Oct 9, 2023 at 5:24 PM Klebanov, Vladimir
>>  wrote:
>> 
>>> Thanks, Piotr. I don't know what happened to your replies (maybe the spam
>>> filter dropped them), but I am happy that we recovered from that now.
>>> 
>>> Log injections are definitely security issues, but if you prefer to talk
>>> about them in the open, I will follow suit.
>>> 
>>> For context: a log injection occurs when an application logs user-supplied
>>> data (which is often the case). Attacker can exploit log injection to forge
>>> log records and impede forensics or exploit potential vulnerabilities in
>>> log-processing systems. There is a variety of string classes that attackers
>>> can try to inject, including newlines, ANSI sequences, Unicode direction
>>> markers, Unicode homographs, JavaScript, PHP, etc.
>>> 
>>> Ideally, applications defend against log injection attacks by encoding
>>> (aka escaping) user-supplied data before logging. The specific encoding
>>> depends on the desired level of protection. URL-encoding, for instance,
>>> would protect against all of the above-mentioned attack classes, but weaker
>>> encodings may be sometimes acceptable as well.
>>> 
>>> A natural place to implement encoding is in the pattern layout
>>> configuration. Some encoding pattern converters are already available in
>>> log4j, but there are still gaps that I would like to help fill. I think
>>> there are roughly three of them:
>>> 
>>> 1. The documentation should more prominently explain the issue. Today,
>>> most users would probably think that the following layout is HTML-safe,
>>> while it's not:
>>>
>>> 
>>> 2. The HTML encoder is not always sufficient. I would like to see an
>>> addition of a stricter one, such as a URL-encoder.
>>> 
>>> 3. The current encoders encode all structured data (like the complete
>>> exception stacktrace) and not just the injection-prone parts (i.e., the
>>> exception message). This means I cannot replace the insecure layout above
>>> with the secure layout
>>> 
>>>
>>> 
>>> without changing how logs are parsed (as the stack frames will not be
>>> separated by newlines anymore).
>>> 
>>> I have created a PoC implementation of an improved encoder, but I would
>>> obviously need help to make it productive. Is anyone here interested in
>>> that? Questions and comments are welcome as well.
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>> 
>>> -Original Message-
>>> From: Piotr P. Karwasz 
>>> Sent: Thurs