> Reasons for supporting
> - semantically, NUL is not whitespace
> - the majority of other popular languages don't trim NUL
> Reasons for not supporting
> - Java does trim NUL
> - Security issues in existing code bases
> - mb_trim() and the second parameter already exist to prevent trimming NUL 
> if people want that
> - Unnecessary changes in the life-cycle

I think it would be useful to have some examples of when you would
want to use `trim` but _not_ trim NUL bytes. The examples in the RFC
currently show the expected change in behaviour, which is good, but
you could achieve the same effect by not calling `trim` in the first
place: the only character in those examples that is removed either
before or after the change is the NUL byte. (Even in the example with
a newline followed by NUL bytes, the string after the change would be
identical to the string before the `trim`.)
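
A minimal sketch of the kind of example that might help, assuming a
made-up binary record whose payload legitimately ends in NUL bytes and
is followed by a line terminator. The record layout and variable names
are hypothetical and not taken from the RFC; the explicit mask is simply
the documented default mask with "\0" left out.

```php
<?php
// Hypothetical record: 4 payload bytes (the last two are NUL), then CRLF.
$line = "\x01\x02\x00\x00\r\n";

// Current default mask (" \n\r\t\v\0"): the line ending AND the
// trailing NUL payload bytes are stripped.
$default = rtrim($line);

// Explicit mask without "\0": only the line ending is stripped.
// This is how the same result can be obtained today via the second parameter.
$explicit = rtrim($line, " \n\r\t\v");

var_dump(strlen($default));  // int(2)
var_dump(strlen($explicit)); // int(4)
```

Here the call to `rtrim` does real work in both cases (it removes the
line ending), so the outcome differs depending on whether NUL is part of
the default mask.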

Given that most voters seem to be not strongly against the change, but
also see no benefit in changing the status quo, some examples of where
the change would be useful might help.

~ Robert

On Wed, Mar 25, 2026 at 6:15 AM LamentXU <[email protected]> wrote:
>
> I think there are sound opinions on both sides, so I will still let the vote 
> begin and see what the majority thinks. In short,
>
> Reasons for supporting
> - semantically, NUL is not whitespace
> - the majority of other popular languages don't trim NUL
> Reasons for not supporting
> - Java does trim NUL
> - Security issues in existing code bases
> - mb_trim() and the second parameter already exist to prevent trimming NUL 
> if people want that
> - Unnecessary changes in the life-cycle
>
> This is quite a minor change (and that's why people haven't talked about it 
> before, since few people run into the case of trimming NUL).
>
> Well, my opinion is, first, that trimming is indeed for whitespace. I know 
> Java's trim() removes NULs, but it doesn't do that explicitly: it removes 
> every char with a code point <= U+0020 (and I think most people are using 
> strip() instead, which doesn't remove NUL). Besides, almost no other standard 
> or language trims NUL. So for the sake of aligning with popular standards 
> and languages, it makes sense to avoid trimming NUL.
>
> Security and life-cycle concerns are good points. No longer trimming NUL may 
> enable a kind of path manipulation, as Ilia mentioned, and PHP trimming \0 is 
> already well known among PHP devs who have run into this case before.
>
> We have a second parameter to alter the trimmed character set, but I do think 
> most people would expect NUL not to be trimmed by default, in line with other 
> languages.
>
> At 2026-03-25 03:15:25, "Ilia" <[email protected]> wrote:
>
> That seems a bit dangerous, since a non-stripped \0 can potentially lead to 
> issues: when concatenated with other strings, which is quite common in string 
> operations, it can result in unpredictability and possibly even security 
> issues.
>
> You make a good point about other languages. The concern is that while that 
> is the expectation there, and the different solutions for sanitizing/handling 
> \0 are well known and understood, in PHP the assumption is that \0 is removed, 
> and changing this assumption breaks a lot of things.
>
> Just my 2c.
>
> On Sun, Mar 15, 2026 at 2:23 AM LamentXU <[email protected]> wrote:
>>
>> Dear all,
>>
>> I am sending this to introduce my new RFC: 
>> https://wiki.php.net/RFC/dont_trim_NUL
>>
>> Quick summary:
>>
>> Currently, PHP's trim functions strip the NUL byte (\0) by default, treating 
>> it alongside spaces, tabs, and newlines. This creates a highly surprising 
>> edge case.
>>
>> Because \0 is semantically a control character or a vital part of a binary 
>> payload rather than a typographical whitespace character, casually using 
>> trim() to clean up trailing newlines can silently corrupt binary streams or 
>> cryptographic hashes by stripping legitimate NUL bytes. Whitespace 
>> characters are intended for typographical spacing and formatting (e.g., 
>> spaces, newlines, tabs).
>>
>> Also, almost no mainstream programming language other than PHP trims NUL 
>> characters (Python, Go, Rust, JS, even the 'isspace' function in glibc...). 
>> It sounds reasonable to expect the same here.
>>
>> This RFC proposes removing \0 (ASCII 0) from the default character mask. I 
>> recognize this introduces a backward compatibility break, and therefore I 
>> would love to hear your thoughts, feedback, and any concerns regarding the 
>> BC impact before moving forward.
>>
>> Cheers,
>> Weilin Du
>
>
>
> --
> Ilia Alshanetsky
> Technologist, CTO, Entrepreneur
> E: [email protected]
> T: @iliaa
> B: http://ilia.ws
