On 23-3-2026 1:34, youkidearitai wrote:
> Hi, Internals
> I decide deprecate mbregex in 8.6 and drop in 9.0.
> So I would like to go to Under Discussion phase.
> https://wiki.php.net/rfc/eol-oniguruma
> https://github.com/php/php-src/pull/21490
Thank you for writing this RFC. I don't have a strong opinion either
way. I fully understand that maintaining the Oniguruma library now that
it has been abandoned by the original project is a huge and unenviable
task.
Having said that, I am very curious what Ruby will be using going
forward and if PHP could adopt a similar solution.
I also wonder whether there are any other "blessed" forks of the
Oniguruma library to which PHP could switch.
I believe both questions should be investigated and the results added to
the RFC. Depending on the findings, they could either strengthen the
case for the current proposal or lead to the proposal being adjusted.
Secondly, I believe the RFC would benefit from a more detailed section
about what PHP devs can do to mitigate the deprecation.
For example, if the only expected text encoding is UTF-8, people can use
`preg_*()` functions with the `u` modifier instead of the `mb_ereg*()`
functions.
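To illustrate, here is a rough sketch of what such a migration could look
like. It assumes the input is valid UTF-8 and that the patterns use no
Oniguruma-specific syntax. The helper names are hypothetical and only
there for illustration. Note that `preg_*()` patterns need delimiters,
while `mb_ereg*()` patterns do not.

```php
<?php
// Hypothetical helpers; assumes UTF-8 input and patterns PCRE also accepts.

// Before: mb_regex_encoding('UTF-8'); mb_ereg('(\d+)年', $text, $regs);
function match_year(string $text): ?string
{
    // The `u` modifier makes PCRE treat pattern and subject as UTF-8.
    if (preg_match('/(\d+)年/u', $text, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Before: mb_split('、', $text);
function split_on_ideographic_comma(string $text): array
{
    $parts = preg_split('/、/u', $text);
    // preg_split() returns false on failure (e.g. invalid UTF-8 input).
    return $parts === false ? [] : $parts;
}

// Before: mb_eregi_replace('école', 'school', $text);
function replace_case_insensitive(string $text): string
{
    // `i` combined with `u` gives Unicode-aware case-insensitive matching.
    // preg_replace() returns null on error; fall back to the input then.
    return preg_replace('/école/iu', 'school', $text) ?? $text;
}
```

Patterns relying on Oniguruma-only constructs would need to be reviewed
by hand; the above only covers the straightforward cases.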
I also think it is important to mention that the Symfony Mbstring[1]
polyfill package does **NOT** polyfill the MB regex functionality, so
it cannot be used as a replacement/alternative.
With this in mind, I also believe the impact analysis in the RFC should
be expanded as the MbString extension is widely used.
To support this, I've created a branch in the PHPCompatibility package
[2] specifically for this deprecation and I have run the relevant checks
over the Packagist Top 4000 (as of yesterday).
I've posted the ruleset I used and the full results as a gist [3].
https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214
Summary of findings:
PHP CODE SNIFFER VIOLATION SOURCE SUMMARY
-------------------------------------------------------------------------------------------
SOURCE                                                                         COUNT
-------------------------------------------------------------------------------------------
PHPCompatibility.FunctionUse.RemovedFunctions.mb_splitDeprecated                  30
PHPCompatibility.FunctionUse.RemovedFunctions.mb_regex_encodingDeprecated         25
PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregi_replaceDeprecated          20
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replaceDeprecated           18
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_matchDeprecated             13
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_initDeprecated       10
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_regsDeprecated        9
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replace_callbackDeprecated   6
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_getregsDeprecated     5
PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregDeprecated                    4
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_setposDeprecated      4
PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregiDeprecated                   2
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_posDeprecated         1
-------------------------------------------------------------------------------------------
A TOTAL OF 147 SNIFF VIOLATIONS WERE FOUND IN 13 SOURCES
-------------------------------------------------------------------------------------------
So, 147 occurrences in the Packagist top 4000 in total.
While this is lower than I would have expected, it should be remembered
that most distributed packages will default to/require UTF-8 encoding
and that code handling non-UTF-8 encodings - and therefore needing the Mb
regex functionality - is mostly found in proprietary packages.
The PIE extension would help those packages.
Another potential alternative for those packages would be to convert all
their data and code to a UTF-8 base, which would be a humongous project
for most (and that deserves a mention in the RFC).
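Short of a full conversion, a stopgap some codebases might consider is
transcoding at the boundary: convert to UTF-8, run the regex via
`preg_*()`, and convert back. The sketch below is only an illustration
of that idea. The helper name is hypothetical, Shift_JIS as the legacy
encoding is an assumption, and it relies on the non-regex parts of
MbString (such as `mb_convert_encoding()`) remaining available.

```php
<?php
// Sketch only: `legacy_preg_replace` is a hypothetical helper.
// Assumes the pattern and replacement are written in UTF-8 and the
// subject is in a legacy encoding (Shift_JIS by default, as an example).
function legacy_preg_replace(
    string $pattern,
    string $replacement,
    string $subject,
    string $encoding = 'SJIS'
): string {
    // 1. Transcode the subject to UTF-8 so PCRE's `u` mode can process it.
    $utf8 = mb_convert_encoding($subject, 'UTF-8', $encoding);

    // 2. Run the UTF-8 pattern; preg_replace() returns null on error.
    $result = preg_replace($pattern, $replacement, $utf8);
    if ($result === null) {
        throw new RuntimeException(preg_last_error_msg());
    }

    // 3. Transcode the result back to the legacy encoding.
    return mb_convert_encoding($result, $encoding, 'UTF-8');
}
```

This adds two conversions per call and only works for characters which
round-trip cleanly between the two encodings, so it is not a drop-in
replacement for the Mb regex functions.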
Hope this helps.
Smile,
Juliette
1: https://symfony.com/packages/polyfill-mbstring
2: https://github.com/PHPCompatibility/PHPCompatibility/commit/47ba8b691f82d13dcfe496549c1110d250e18a8c
3: https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214