On Tue, Mar 24, 2026 at 20:46, Juliette Reinders Folmer
<[email protected]> wrote:
>
> On 23-3-2026 1:34, youkidearitai wrote:
>
> Hi, Internals
>
> I have decided to deprecate mbregex in 8.6 and drop it in 9.0.
> So I would like to move to the Under Discussion phase.
> https://wiki.php.net/rfc/eol-oniguruma
> https://github.com/php/php-src/pull/21490
>
>
> Thank you for writing this RFC. I don't have a strong opinion either way. I
> fully understand that maintaining the Oniguruma library after it was
> abandoned by the original project is a huge and unenviable task.
>
> Having said that, I am very curious what Ruby will be using going forward and 
> if PHP could adopt a similar solution.
> I also wonder if there are no other "blessed" forks of the Oniguruma library 
> to which PHP could switch.
> I believe this should be investigated and the results of this investigation 
> should be added to the RFC to (potentially) strengthen the case for the 
> current proposal, or, depending on the findings, it could be that the current 
> proposal could be adjusted based on what this investigation throws up.
>
> Secondly, I believe the RFC would benefit from a more detailed section about 
> what PHP devs can do to mitigate the deprecation.
> For example, if the only expected text encoding is UTF-8, people can use 
> `preg_*()` functions with the `u` modifier instead of the `mb_ereg*()` 
> functions.
>
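
As a rough illustration of the mitigation suggested above, here is a minimal sketch of rewriting a few `mb_ereg*()` calls with `preg_*()` and the `u` modifier. It assumes all input is valid UTF-8; the sample string and patterns are made up for illustration. mbregex and PCRE syntax are not identical, so real patterns would need to be reviewed one by one rather than converted mechanically.

```php
<?php
// Assumption: all input is valid UTF-8. Sample data is illustrative only.
$text = 'PHP 8.6 と mbregex';

// mb_ereg-style match, rewritten with PCRE and the `u` (UTF-8) modifier.
// Before: mb_ereg('[ぁ-ん]+', $text, $regs);
$matched = preg_match('/[ぁ-ん]+/u', $text, $regs);

// mb_split-style split on whitespace.
// Before: mb_split('\s+', $text);
$parts = preg_split('/\s+/u', $text);

// mb_eregi_replace-style case-insensitive replacement: add the `i` modifier.
// Before: mb_eregi_replace('MBREGEX', 'PCRE', $text);
$replaced = preg_replace('/MBREGEX/iu', 'PCRE', $text);
```

Note that PCRE patterns need delimiters, and that with the `u` modifier the `preg_*()` functions fail on input that is not valid UTF-8, so this approach does not cover the non-UTF-8 encodings that mbregex handles.
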
> I also think it is important to mention that the Symfony Mbstring[1] polyfill 
> package does **NOT** polyfill the MB regex functionality, so cannot be used 
> as a replacement/alternative.
>
> With this in mind, I also believe the impact analysis in the RFC should be 
> expanded as the MbString extension is widely used.
>
> To support this, I've created a branch in the PHPCompatibility package [2] 
> specifically for this deprecation and I have run the relevant checks over the 
> Packagist Top 4000 (as of yesterday).
>
> I've posted the used ruleset and the full results as a gist.
> https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214
>
> Summary of findings:
>
> PHP CODE SNIFFER VIOLATION SOURCE SUMMARY
> -------------------------------------------------------------------------------------------
> SOURCE                                                                            COUNT
> -------------------------------------------------------------------------------------------
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_splitDeprecated                  30
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_regex_encodingDeprecated         25
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregi_replaceDeprecated          20
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replaceDeprecated           18
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_matchDeprecated             13
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_initDeprecated       10
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_regsDeprecated       9
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replace_callbackDeprecated  6
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_getregsDeprecated    5
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregDeprecated                   4
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_setposDeprecated     4
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregiDeprecated                  2
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_posDeprecated        1
> -------------------------------------------------------------------------------------------
> A TOTAL OF 147 SNIFF VIOLATIONS WERE FOUND IN 13 SOURCES
> -------------------------------------------------------------------------------------------
>
> So, 147 occurrences in the Packagist top 4000 in total.
>
> While this is lower than I would have expected, it should be remembered that 
> most distributed packages will default to/require UTF-8 encoding and that 
> code handling non-UTF8 encodings - and therefore needing the Mb regex 
> functionality - is mostly found in proprietary packages.
>
> The PIE extension would help those packages.
>
> Another potential alternative for those packages would be to convert all 
> their data and code to a UTF-8 base, which will be a humongous project for 
> most (and that deserves a mention in the RFC).
>
> Hope this helps.
>
> Smile,
> Juliette
>
>
> 1: https://symfony.com/packages/polyfill-mbstring
> 2: https://github.com/PHPCompatibility/PHPCompatibility/commit/47ba8b691f82d13dcfe496549c1110d250e18a8c
> 3: https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214

Hi, Juliette

Thank you very much for your gist.
I looked through it, and it seems those packages do depend on mbregex (Oniguruma).

> Having said that, I am very curious what Ruby will be using going forward and 
> if PHP could adopt a similar solution.
> I also wonder if there are no other "blessed" forks of the Oniguruma library 
> to which PHP could switch.
> I believe this should be investigated and the results of this investigation 
> should be added to the RFC to (potentially) strengthen the case for the 
> current proposal, or, depending on the findings, it could be that the current 
> proposal could be adjusted based on what this investigation throws up.

Indeed, Ruby uses Onigmo
(https://github.com/ruby/ruby/blob/master/regexec.c), which is a fork
of Oniguruma.
There are differences between Onigmo and Oniguruma.

I have added your feedback to the RFC,
and I quoted the results from your gist. Please let me know if there is any problem.
Thank you again.

Regards
Yuya

-- 
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
-----------------------------
