Edit report at https://bugs.php.net/bug.php?id=61001&edit=1

ID:                 61001
Comment by:         anon at anon dot anon
Reported by:        mike at eastghost dot com
Summary:            Corruption of "=0a" but not "=a0"
Status:             Open
Type:               Bug
Package:            PCRE related
Operating System:   Ubuntu LAMP
PHP Version:        5.3.10
Block user comment: N
Private report:     N

New Comment:

Not a bug and it has nothing to do with UTF8.

The error message says why it's not working: the eval'd code has a syntax error, because you forgot to wrap the argument to debdcode_post in quotes. It should be:

$html_oursr[0] = 'debdcode_post(\'$1\')';

It works for `debdcode_post(a0)` because a0 is parsed as a constant (if you do `error_reporting(-1);` you will see the notice about the use of the undefined constant), but `debdcode_post(0a)` is always a syntax error.

But the better (faster) solution is to use preg_replace_callback.

Previous Comments:
------------------------------------------------------------------------
[2012-02-07 10:10:23] mike at eastghost dot com

Description:
------------
Passing following UTF8 text thru 3rd line of the test script (i.e., preg_replace() function) causes an error in preg_replace function:

[post=0a /]

Whereas, passing following UTF8 text similarly causes no error:

[post=a0 /]

Problem seems to be caused only when the "=" is followed by an integer then followed by a letter. I briefly tried other combinations without causing error.

Workaround is to replace third line of test script with this line (i.e., use the preg_replace_callback() instead of preg_replace()

$out = preg_replace_callback( '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uiu', 'debdcode_post', $i_html );

Test script:
---------------
$html_ours[0] = '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uieu';
$html_oursr[0] = 'debdcode_post( $1 )'; // irrelevant, use any misc func that looks up post id in db
$out = preg_replace( $html_ours, $html_oursr, $i_html );

Expected result:
----------------
The general use is in a BBCODE-like parser for use in a FORUMS app.

What should happen: In the source text (in UTF-8 format), the string "[post=4ablahblah /]" should be picked out of any given arbitrary input by the preg_replace() and then translated to a hyperlink by the debdcode_post().

What is happening instead is the error in preg_replace, presumably from malformed UTF-8 or possibly a bug inside preg_replace when dealing with the particular character sequence "=<integer><letter(s) and/or integer(s)>.

Note that it's the "=" followed by an integer and then followed by at least one letter and/or more integers that triggers the error.

I hope this helps; thank you for looking.

Actual result:
--------------
Parse error: syntax error, unexpected T_STRING in /apath/Class/common_functions.inc(1405) : regexp code on line 1

Fatal error: preg_replace() [<a href='function.preg-replace'>function.preg-replace</a>]: Failed evaluating code: debdcode_post( 4f30abfddc79595474000020 ) in <file:line>

------------------------------------------------------------------------
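
For reference, the preg_replace_callback() approach suggested in the comments can be sketched as follows. This is only an illustration under stated assumptions: the body of debdcode_post() and the "/post/" URL scheme are hypothetical stand-ins, since the report does not show the real function (it is described only as looking up a post id in a database). The pattern is the one from the report without the "e" modifier.

```php
<?php
// Hypothetical stand-in for the reporter's debdcode_post(). The real function
// is not shown in the report; the "/post/" URL below is an assumption.
function debdcode_post(array $m)
{
    // $m[1] is the captured id, passed as an ordinary string. Nothing is
    // eval'd, so ids such as "0a" or "4f30ab..." cannot cause a parse error.
    $id = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8');
    return '<a href="/post/' . $id . '">post ' . $id . '</a>';
}

// Pattern from the report, with the "e" (eval) modifier dropped.
$pattern = '@\[p(?:ost){0,1}=(.{1,24})\ {0,}\/\]@Uiu';

$i_html = 'See [post=4f30abfddc79595474000020 /] and [post=a0 /].';
$out = preg_replace_callback($pattern, 'debdcode_post', $i_html);

echo $out, "\n";
```

Because the callback receives the match array directly, no quoting of $1 is needed, and ids that start with a digit are handled the same way as ids that start with a letter.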