From:             astatutov at gmail dot com
Operating system: 
PHP version:      Irrelevant
Package:          Strings related
Bug Type:         Bug
Bug description:  String access by character is not multibyte-safe

Description:
------------
I know there is a section named "Details of the String Type" in the documentation. But there is another section that states "Think of a string as an array of characters for this purpose", and it is very convenient to think that way.

We use the mbstring extension to work entirely in UTF-8, and the mbstring.func_overload option lets us almost forget about the differences between regular and multibyte strings. We just write our application thinking about its own logic, not PHP's internal logic. This is a high-level programming language, after all. We use strlen, substr, etc. exactly as we would with regular strings. And BANG! The string bracket operator returns bytes, not characters! I think this is unpredictable behavior, even if it were well documented (and it is not).

Considering that the use of UTF-8 is growing everywhere, and that PHP 6 may even support it by default, why not implement multibyte support for bracket operations in the mbstring extension now? Of course, it must be configurable to stay backward compatible.

I know we can use substr as a replacement for the string access operation, but it is very slow and it is wrong in general.

I also know this is not the first bug on this subject. There was #51919, for example, which was closed and marked as not a bug. But I propose looking at this problem from the point of view of the language's logic, not its implementation. Sorry if I have missed something else.
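
To make the byte/character difference concrete, here is a minimal sketch. It assumes PHP 7 or later with the mbstring extension loaded and UTF-8 input; `char_at` is a hypothetical helper name used only for illustration, not an existing PHP or mbstring function.

```php
<?php
// Hypothetical helper (not part of PHP or mbstring): returns the character
// at a character offset, assuming $s is valid UTF-8.
function char_at(string $s, int $index): string
{
    return mb_substr($s, $index, 1, 'UTF-8');
}

// "K", then U+00C4 (encoded as the two bytes 0xC3 0x84), a space, then "t".
$str = "K\u{00C4} t";

echo strlen($str), "\n";             // 5: strlen counts bytes
echo mb_strlen($str, 'UTF-8'), "\n"; // 4: mb_strlen counts characters
echo bin2hex($str[1]), "\n";         // c3: the bracket operator returns one byte
echo char_at($str, 1), "\n";         // the full two-byte character
```

The bracket operator yields only the first byte of the two-byte sequence, which is not valid UTF-8 on its own. The helper returns the whole character, at the cost of scanning the string from the start on every call.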

Test script:
---------------
$str = "KÄ t";
echo $str[1];

Expected result:
----------------
Ä

Actual result:
--------------
�

-- 
Edit bug report at https://bugs.php.net/bug.php?id=63079&edit=1