From:             astatutov at gmail dot com
Operating system: 
PHP version:      Irrelevant
Package:          Strings related
Bug Type:         Bug
Bug description:  String access by character is not multibyte-safe

Description:
------------
I know there is a section named "Details of the String Type" in the
documentation. But there is another section that states: "Think of a
string as an array of characters for this purpose". It is very convenient
to think of strings that way. We use the mbstring extension to work entirely
in UTF-8, and the mbstring.func_overload option lets us almost forget about
the differences between regular and multibyte strings. We just write our
application, thinking about its own logic, not PHP's internal logic. PHP is
a high-level programming language, after all. We use strlen, substr, etc.
just as we would with regular strings. And BANG! The string bracket operator
returns bytes, not characters!
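
To illustrate the mismatch, here is a minimal sketch. It assumes the script
is saved as UTF-8, that the mbstring extension is loaded, and that
mbstring.func_overload is NOT enabled, so strlen still counts bytes:

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is loaded
// (without mbstring.func_overload, so strlen counts bytes).
$str = "Kąt";

// "ą" (U+0105) is encoded as the two bytes 0xC4 0x85 in UTF-8.
echo strlen($str), "\n";                             // 4 (bytes)
echo mb_strlen($str, 'UTF-8'), "\n";                 // 3 (characters)

// The bracket operator indexes bytes, so it returns half a character.
echo bin2hex($str[1]), "\n";                         // c4
echo bin2hex(mb_substr($str, 1, 1, 'UTF-8')), "\n";  // c485
```

With func_overload enabled for the string functions, strlen would report 3
here, while $str[1] would still return the single byte 0xC4.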

I think this is unpredictable behavior, even if it were well documented (and
it is not). Considering that the use of UTF-8 is growing everywhere, and that
PHP 6 may even support it by default, why not implement multibyte support
for bracket operations now, in the mbstring extension? Of course, it would
have to be configurable to stay backward compatible. I know we can use
substr as a replacement for the bracket operator, but it is very slow and it
is the wrong approach in general.
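
For reference, the workaround I mean looks roughly like the sketch below.
The names char_at and utf8_chars are hypothetical helpers written for this
report, not part of PHP or mbstring, and the code assumes valid UTF-8 input:

```php
<?php
// Hypothetical helper (not part of PHP or mbstring):
// returns the character at character index $i.
function char_at($str, $i, $encoding = 'UTF-8')
{
    // mb_substr counts characters rather than bytes, but it has to
    // scan from the start of the string on every call.
    return mb_substr($str, $i, 1, $encoding);
}

// Hypothetical helper for repeated access: split once, then index the array.
function utf8_chars($str)
{
    // The /u modifier makes PCRE treat the subject as UTF-8;
    // preg_split returns false if the input is not valid UTF-8.
    return preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
}

$str = "Kąt";
echo char_at($str, 1), "\n";   // ą

$chars = utf8_chars($str);
echo $chars[1], "\n";          // ą
echo count($chars), "\n";      // 3
```

Both variants work, but neither reads as naturally as $str[1], and calling
char_at in a loop rescans the string each time.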

I also know this is not the first bug report on this subject. There was
#51919, for example, which was closed and marked as not a bug. But I propose
looking at this problem from the point of view of the language's logic, not
the implementation.

Sorry if I've missed something.

Test script:
---------------
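<?php
// Save this file as UTF-8: "ą" is then stored as two bytes (0xC4 0x85).
$str = "Kąt";
echo $str[1]; // prints only the byte 0xC4, not the whole character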

Expected result:
----------------
ą

Actual result:
--------------
�

-- 
Edit bug report at https://bugs.php.net/bug.php?id=63079&edit=1