 ID:               42528
 Updated by:       sjo...@php.net
 Reported By:      mahesh dot vemula at in dot ibm dot com
-Status:           Open
+Status:           Bogus
-Bug Type:         Performance problem
+Bug Type:         *Unicode Issues
 Operating System: RHEL4
 PHP Version:      6CVS-2007-09-03 (CVS)

New Comment:

Thank you for your bug report.

In Unicode, characters and bytes do not map one-to-one. That means the
code point for a character may be greater than 255. The ord() function
returns the numerical value of the code point, which may therefore be
greater than 255. It should not wrap around as you describe.


Previous Comments:
------------------------------------------------------------------------

[2007-09-07 12:30:28] mahesh dot vemula at in dot ibm dot com

The same thing is happening for the strncasecmp() function as well.

------------------------------------------------------------------------

[2007-09-03 11:53:31] mahesh dot vemula at in dot ibm dot com

Description:
------------
When Unicode is ON, a character value that goes beyond the range an
8-bit char can hold does not wrap around to the corresponding minimum
value.

Reproduce code:
---------------
--TEST--
--FILE--
<?php
var_dump( decbin( ord(chr(255)) ) );
var_dump( decbin( ord(chr(256)) ) );
var_dump( decbin( ord(chr(257)) ) );
?>
--EXPECTF--

Expected result:
----------------
unicode(8) "11111111"
unicode(1) "0"
unicode(1) "1"

Actual result:
--------------
unicode(8) "11111111"
unicode(9) "100000000"
unicode(9) "100000001"


------------------------------------------------------------------------
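
The byte-versus-code-point distinction described in the comment above can be sketched on a released PHP version. This is only an illustration under stated assumptions: it assumes PHP 7.2 or later with the mbstring extension enabled, and it uses mb_chr() and mb_ord() as stand-ins for the Unicode-aware chr() and ord() of the PHP 6 CVS branch in the report. It does not reproduce that branch's actual behavior or its unicode(...) output format.

```php
<?php
// Byte-oriented functions: chr() keeps only the low 8 bits of its
// argument, so 256 and 257 wrap around to bytes 0 and 1. This is the
// behavior the reporter expected.
var_dump(ord(chr(255))); // int(255)
var_dump(ord(chr(256))); // int(0)
var_dump(ord(chr(257))); // int(1)

// Code-point-oriented functions (mbstring): values above 255 are valid
// code points, so nothing wraps. This mirrors the behavior described
// in the maintainer's comment.
var_dump(mb_ord(mb_chr(255, 'UTF-8'), 'UTF-8')); // int(255)
var_dump(mb_ord(mb_chr(256, 'UTF-8'), 'UTF-8')); // int(256)
var_dump(mb_ord(mb_chr(257, 'UTF-8'), 'UTF-8')); // int(257)

// One character is not one byte: U+0100 takes two bytes in UTF-8.
var_dump(strlen(mb_chr(256, 'UTF-8'))); // int(2)
```

The same applies to the decbin() calls in the reproduce code: decbin(256) is "100000000" (nine digits), which matches the "Actual result" section once ord() returns a code point rather than a byte.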