Edit report at http://bugs.php.net/bug.php?id=54688&edit=1

 ID:                 54688
 User updated by:    g dot huebgen at arcor dot de
 Reported by:        g dot huebgen at arcor dot de
 Summary:            case insensitive search of stripos does not work when
                     searching äöü in utf-8
 Status:             Bogus
 Type:               Bug
 Package:            Strings related
 Operating System:   Linux
 PHP Version:        5.3.6
 Block user comment: N
 Private report:     N

 New Comment:

You are right. Your mb_stripos works fine. My mistake in this was that I forgot the parameter "UTF-8"! Now everything is clear.

Thank you
Gerhard


Previous Comments:
------------------------------------------------------------------------
[2011-05-09 06:44:51] ras...@php.net

Well, somewhere along the way you have messed up your encoding since it works fine when both strings are UTF-8:

var_dump(mb_stripos("Übermut","über",0,"UTF-8"));

Are you saying that this doesn't give you int(0) on your platform?

------------------------------------------------------------------------
[2011-05-09 06:33:40] g dot huebgen at arcor dot de

The description of utf8_decode states clearly that this function decodes UTF8 text. The manual says:

"utf8_decode - Converts a string with ISO-8859-1 characters encoded with UTF-8 to single-byte ISO-8859-1"

So my text is indeed in UTF-8 and my remark on utf8_decode only confirms what rasmus (comment #1) said.

------------------------------------------------------------------------
[2011-05-08 20:25:23] ras...@php.net

That means your string is not actually in UTF-8. utf8_decode() converts text in ISO-8859-1 to UTF-8. You stated initially that you had text encoded in UTF-8.

------------------------------------------------------------------------
[2011-05-08 20:17:42] g dot huebgen at arcor dot de

Hi rasmus.

Now I tried mb_stripos but the result is not different to stripos.

The same program but using mb_stripos:

$text = file_get_contents("test-utf8.txt");
$str = "über";
if (($pos=mb_stripos($text,$str)) !== false) echo $str." found";
else echo $str." not found";

output is: not found!

If I use utf8_decode for both $text and $str then stripos will work properly.

------------------------------------------------------------------------
[2011-05-08 17:43:47] ras...@php.net

This is not a bug. The base string handling functions in PHP do not support multibyte character sets. Since UTF-8 is compatible with single-byte charsets at the low end, it may appear to work for UTF-8, but it will break as soon as you hit an actual mb character. You can use mb_stripos() in this case, or you can use the function overloading support in mbstring to make your stripos mb aware. See http://de.php.net/manual/en/mbstring.overload.php

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at

    http://bugs.php.net/bug.php?id=54688
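
The difference this thread turns on, byte-oriented stripos() versus mb_stripos() with an explicit encoding, can be sketched as follows. This is a minimal illustration, not code from the report: the variable names and the byte-escaped sample strings are assumptions, chosen so that the result does not depend on how the script file itself is saved. It requires the mbstring extension.

```php
<?php
// Hypothetical sample strings, written as explicit UTF-8 bytes.
$haystack = "\xC3\x9Cbermut"; // "Übermut" (U+00DC is encoded as C3 9C)
$needle   = "\xC3\xBCber";    // "über"    (U+00FC is encoded as C3 BC)

// Byte-oriented search: the upper- and lower-case umlauts are different
// byte sequences, and single-byte case folding does not relate them.
$byteResult = stripos($haystack, $needle);

// Multibyte-aware search with the encoding stated explicitly.
$mbResult = mb_stripos($haystack, $needle, 0, "UTF-8");

var_dump($byteResult); // expected: bool(false)
var_dump($mbResult);   // expected: int(0)
```

If the fourth argument is left out, mb_stripos() falls back to the value of mb_internal_encoding(). When that setting is not UTF-8, the call can fail in the same way as stripos(), which appears to be what happened in the reporter's test.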