Edit report at http://bugs.php.net/bug.php?id=54688&edit=1
ID:                 54688
Updated by:         ras...@php.net
Reported by:        g dot huebgen at arcor dot de
Summary:            case insensitive search of stripos does not work when searching äöü in utf-8
Status:             Bogus
Type:               Bug
Package:            Strings related
Operating System:   Linux
PHP Version:        5.3.6
Block user comment: N
Private report:     N

New Comment:

That means your string is not actually in UTF-8. utf8_decode() converts text in ISO-8859-1 to UTF-8. You stated initially that you had text encoded in UTF-8.

Previous Comments:
------------------------------------------------------------------------
[2011-05-08 20:17:42] g dot huebgen at arcor dot de

Hi rasmus.
Now I tried mb_stripos but the result is not different to stripos.
The same program but using mb_stripos:

$text = file_get_contents("test-utf8.txt");
$str = "über";
if (($pos=mb_stripos($text,$str)) !== false)
    echo $str." found";
else
    echo $str." not found";

output is: not found!

If I use utf8_decode for both $text and $str then stripos will work properly.

------------------------------------------------------------------------
[2011-05-08 17:43:47] ras...@php.net

This is not a bug. The base string handling functions in PHP do not support multibyte character sets. Since UTF-8 is compatible with single-byte charsets at the low end, it may appear to work for UTF-8, but it will break as soon as you hit an actual mb character. You can use mb_stripos() in this case, or you can use the function overloading support in mbstring to make your stripos mb aware.

See http://de.php.net/manual/en/mbstring.overload.php

------------------------------------------------------------------------
[2011-05-08 17:00:44] g dot huebgen at arcor dot de

Description:
------------
---
From manual page: http://www.php.net/function.stripos#Description
---
If some text is encoded in UTF-8 and I search this text with stripos for a string with (e.g.) lower case Umlaut (e.g. ü), this function does not find the upper-case Umlaut (Ü).
That means case-insensitive search does not work for Umlauts if a text file is encoded in UTF-8.

Test script:
---------------
File test.txt contains "Übermut" and is encoded UTF-8 without BOM

<?php
$text = file_get_contents("test.txt");
echo $text."<br>";
$str = "über";
if (($pos=stripos($text,$str)) !== false)
    echo $str." gefunden";
else
    echo $str." nicht gefunden";
?>

Expected result:
----------------
Übermut
über gefunden

Actual result:
--------------
Übermut
über nicht gefunden

------------------------------------------------------------------------
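The thread turns on whether both the haystack and the needle really are UTF-8, and on whether the search function is told so. The following is only a minimal sketch under stated assumptions, not the reporter's script: it writes both strings as explicit byte escapes so that the encoding of the source file does not matter, and it passes 'UTF-8' as the fourth argument of mb_stripos() instead of relying on the configured mbstring internal encoding. The helper name contains_ci_utf8 is hypothetical and used for illustration only.

```php
<?php
// Hypothetical helper (not from the report): case-insensitive search with
// the encoding passed explicitly rather than taken from the mbstring
// internal encoding setting.
function contains_ci_utf8($haystack, $needle)
{
    return mb_stripos($haystack, $needle, 0, 'UTF-8') !== false;
}

// Byte escapes keep the sample independent of how this file is saved.
$text = "\xC3\x9Cbermut"; // "Übermut" encoded as UTF-8
$str  = "\xC3\xBCber";    // "über" encoded as UTF-8

// Check that the input is valid UTF-8 before drawing conclusions.
var_dump(mb_check_encoding($text, 'UTF-8'));

// Byte-oriented search, as in the original test script.
var_dump(stripos($text, $str));

// Multibyte-aware search with an explicit encoding.
var_dump(mb_stripos($text, $str, 0, 'UTF-8'));

var_dump(contains_ci_utf8($text, $str));
```

With these inputs the explicit-encoding mb_stripos() call should report a match at position 0. If it does not match on real data, a reasonable next step is to check the file contents and the script's own encoding with mb_check_encoding(), since either one may not be UTF-8.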