Edit report at http://bugs.php.net/bug.php?id=54688&edit=1

 ID:                 54688
 User updated by:    g dot huebgen at arcor dot de
 Reported by:        g dot huebgen at arcor dot de
 Summary:            case insensitive search of stripos does not work
                     when searching äöü in utf-8
 Status:             Bogus
 Type:               Bug
 Package:            Strings related
 Operating System:   Linux
 PHP Version:        5.3.6
 Block user comment: N
 Private report:     N

 New Comment:

The description of utf8_decode() clearly states that this function
decodes UTF-8 text. The manual says:

"utf8_decode — Converts a string with ISO-8859-1 characters encoded
with UTF-8 to single-byte ISO-8859-1"

So my text is indeed in UTF-8, and my remark about utf8_decode() only
confirms what rasmus said in comment #1.


Previous Comments:
------------------------------------------------------------------------
[2011-05-08 20:25:23] ras...@php.net

That means your string is not actually in UTF-8. utf8_decode() converts
text in ISO-8859-1 to UTF-8. You stated initially that you had text
encoded in UTF-8.

------------------------------------------------------------------------
[2011-05-08 20:17:42] g dot huebgen at arcor dot de

Hi rasmus.

I have now tried mb_stripos, but the result is no different from stripos.

The same program but using mb_stripos:

$text = file_get_contents("test-utf8.txt");
$str = "über";

if (($pos=mb_stripos($text,$str)) !== false)
        echo $str." found";
else echo $str." not found";

output is: not found!



If I use utf8_decode for both $text and $str then stripos will work
properly.
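
One possible explanation for the mb_stripos() result above is that no
encoding was passed, so the function falls back to mb_internal_encoding(),
which is not necessarily UTF-8 on a given installation. The following is a
minimal sketch, not a confirmed diagnosis. It assumes the mbstring extension
is loaded, and it writes the strings as explicit UTF-8 bytes instead of
reading test-utf8.txt, so that the encoding of the script file itself does
not matter:

```php
<?php
// Assumption: mbstring is available. The strings are spelled out as UTF-8
// bytes so the example does not depend on how this file is saved.
$text = "\xC3\x9Cbermut"; // "Übermut" in UTF-8
$str  = "\xC3\xBCber";    // "über" in UTF-8

// Check that the haystack really is valid UTF-8 before searching.
$isUtf8 = mb_check_encoding($text, 'UTF-8');

// Make UTF-8 the default for mb_* calls that omit the encoding argument.
mb_internal_encoding('UTF-8');

$pos = mb_stripos($text, $str);

echo ($pos !== false) ? "found at offset $pos" : "not found";
```

If the search string comes from a script file saved in ISO-8859-1, it would
still not match UTF-8 text, so both sides have to be in the same encoding.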

------------------------------------------------------------------------
[2011-05-08 17:43:47] ras...@php.net

This is not a bug. The base string handling functions in PHP do not
support multibyte character sets. Since UTF-8 is compatible with
single-byte charsets at the low end, it may appear to work for UTF-8,
but it will break as soon as you hit an actual mb character. You can use
mb_stripos() in this case, or you can use the function overloading
support in mbstring to make your stripos mb aware.

See http://de.php.net/manual/en/mbstring.overload.php
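
As an illustration of the byte-level difference described above, here is a
small sketch that compares stripos() with mb_stripos() called with an
explicit 'UTF-8' encoding argument. It assumes the mbstring extension is
loaded; the variable names are only for this example.

```php
<?php
// Assumption: mbstring is available. Strings are given as raw UTF-8 bytes.
$text = "\xC3\x9Cbermut"; // "Übermut": Ü is the two bytes C3 9C
$str  = "\xC3\xBCber";    // "über":    ü is the two bytes C3 BC

// Show the underlying bytes; the second byte of Ü and ü differs.
$textHex = bin2hex($text);
$strHex  = bin2hex($str);

// Byte-oriented search: it does not know that 9C and BC belong to the
// same letter in different case, so it finds no match here.
$bytePos = stripos($text, $str);

// Multibyte-aware search with the encoding stated explicitly.
$mbPos = mb_stripos($text, $str, 0, 'UTF-8');

var_dump($textHex, $strHex, $bytePos, $mbPos);
```

Passing the encoding explicitly avoids depending on whatever
mb_internal_encoding() happens to be configured on the server.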

------------------------------------------------------------------------
[2011-05-08 17:00:44] g dot huebgen at arcor dot de

Description:
------------
---

From manual page: http://www.php.net/function.stripos#Description

---

If some text is encoded in UTF-8 and I search this text with stripos for
a string with (e.g.) lower case Umlaut (e.g. ü), this function does not
find the upper-case Umlaut (Ü). That means case-insensitive does not
work for Umlauts if a text file is encoded UTF-8.

Test script:
---------------
File test.txt contains "Übermut" and is encoded UTF-8 without BOM

<?php
$text = file_get_contents("test.txt");
echo $text."<br>";

$str = "über";

if (($pos=stripos($text,$str)) !== false)
        echo $str." gefunden";
else echo $str." nicht gefunden";
?>

Expected result:
----------------
Übermut

über gefunden 

Actual result:
--------------
Übermut

über nicht gefunden 


------------------------------------------------------------------------


