Edit report at http://bugs.php.net/bug.php?id=54028&edit=1

 ID:                 54028
 Updated by:         paj...@php.net
 Reported by:        schmale at froglogic dot com
 Summary:            Directory::read() cannot handle non-unicode chars
                     properly
-Status:             Open
+Status:             Bogus
 Type:               Bug
 Package:            Directory function related
 Operating System:   Windows 7
 PHP Version:        5.3.5
 Block user comment: N
 Private report:     N

 New Comment:

There is already a feature request for Unicode filesystem support.

Btw, Windows does not use UTF-8 for its encoding.
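
To illustrate the encoding mismatch, here is a minimal workaround sketch. It assumes that entry names which are not valid UTF-8 come back in the system's ANSI code page, and it further assumes that code page is Windows-1252 (typical for Western European setups, but not guaranteed). The helper name entry_to_utf8() and its $assumedEncoding parameter are hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: normalise a directory entry name to UTF-8.
// ASSUMPTION: names that are not valid UTF-8 are encoded in Windows-1252.
// Adjust $assumedEncoding to match the code page of the actual system.
function entry_to_utf8($name, $assumedEncoding = 'Windows-1252')
{
    // Plain ASCII and already-valid UTF-8 names pass through unchanged.
    if (mb_check_encoding($name, 'UTF-8')) {
        return $name;
    }
    // Otherwise reinterpret the bytes using the assumed code page.
    return mb_convert_encoding($name, 'UTF-8', $assumedEncoding);
}
```

In the reporter's loop this would be applied as $content = entry_to_utf8($directory->read()); after the false check. Note that the validity check is only a heuristic: a byte sequence in a single-byte code page can happen to be valid UTF-8, in which case it is passed through unconverted.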


Previous Comments:
------------------------------------------------------------------------
[2011-02-15 16:51:20] schmale at froglogic dot com

Description:
------------
Note: This problem affects ONLY the CLI interpreter, NOT the CGI.

Using dir('path/to/dir'), the read() method does not return UTF-8 if
the directory contains e.g. umlauts (ä, ö, ü). I tested this on Linux
and Windows, with both CGI and CLI, and the problem occurs only with
Windows/CLI.

Test script:
---------------
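<?php
// Directory whose entries include names with umlauts.
$path = 'path/to/directory/which/contains/umlauts';

$directory = dir($path);
while (false !== ($content = $directory->read())) {
    // Report every entry name that is not valid UTF-8.
    if (mb_check_encoding($content, 'UTF-8') === false) {
        fprintf(STDERR, "Returned non-utf-8 (%s)\n", $content);
    }
}
$directory->close();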

Expected result:
----------------
The expected result was that the return value of read() is always
encoded in UTF-8, i.e. no messages are printed when we run the
script.

Actual result:
--------------
If a subdirectory name contains umlauts (or, I guess, any non-ASCII
character), a message is printed, i.e. the return value is not encoded
in UTF-8.


------------------------------------------------------------------------