Edit report at http://bugs.php.net/bug.php?id=54028&edit=1
 ID:                 54028
 Updated by:         paj...@php.net
 Reported by:        schmale at froglogic dot com
 Summary:            Directory::read() cannot handle non-unicode chars properly
 Status:             Bogus
 Type:               Bug
 Package:            Directory function related
 Operating System:   Windows 7
 PHP Version:        5.3.5
 Block user comment: N
 Private report:     N

 New Comment:

Sorry, but I don't have any more ways to explain why it could work for one case or another. This is not a bug but a feature request for Unicode support.

Previous Comments:
------------------------------------------------------------------------
[2011-02-25 14:16:37] carsten_sttgt at gmx dot de

> PHP relies on the ANSI APIs and the encoding is then the runtime encoding
> (whatever is set for the running process or system-wide).

"Startmenü" can be accessed without any problems through the ANSI API. An "ü" exists in CP437, CP850 and CP1252 (just use the chcp command), so I'm not talking about Unicode. You can also test this with a small C program.

So why is
| php -r "echo realpath('.');"
returning false?

------------------------------------------------------------------------
[2011-02-25 13:56:58] paj...@php.net

I'm not sure what else I should say to explain what is possible and what is not.

Last attempt:

Unless you know with 100% certainty which runtime encoding is actually used by the process where PHP runs, you are out of luck and have to use ASCII (if you are lucky, maybe ANSI too). But anything related to Unicode does not work, period. That holds even if it seems to work from time to time because the encodings happen to be similar, or close enough.

------------------------------------------------------------------------
[2011-02-25 13:52:29] carsten_sttgt at gmx dot de

> Windows supports UCS-2 internally via the wide char APIs.

I know... I'm just wondering why "mb_detect_encoding($content)" is returning 'UTF-8' while "mb_check_encoding($content, 'UTF-8')" is returning FALSE?

Also I think there is another problem:
| C:\Users\Carsten Wiedmann>php -r "echo realpath('.');"
| C:\Users\Carsten Wiedmann
| C:\Users\Carsten Wiedmann>cd Startmenü
|
| C:\Users\Carsten Wiedmann\Startmenü>php -r "echo realpath('.');"
|
| C:\Users\Carsten Wiedmann\Startmenü>

Regards,
Carsten

------------------------------------------------------------------------
[2011-02-25 13:32:49] paj...@php.net

There is no UTF-8 support in the Windows APIs or in PHP for the file system APIs.

Windows supports UCS-2 internally via the wide char APIs. PHP relies on the ANSI APIs, so the encoding is the runtime encoding (whatever is set for the running process or system-wide).

The feature request I was referring to is about making PHP use the wide char API and accept UTF-8 as input (and output).

------------------------------------------------------------------------
[2011-02-25 13:29:15] carsten_sttgt at gmx dot de

| and the problem does only occur with Windows/CLI.

I see no difference between CGI and CLI (both executed from the shell).

However, something is curious:

<?php
$directory = dir(getenv('USERPROFILE'));
while (false !== ($content = $directory->read())) {
    if (mb_check_encoding($content, 'UTF-8') === false) {
        printf('Returned non-utf-8 (%s)', $content);
        printf(" Encoding: %s\r\n", mb_detect_encoding($content));
    }
}
?>

And the output is:
Returned non-utf-8 (Startmenü) Encoding: UTF-8

Regards,
Carsten

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at

    http://bugs.php.net/bug.php?id=54028
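
Note on the mb_check_encoding() / mb_detect_encoding() discrepancy reported above: a likely explanation is that mb_detect_encoding(), when called without its strict argument, returns the closest match from its candidate list rather than a validated result, so it can answer 'UTF-8' for a string that mb_check_encoding() correctly rejects. The sketch below reproduces this with a hard-coded byte string instead of a real directory listing. It assumes the entry names arrive in Windows-1252 (the ANSI code page on a Western European Windows install); that code page and the helper name entry_to_utf8() are assumptions for illustration, not part of PHP or of the original report.

```php
<?php
// Hypothetical helper, not part of PHP or of the bug report.
// Assumption: names returned by Directory::read() are encoded in the
// ANSI code page given in $ansiCodePage (Windows-1252 by default here).
function entry_to_utf8($name, $ansiCodePage = 'Windows-1252')
{
    // Valid UTF-8 (which includes plain ASCII) is returned unchanged.
    if (mb_check_encoding($name, 'UTF-8')) {
        return $name;
    }
    // Otherwise reinterpret the bytes in the assumed ANSI code page.
    return mb_convert_encoding($name, 'UTF-8', $ansiCodePage);
}

// "Startmenü" as a Windows-1252 byte string: the ü is the single byte 0xFC.
$raw = "Startmen\xFC";

// 0xFC is not a valid UTF-8 sequence, so validation fails.
var_dump(mb_check_encoding($raw, 'UTF-8'));

// Strict detection refuses to guess instead of reporting the closest match.
var_dump(mb_detect_encoding($raw, 'UTF-8', true));

// After conversion the ü becomes the two-byte UTF-8 sequence C3 BC.
echo bin2hex(entry_to_utf8($raw)), "\n";
```

This only helps when the process code page is actually known, which is the limitation described in the comments above. A name that cannot be represented in the ANSI code page is already lost by the time PHP sees it, so the conversion is not a substitute for wide char API support.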