Edit report at http://bugs.php.net/bug.php?id=54028&edit=1

 ID:                 54028
 Updated by:         paj...@php.net
 Reported by:        schmale at froglogic dot com
 Summary:            Directory::read() cannot handle non-unicode chars
                     properly
 Status:             Bogus
 Type:               Bug
 Package:            Directory function related
 Operating System:   Windows 7
 PHP Version:        5.3.5
 Block user comment: N
 Private report:     N

 New Comment:

I'm not sure what else I can say to explain what is possible and what is
not.

Last attempt: unless you know with 100% certainty which runtime encoding is
actually used by the process PHP runs in, you are out of luck and have to
stick to ASCII (with luck, ANSI may work too).

Anything related to Unicode does not work, period, even if it sometimes
appears to work because the encodings involved happen to be similar or
close enough.
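
To illustrate the point about knowing the runtime encoding, here is a
minimal sketch. It assumes the ANSI code page of the PHP process is known to
be Windows-1252, which is only an assumption and differs between systems.
The helper name ansi_name_to_utf8() is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: convert a file name returned by PHP's ANSI-based
// file system functions to UTF-8. $codepage is an assumption supplied by
// the caller; the script cannot reliably discover it on its own.
function ansi_name_to_utf8($name, $codepage = 'Windows-1252')
{
    // Pure 7-bit ASCII names need no conversion: ASCII is a subset of UTF-8.
    if (preg_match('/^[\x00-\x7F]*$/', $name)) {
        return $name;
    }
    // Otherwise reinterpret the bytes using the assumed code page.
    return mb_convert_encoding($name, 'UTF-8', $codepage);
}

$raw  = "Startmen\xFC";           // "Startmenü" as Windows-1252 bytes
$utf8 = ansi_name_to_utf8($raw);  // the same name as UTF-8 bytes
var_dump(mb_check_encoding($utf8, 'UTF-8'));
```

If the assumed code page is wrong, the conversion still succeeds but yields
the wrong characters, which is why ASCII-only names are the only safe choice
when the runtime encoding is unknown.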


Previous Comments:
------------------------------------------------------------------------
[2011-02-25 13:52:29] carsten_sttgt at gmx dot de

> Windows supports UCS-2 internally via the wide char APIs.

I know... I'm just wondering why:

"mb_detect_encoding($content)" is returning 'UTF-8'
and
"mb_check_encoding($content, 'UTF-8')" is returning FALSE?

Also, I think there is another problem:

| C:\Users\Carsten Wiedmann>php -r "echo realpath('.');"
| C:\Users\Carsten Wiedmann
| C:\Users\Carsten Wiedmann>cd Startmenü
|
| C:\Users\Carsten Wiedmann\Startmenü>php -r "echo realpath('.');"
|
| C:\Users\Carsten Wiedmann\Startmenü>
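
A note on reading the empty output above: realpath() returns false on
failure, and echo prints false as an empty string. Whether that is what
happened in this transcript is an assumption. A small sketch that makes a
failure visible:

```php
<?php
// realpath() returns false on failure; echo would print that as an empty
// string, so an explicit check is needed to tell the two cases apart.
$resolved = realpath('.');
if ($resolved === false) {
    echo "realpath('.') failed\n";
} else {
    echo "realpath('.') = ", $resolved, "\n";
}
```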



Regards,

Carsten

------------------------------------------------------------------------
[2011-02-25 13:32:49] paj...@php.net

There is no UTF-8 support in the Windows APIs or in PHP for the file system
APIs.

Windows supports UCS-2 internally via the wide char APIs. PHP relies on the
ANSI APIs, so the encoding is the runtime encoding (whatever is set for the
running process or system-wide).

The feature request I was referring to is about making PHP use the wide char
APIs and accept UTF-8 as input (and output).

------------------------------------------------------------------------
[2011-02-25 13:29:15] carsten_sttgt at gmx dot de

| and the problem does only occur with Windows/CLI.



I see no difference between CGI and CLI (both executed from the shell).

However, something is curious:

<?php
$directory = dir(getenv('USERPROFILE'));
while (false !== ($content = $directory->read())) {
    if (mb_check_encoding($content, 'UTF-8') === false) {
        printf('Returned non-utf-8 (%s)', $content);
        printf(" Encoding: %s\r\n", mb_detect_encoding($content));
    }
}
?>



And the output is:

Returned non-utf-8 (Startmenü) Encoding: UTF-8
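
One possible reading of the output above, offered as an assumption rather
than a confirmed diagnosis: mb_check_encoding() validates the whole string,
while mb_detect_encoding() without strict mode may still report a candidate
encoding in which the string is not actually valid. The sketch below assumes
the name arrives as Windows-1252 bytes (ü = 0xFC) and compares the calls
with strict mode enabled:

```php
<?php
// Assumed input: "Startmenü" as Windows-1252 bytes, not valid UTF-8.
$content = "Startmen\xFC";

// Validates every byte against the named encoding.
$valid = mb_check_encoding($content, 'UTF-8');

// Strict mode (third argument true) only reports an encoding in which
// the string is fully valid, and returns false if none qualifies.
$strict = mb_detect_encoding($content, 'ASCII, UTF-8', true);

// Adding the assumed code page to the candidate list.
$withAnsi = mb_detect_encoding($content, 'UTF-8, Windows-1252', true);

var_dump($valid, $strict, $withAnsi);
```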





Regards,

Carsten

------------------------------------------------------------------------
[2011-02-15 17:10:43] schmale at froglogic dot com

Well, I don't know which encoding Windows uses, but I do know that it works
properly with the Windows CGI version. The point is that a directory called
'Startmenü' is returned as 'Startmenü' with Linux/CGI, Linux/CLI and
Windows/CGI, but NOT with Windows/CLI, which returns 'Startmenñæ' (or
something similar). In other words, the behaviour with Windows/CLI is broken,
whereas the other versions return the exact name of the directory, as
expected.

So I think it has little or nothing to do with Unicode filesystem support or
the encoding of Windows, but rather with differences between CGI and CLI.

------------------------------------------------------------------------
[2011-02-15 16:54:17] paj...@php.net

There is already a feature request for Unicode filesystem support.

Btw, Windows does not use UTF-8 for its encoding.

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    http://bugs.php.net/bug.php?id=54028

