[PHP-BUG] Bug #61408 [NEW]: Character set failure

2012-03-15 Thread inge at upandforward dot com
From: 
Operating system: Linux Ubuntu
PHP version:  5.3.10
Package:  Strings related
Bug Type: Bug
Bug description: Character set failure

Description:

---
From manual page:
http://www.php.net/function.basename#refsect1-function.basename-seealso
---
Even though I have called setlocale(LC_ALL, 'nb_NO.utf8');
the basename function strips off some national characters.


Test script:
---
echo basename('/directory/Øving.png');

Expected result:

Øving.png

Actual result:
--
ving.png




Bug #61408 [Fbk->Opn]: Character set failure

2012-03-15 Thread inge at upandforward dot com

 ID:                 61408
 User updated by:    inge at upandforward dot com
 Reported by:        inge at upandforward dot com
 Summary:            Character set failure
-Status:             Feedback
+Status:             Open
 Type:               Bug
 Package:            Strings related
 Operating System:   Linux Ubuntu
-PHP Version:        5.3.10
+PHP Version:        5.3.5-1ubuntu7.7
 Block user comment: N
 Private report:     N

 New Comment:

setlocale() gives no visible return value (it may be false or an empty string),
and locale -a does not include "nb_NO". However, the site is used in Norway by
Norwegian users, so it must be able to handle Norwegian characters. The basename
function should not make such distinctions, so I have written my own, which
works independently of locale and character set.
The PHP version has been corrected above because the bug report form did not
offer that choice. My suggested replacement could not be uploaded because the
site did not accept a text file, even though it stated that this was the only
file type accepted. I also had great trouble getting the right CAPTCHA (Oh, yes,
I can add numbers! :) )


Previous Comments:

[2012-03-15 21:28:20] cataphr...@php.net

What's the return value of setlocale? Does locale -a show that locale?
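
The two checks asked for above can be sketched in a few lines. This is only an
illustration: try_locale() is a hypothetical helper, and the spelling variants
in the candidate list are assumptions about what a given system might call the
locale. setlocale() itself returns the name of the locale it applied, or false
when none of the requested locales is available.

```php
<?php
// Hypothetical helper (not a PHP built-in): try each candidate locale name
// in turn and return the one that setlocale() accepted, or false if none did.
function try_locale(array $candidates)
{
    foreach ($candidates as $candidate) {
        $applied = setlocale(LC_ALL, $candidate);
        if ($applied !== false) {
            return $applied;
        }
    }
    return false;
}

// The candidate spellings below are assumptions; `locale -a` lists the real ones.
$applied = try_locale(array('nb_NO.utf8', 'nb_NO.UTF-8'));
if ($applied === false) {
    echo "Norwegian locale not available; check the output of `locale -a`\n";
} else {
    echo "Locale in effect: $applied\n";
}
```

If the first branch is taken, the locale still has to be installed on the
system before basename() can be expected to honour it.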




Bug #61408 [Com]: Character set failure

2012-03-15 Thread inge at upandforward dot com

 ID:                 61408
 Comment by:         inge at upandforward dot com
 Reported by:        inge at upandforward dot com
 Summary:            Character set failure
 Status:             Open
 Type:               Bug
 Package:            Strings related
 Operating System:   Linux Ubuntu
 PHP Version:        5.3.5-1ubuntu7.7
 Block user comment: N
 Private report:     N

 New Comment:

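Here's a replacement, but without the $suffix parameter:

function IV_basename($name)
{   // Byte-wise search for the last separator; no locale is consulted.
    $tail = strrchr($name, DIRECTORY_SEPARATOR);
    // No separator at all: the name is already a bare file name.
    if ($tail === false) return $name;
    return substr($tail, 1);
}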
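
For completeness, here is one way the missing $suffix parameter might be added.
This is a sketch under stated assumptions, not the reporter's code: the name
iv_basename_suffix() is made up, the separator is hard-coded to "/" (the report
is about Linux), and the source file is assumed to be saved as UTF-8. Like
basename(), it leaves the name alone when the suffix would consume all of it.

```php
<?php
// Hypothetical helper, not part of PHP: byte-wise basename with optional suffix.
function iv_basename_suffix($path, $suffix = '')
{
    // Ignore trailing separators, as basename() does.
    $path = rtrim($path, '/');
    $pos  = strrpos($path, '/');
    $name = ($pos === false) ? $path : substr($path, $pos + 1);

    // Strip the suffix only when it is a proper tail of the name.
    $len = strlen($suffix);
    if ($len > 0 && $len < strlen($name) && substr($name, -$len) === $suffix) {
        $name = substr($name, 0, -$len);
    }
    return $name;
}

echo iv_basename_suffix('/directory/Øving.png'), "\n";          // Øving.png
echo iv_basename_suffix('/directory/Øving.png', '.png'), "\n";  // Øving
```

Because it compares bytes only, this is safe for UTF-8 paths but not for
encodings in which "/" can occur inside a multi-byte character.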




Bug #61408 [Nab]: Character set failure

2012-03-16 Thread inge at upandforward dot com

 ID:                 61408
 User updated by:    inge at upandforward dot com
 Reported by:        inge at upandforward dot com
 Summary:            Character set failure
 Status:             Not a bug
 Type:               Bug
 Package:            Strings related
 Operating System:   Linux Ubuntu
 PHP Version:        5.3.5-1ubuntu7.7
 Block user comment: N
 Private report:     N

 New Comment:

Comment found on stackoverflow.com:

"If you ask me, setlocale() sucks ass. I am a friend of managing this manually
or using a library like Zend_Locale so you don't have to depend on locales being
installed on the server (and sometimes still not working)".

The client should NOT be expected to use the same locale as the server, but if
I could install the Norwegian locale, I think it might still be OK. So how do I
install the above-mentioned library, or some equivalent?


Previous Comments:

[2012-03-16 01:35:25] ahar...@php.net

There are encodings where 0x2f (/) and 0x5c (\) are valid bytes within
multi-byte characters, so basename() has to be locale-aware to deal with that.

Fundamentally, I don't really see why you would expect basename() to be able to 
deal sensibly with a configured locale that you don't have installed. Your test 
script behaves fine provided nb_NO.utf8 is available.

Not a bug -> closing.
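
The point about 0x5c can be illustrated with Shift_JIS, where the katakana
character "ソ" is encoded as the byte pair 0x83 0x5C. The sketch below is an
illustration only: has_separator_byte() is a hypothetical helper, not a PHP
built-in, and it simply looks for separator bytes without interpreting the
encoding.

```php
<?php
// Hypothetical helper (not a PHP built-in): does the byte string contain
// 0x2F (/) or 0x5C (\) anywhere, regardless of character boundaries?
function has_separator_byte($bytes)
{
    return strpbrk($bytes, "/\\") !== false;
}

$sjis_so = "\x83\x5C";   // katakana "ソ" encoded in Shift_JIS
$utf8_oe = "\xC3\x98";   // "Ø" encoded in UTF-8

var_dump(has_separator_byte($sjis_so));  // bool(true): second byte is 0x5C
var_dump(has_separator_byte($utf8_oe));  // bool(false): both bytes are >= 0x80
```

UTF-8 never reuses ASCII byte values inside a multi-byte sequence, which is why
a byte-wise split happens to be safe for UTF-8 paths but not in general.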



Bug #61408 [Nab]: Character set failure

2012-03-18 Thread inge at upandforward dot com

 ID:                 61408
 User updated by:    inge at upandforward dot com
 Reported by:        inge at upandforward dot com
 Summary:            Character set failure
 Status:             Not a bug
 Type:               Bug
 Package:            Strings related
 Operating System:   Linux Ubuntu
 PHP Version:        5.3.5-1ubuntu7.7
 Block user comment: N
 Private report:     N

 New Comment:

Already installed! :)
Anyway, I thought that UTF-8 was UTF-8, regardless of locale. Oh well. I'll
make do with my workaround.


Previous Comments:

[2012-03-19 00:23:27] ahar...@php.net

That's really a question for an Ubuntu support channel, but I think installing
the language-support-nb package would do it.




[PHP-BUG] Req #55774 [NEW]: Array index limitation

2011-09-24 Thread inge at upandforward dot com
From: 
Operating system: php 5.3.5-1ubuntu7.2
PHP version:  5.3.8
Package:  *General Issues
Bug Type: Feature/Change Request
Bug description: Array index limitation

Description:

Array indexes, although they can be strings, are not allowed to contain
national characters, even if encoded as UTF-8.
Thus, a string like "øvinger" becomes "vinger", and "Æsop" becomes
"sop".
This is very unfortunate.

My example builds an array of "name only" indexes, all in lower case.
Each entry of the array contains the complete filename of the corresponding
file.
This serves as a fast and relatively safe method to find the correct path
to a file, regardless of case.

I tested using, among others, a file named "php/Øvinger.php".
DS is the directory separator (/),
$filetypes is an array containing the file types to search for (like "php"), and
lowercase and name_only should be self-explanatory.

Test script:
---
//  First time only: find the *.php, *.inc and *.txt files in
//  the 'php', 'inc' and 'txt' directories.

if (!isset($_SESSION['long']))
{   $_SESSION['long'] = array();
    foreach ($filetypes as $dir)
    {   $files = glob($dir.DS."*.$dir");
        // Save the full path of each file under its lower-cased bare name.
        // This means that we won't have to search any more.
        foreach ($files as $file)
        {   $name = lowercase(name_only($file));
            $_SESSION['long'][$name] = $file;
        }
    }
}


Expected result:

$_SESSION['long']['øvinger'] contains "php/Øvinger.php"




Actual result:
--
$_SESSION['long']['vinger'] contains "php/Øvinger.php"





Bug #55774 [Fbk->Opn]: Array index limitation

2011-09-25 Thread inge at upandforward dot com

 ID:                 55774
 User updated by:    inge at upandforward dot com
 Reported by:        inge at upandforward dot com
 Summary:            Array index limitation
-Status:             Feedback
+Status:             Open
 Type:               Bug
 Package:            Scripting Engine problem
 Operating System:   php 5.3.5-1ubuntu7.2
 PHP Version:        5.3.8
 Block user comment: N
 Private report:     N

 New Comment:

You are right. This was a case of jumping to conclusions.
The value used as a key was already wrong, and I should have checked that.
The problem is associated with the function basename, but I have also not been 
able to reproduce this in a standalone script, so it must be a side-effect from 
something else. I need to investigate further.

Sorry to have bothered you. I really thought I had done enough testing! :)


Previous Comments:

[2011-09-24 18:58:15] ahar...@php.net

Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves. 

A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external 
resources such as databases, etc. If the script requires a 
database to demonstrate the issue, please make sure it creates 
all necessary tables, stored procedures etc.

Please avoid embedding huge scripts into the report.

I can't reproduce this at all in a standalone script:
http://codepad.viper-7.com/pZswtF shows an example of a UTF-8 encoded
array key being set properly. (I wrote another test that persists a
similar array key across multiple pages via $_SESSION, and that worked
as expected too.)
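
A standalone check along these lines might look like the following. It is a
sketch, not the script behind the codepad link above; the file names are taken
from the report and the source file is assumed to be saved as UTF-8. PHP arrays
treat string keys as plain byte strings, so multi-byte keys are stored unchanged.

```php
<?php
// Sketch: UTF-8 strings are ordinary byte strings to PHP arrays,
// so they work as keys without any special handling.
$index = array();
$index['øvinger'] = 'php/Øvinger.php';
$index['æsop']    = 'php/Æsop.php';

var_dump(isset($index['øvinger']));   // bool(true)
var_dump(array_keys($index));         // both keys, bytes intact
var_dump(strlen('øvinger'));          // int(8): "ø" takes two bytes in UTF-8
```

If a key comes out as "vinger" in a larger script, the value was already
damaged before it reached the array.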




Bug #55774 [Opn]: Array index limitation

2011-09-25 Thread inge at upandforward dot com

 ID:                 55774
 User updated by:    inge at upandforward dot com
 Reported by:        inge at upandforward dot com
 Summary:            Array index limitation
 Status:             Open
 Type:               Bug
 Package:            Scripting Engine problem
 Operating System:   php 5.3.5-1ubuntu7.2
 PHP Version:        5.3.8
 Block user comment: N
 Private report:     N

 New Comment:

I think I have found the problem, but no solution. The following test script 
functions perfectly when run stand-alone, but "basename" fails when run from 
Apache 2.0.
Are you able to reproduce the error?






Bug #55774 [Opn]: Array index limitation

2011-09-25 Thread inge at upandforward dot com

 ID:                 55774
 User updated by:    inge at upandforward dot com
 Reported by:        inge at upandforward dot com
 Summary:            Array index limitation
 Status:             Open
 Type:               Bug
 Package:            Scripting Engine problem
 Operating System:   php 5.3.5-1ubuntu7.2
 PHP Version:        5.3.8
 Block user comment: N
 Private report:     N

 New Comment:

For now I have made a work-around which works in all cases, using this function:

function name_only($file)
{   // Remove extension and path.
    $from[] = '.'.substr(strrchr($file,'.'),1);   // the extension, e.g. ".php"
    $from[] = dirname($file).'/';                 // the path, e.g. "php/"
    return str_replace($from,'',$file);
}
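
The work-around above removes every occurrence of the extension and path
strings, so a name such as "inc/a.inc.inc" would lose both ".inc" parts. A
variant that cuts only at the last separator and the last dot could look like
this. It is a sketch with assumptions: name_only_bytes() is a made-up name, the
separator is hard-coded to "/", and the path is assumed to be UTF-8 so that a
byte-wise search is safe.

```php
<?php
// Hypothetical variant of name_only(): strips the directory part and the
// last extension using byte positions only, without consulting the locale.
function name_only_bytes($file)
{
    // Keep only what follows the last "/", if there is one.
    $slash = strrpos($file, '/');
    $name  = ($slash === false) ? $file : substr($file, $slash + 1);

    // Cut at the last dot, but leave dotfiles such as ".htaccess" intact.
    $dot = strrpos($name, '.');
    if ($dot !== false && $dot > 0) {
        $name = substr($name, 0, $dot);
    }
    return $name;
}

echo name_only_bytes('php/Øvinger.php'), "\n";   // Øvinger
echo name_only_bytes('inc/a.inc.inc'), "\n";     // a.inc
```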




Bug #55774 [Opn]: Array index limitation

2011-09-25 Thread inge at upandforward dot com

 ID:                 55774
 User updated by:    inge at upandforward dot com
 Reported by:        inge at upandforward dot com
 Summary:            Array index limitation
 Status:             Open
 Type:               Bug
 Package:            Scripting Engine problem
 Operating System:   php 5.3.5-1ubuntu7.2
 PHP Version:        5.3.8
 Block user comment: N
 Private report:     N

 New Comment:

The original code, using "basename", fails only on my local server, NOT when
executed on my webhost.

Does that mean this is not a PHP bug?

