[PHP-BUG] Req #60429 [NEW]: utf8_encode and utf8_decode functions should be deprecated

2011-12-01 Thread deceze at gmail dot com
From: 
Operating system: 
PHP version:  Irrelevant
Package:  *XML functions
Bug Type: Feature/Change Request
Bug description: utf8_encode and utf8_decode functions should be deprecated

Description:

The purpose of the functions utf8_encode and utf8_decode is time and again
misunderstood, and they have probably caused more encoding-related problems
than they have solved. The biggest reason for this is their naming. Their
purpose is to *convert* the encoding of a string from ISO-8859-1 to UTF-8
and back, yet they are named in a way that suggests some other magical
function that is necessary to work with UTF-8 text. Users looking for
"UTF-8 support" in their app quickly find these functions due to their
naming and use them without understanding what they do, often only testing
with ASCII text, which appears to work fine at first sight.
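
To illustrate the pitfall, here is a minimal sketch of what happens when the
ISO-8859-1 to UTF-8 conversion is applied to text that is already UTF-8. It
assumes the mbstring extension is loaded and uses mb_convert_encoding() with
an explicit source encoding as a stand-in for utf8_encode(); the variable
names are illustrative only.

```php
<?php
// "é" already encoded as UTF-8 (two bytes: C3 A9).
$utf8 = "\xC3\xA9";

// Converting "from ISO-8859-1" treats each input byte as one Latin-1
// character, so both bytes are re-encoded separately (double encoding).
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";   // c3a9
echo bin2hex($double), "\n"; // c383c2a9

// Pure ASCII input passes through unchanged, which hides the problem
// when testing with ASCII text only.
$ascii = mb_convert_encoding('hello', 'UTF-8', 'ISO-8859-1');
var_dump($ascii === 'hello'); // bool(true)
```

The ASCII case comes back byte-for-byte identical, which is why a quick test
can look fine even though non-ASCII input is being corrupted.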

Why is ISO-8859-1 presumed to be the default encoding when converting to
UTF-8, and why do these functions therefore occupy such a prominent spot in
the namespace? There's simply no good reason for it.

The same functionality is available through iconv and mb_convert_encoding.
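
As a rough sketch of those equivalents, the following converts an
ISO-8859-1 string to UTF-8 and back with the source and target encodings
named explicitly. It assumes the iconv and mbstring extensions are
available; the variable names are made up for the example.

```php
<?php
// "café" in ISO-8859-1: the final byte E9 is "é".
$latin1 = "caf\xE9";

// ISO-8859-1 -> UTF-8, the conversion utf8_encode() performs.
$viaMb    = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$viaIconv = iconv('ISO-8859-1', 'UTF-8', $latin1);

echo bin2hex($viaMb), "\n";     // 636166c3a9
var_dump($viaMb === $viaIconv); // bool(true)

// UTF-8 -> ISO-8859-1, the conversion utf8_decode() performs.
$back = mb_convert_encoding($viaMb, 'ISO-8859-1', 'UTF-8');
var_dump($back === $latin1);    // bool(true)
```

Unlike the utf8_ pair, both calls state the source encoding at the call
site, so the assumption about the input is visible to the reader.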

Therefore I suggest slowly deprecating utf8_encode and utf8_decode to clear
up a recurring confusion and consolidate features into the existing, much
more versatile iconv and mb_ functions.


-- 
Edit bug report at https://bugs.php.net/bug.php?id=60429&edit=1



Bug #47990 [Com]: mb_check_encoding() accepts surrogates for UTF-8

2012-01-18 Thread deceze at gmail dot com
Edit report at https://bugs.php.net/bug.php?id=47990&edit=1

 ID:                 47990
 Comment by:         deceze at gmail dot com
 Reported by:        mercator+bugs at gmail dot com
 Summary:            mb_check_encoding() accepts surrogates for UTF-8
 Status:             Assigned
 Type:               Bug
 Package:            mbstring related
 Operating System:   Windows XP
 PHP Version:        5.2.9
 Assigned To:        moriyoshi
 Block user comment: N
 Private report:     N

 New Comment:

This seems to be fixed in PHP 5.3; it returns false as expected. Close?


Previous Comments:

[2009-04-16 15:53:35] mercator+bugs at gmail dot com

Description:

mb_check_encoding() wrongly considers surrogates (Unicode range U+D800 - 
U+DFFF) to be valid for the UTF-8 encoding.

Reproduce code:

var_dump(mb_check_encoding("\xed\xa0\x80",'UTF-8'));

Expected result:

bool(false)

Actual result:

bool(true)
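
For context on why false is the expected result, this small sketch decodes
the three bytes from the reproduce code by hand and shows that they encode
U+D800, the first code point of the surrogate range. It uses only core
functions; the variable names are illustrative.

```php
<?php
$bytes = "\xED\xA0\x80";

// Three-byte UTF-8 layout: 1110xxxx 10xxxxxx 10xxxxxx.
// Mask off the marker bits and join the payload bits.
$cp = ((ord($bytes[0]) & 0x0F) << 12)
    | ((ord($bytes[1]) & 0x3F) << 6)
    |  (ord($bytes[2]) & 0x3F);

printf("U+%04X\n", $cp); // U+D800

// Surrogates (U+D800 - U+DFFF) are not valid in UTF-8.
$isSurrogate = ($cp >= 0xD800 && $cp <= 0xDFFF);
var_dump($isSurrogate); // bool(true)
```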





