Edit report at https://bugs.php.net/bug.php?id=63450&edit=1

 ID:                 63450
 Updated by:         ezy...@php.net
 Reported by:        trollofdarkness at gmail dot com
 Summary:            iconv returns false when illegal character
                     encountered
 Status:             Re-Opened
 Type:               Bug
 Package:            ICONV related
 Operating System:   Debian 5 Lenny
 PHP Version:        5.4.8
 Block user comment: N
 Private report:     N

 New Comment:

This is a dupe of https://bugs.php.net/bug.php?id=48147 (not that I don't
think it should be fixed!). Here is the glibc bug:
http://sourceware.org/bugzilla/show_bug.cgi?id=13541


Previous Comments:
------------------------------------------------------------------------
[2012-11-08 02:24:27] ahar...@php.net

Reopening per above. Anyone more familiar with iconv and the build system want 
to opine?

------------------------------------------------------------------------
[2012-11-07 21:58:24] trollofdarkness at gmail dot com

Hi,

So, I had a look at it and this is not a libiconv-related bug. It is a
glibc-related bug (so, iconv, but the glibc implementation), as I was not
using the GNU libiconv implementation but the glibc one.

Actually, I had version 2.7 of glibc. I tested on another machine, an Ubuntu
12.04 LTS server, where the glibc version was 2.14 and, indeed, the bug was
not present. So it appears to be fixed in recent versions of glibc.
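
To see which implementation a given PHP build is linked against, the iconv
extension's ICONV_IMPL and ICONV_VERSION constants can be printed. This is
only a minimal sketch; describe_iconv() is a hypothetical helper name used
for illustration, and the exact strings reported depend on the build.

```php
<?php
// describe_iconv() is a hypothetical helper, not part of PHP.
// ICONV_IMPL and ICONV_VERSION are constants defined by the iconv extension.
function describe_iconv()
{
    if (!extension_loaded('iconv')) {
        return 'iconv extension not loaded';
    }
    // Typically reports something like "glibc" or "libiconv" plus a version.
    return sprintf('iconv implementation: %s, version: %s', ICONV_IMPL, ICONV_VERSION);
}

echo describe_iconv(), PHP_EOL;
```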

To correct the problem on Debian, you can recompile PHP to use the libiconv 
implementation instead of the glibc one.

But it is NOT quite easy, because PHP looks for the glibc implementation
BEFORE libiconv and selects it if present... whatever --with-iconv=something
parameter you pass when running ./configure.

I used the solution presented here:
<http://stackoverflow.com/questions/4743080/how-can-i-force-php-to-use-the-libiconv-version-of-iconv-instead-of-the-centos-i/4851065#4851065>
and, as one of the comments states, I had to change the global configure file
and not (only) the one in ext/iconv. (Note that you first have to actually
download libiconv and compile it... but that's just wget && ./configure &&
make && make install.)

I now have the libiconv implementation in use and it's working perfectly.

I strongly think PHP should change the behaviour of the configure file: we
should not have to edit it to use the libiconv implementation; we should just
be able to use the right configure parameter!

------------------------------------------------------------------------
[2012-11-06 22:19:13] trollofdarkness at gmail dot com

Hi Rasmus,

Thanks for your help!

I will have a look at that right away and will post an update to say whether
downgrading libiconv works.

------------------------------------------------------------------------
[2012-11-06 21:54:00] ras...@php.net

This is not a PHP issue. This is a change in recent versions of libiconv. If
you link PHP against an older version of libiconv it will work again, or you
can use mb_convert_encoding(). And we have a new uconverter extension feature
coming that will do a better job than either of these. See
https://wiki.php.net/rfc/uconverter
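
A minimal sketch of the mbstring fallback, assuming the mbstring extension is
available: try iconv() with //IGNORE first and use mbstring's
mb_convert_encoding() only when iconv() returns false. convert_lossy() is a
hypothetical helper name, not part of PHP.

```php
<?php
// convert_lossy() is a hypothetical helper used only for illustration.
function convert_lossy($str, $from, $to)
{
    // "@" hides the notice iconv() raises when it hits an illegal character.
    $out = @iconv($from, $to . '//IGNORE', $str);
    if ($out !== false) {
        return $out;
    }
    if (function_exists('mb_convert_encoding')) {
        // Note the argument order: target encoding first, then source.
        return mb_convert_encoding($str, $to, $from);
    }
    return false; // neither path produced a result
}

var_dump(convert_lossy("foo\n\xC3\xA8\nfoo\n", 'UTF-8', 'ISO-8859-15'));
```

The two paths are not strictly equivalent: by default mbstring substitutes a
replacement character for anything it cannot represent rather than dropping
it, which can be changed with mb_substitute_character().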

------------------------------------------------------------------------
[2012-11-06 21:45:33] trollofdarkness at gmail dot com

Description:
------------
Hi everyone,

Since, I think, version 5.3.x came out (and still with 5.4.8), I have been
experiencing issues with iconv.


Specifically, when an illegal character is encountered and the //IGNORE flag
is set on the target charset, the function returns FALSE instead of just
skipping this character.

This is problematic because if a single character in a 50,000-character
string is "illegal", the whole output is lost, just because of one character...

It does not happen with the TRANSLIT flag.

I experienced that with the UTF-8 (from) and ISO-8859-15 (to) charsets; I did
not test with other ones. Below is an example to reproduce the bug.

Note: I saw there are other bug reports about similar issues, but they all
say the string is cut... In my case, it literally returns false. So, it might
be different?

Test script:
---------------
<?php

$str = "
foo
è
foo
";
$result = iconv("UTF-8", "ISO-8859-15"."//IGNORE", $str);

var_dump($result); // false, instead of "foo ... foo"

?>

Expected result:
----------------
foo

foo


Actual result:
--------------
false


------------------------------------------------------------------------


