ID:               29609
Updated by:       [EMAIL PROTECTED]
Reported By:      jon at hiveminds dot net
Status:           Bogus
Bug Type:         Strings related
Operating System: Windows 2000 / SP4
PHP Version:      5.0.0
New Comment:

Of course if you pass it as an iso-8859-2 character... In -2 it's at a
different position than in -15. See:
http://www.eki.ee/letter/chardata.cgi?cp=8859-2&cp1=8859-15
(It's position A9 in -2 and A6 in -15)


Previous Comments:
------------------------------------------------------------------------

[2004-08-11 14:16:52] jon at hiveminds dot net

This is a Unicode character as well as being present in ISO-8859-15
(aka Latin-9 -- see http://www.columbia.edu/kermit/latin9.html), and
the PHP Manual says that UTF-8 and ISO-8859-15 are both supposed to be
supported by this function. However, even when UTF-8 or ISO-8859-15 is
specified, this function still fails to make the expected conversion,
and createEntityReference() still returns multiple warnings when passed
the corresponding entity name, even though it shouldn't.

------------------------------------------------------------------------

[2004-08-11 13:26:50] [EMAIL PROTECTED]

This charset is not supported, as you can see here:
http://php.net/htmlentities

------------------------------------------------------------------------

[2004-08-11 11:40:58] jon at hiveminds dot net

Description:
------------
This string function fails on the "š" character (and other Latin-2
characters).

Test:    echo htmlentities("Ketšua");
Returns: Ketšua

Reproduce code:
---------------
Affected code:

    $value = htmlentities($value);
    preg_match_all("/&([^;]*);/", $value, $matches);
    $parts = preg_split("/&|;/", $value, -1, PREG_SPLIT_NO_EMPTY);

    foreach($parts as $part)
        $td->appendChild( in_array($part, $matches[1])
            ? $doc->createEntityReference($part)
            : $doc->createTextNode($part) );

Point of failure: the string "Ketšua"

Expected result:
----------------
Character should be converted to &Scaron; (upper) / &scaron; (lower)
per HTML spec (see special-1.ent listing). Failing this, should it not
be possible to create a TextNode containing this character when
specifying a Latin-2 or Unicode charset?

Actual result:
--------------
Function returns the literal "š" character with no conversion. It is
not possible to create a TextNode containing this character using
DOMDocument::createTextNode().

I have written a workaround which replaces the Latin-2 chars with their
entity equivalents using str_replace(), but even so PHP still issues
several warnings when sending output to the browser using
DOMDocument::saveHTML():

Warning: output conversion failed due to conv error in
O:\webs\mysqli\oop-multi-select-with-dom.php on line 172

Warning: Bytes: 0x92 0x49 0x76 0x6F in
O:\webs\mysqli\oop-multi-select-with-dom.php on line 172

Warning: xmlOutputBufferWrite: encoder error in
O:\webs\mysqli\oop-multi-select-with-dom.php on line 172

The correct entity (&scaron;) does get sent, but I have to suppress the
error with an "@", which I don't like doing.

------------------------------------------------------------------------
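
The thread comes down to a charset mismatch: the input bytes are ISO-8859-2, while htmlentities() only knows the charsets listed in the manual. One possible workaround, sketched below, is to re-encode the string to UTF-8 first and then tell htmlentities() that the input is UTF-8. The helper name to_entities() is hypothetical and not part of the original report, and the sketch assumes the iconv extension is available and a PHP version whose htmlentities() handles UTF-8 for this character.

```php
<?php
// Hypothetical helper (not from the original report): re-encode a byte
// string from $sourceCharset to UTF-8, then convert it to HTML entities.
function to_entities($bytes, $sourceCharset)
{
    // iconv() returns false if the conversion cannot be performed.
    $utf8 = iconv($sourceCharset, 'UTF-8', $bytes);
    if ($utf8 === false) {
        throw new InvalidArgumentException(
            "Cannot convert input from $sourceCharset"
        );
    }

    // Name the charset explicitly so htmlentities() does not have to guess.
    return htmlentities($utf8, ENT_QUOTES, 'UTF-8');
}

// "Ketšua" written as ISO-8859-2 bytes: the lowercase š is 0xB9 there
// (the uppercase Š is the 0xA9 mentioned in the comment above).
$latin2 = "Ket\xB9ua";

echo to_entities($latin2, 'ISO-8859-2'), "\n"; // prints "Ket&scaron;ua"
```

The input is written with explicit byte escapes so that the result does not depend on the encoding of the script file itself. The same UTF-8 string can also be passed to DOMDocument::createTextNode(), since the DOM extension works with UTF-8 internally; whether that removes the saveHTML() warnings quoted above has not been checked here.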