ID:               30887
Updated by:       [EMAIL PROTECTED]
Reported By:      maddam at volny dot cz
-Status:          Open
+Status:          Bogus
Bug Type:         *XML functions
Operating System: Win XP
PHP Version:      5.0.2
New Comment:

Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://www.php.net/manual/ and the instructions on how to report a bug at http://bugs.php.net/how-to-report.php

This is expected, and definitely not wrong. Nothing says that XML parsers can't break up character data, and you should never rely on getting only one event for each run of character data. This is just how an XML parser might handle it (and the new libxml2 we have in PHP 5 does it like this).


Previous Comments:
------------------------------------------------------------------------

[2004-11-24 21:08:48] maddam at volny dot cz

Description:
------------
XML file element:

<data>Jak se máš holoubátko ?</data>

This is Czech text with special characters.

<?php
function characterData($parser, $data)
{
    $getdata = $data;
    echo $getdata;
}

The echo should show 'Jak se máš holoubátko ?'. Instead the parser stops, and $getdata contains only 'Jak se m'. On the next call for the same element the parser delivers the rest of the text, and $getdata contains 'áš holoubátko ?'.

This is a bug in 5.0.2. In 4.3.9 and earlier everything works. I have not tested 5.0.0 and 5.0.1.

When the parser gets data containing national characters such as (ěščřžýáíé), it cuts the data into two parts. The first part contains the text up to the first occurrence of one of these characters, and the second part contains the rest of the element's text.

The problem does not appear with English text, which does not use characters such as ěščřžýáíé.

Sorry for my English, I hope you understand. Contact me at [EMAIL PROTECTED] or ICQ 25684007

Reproduce code:
---------------
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE rokam [
<!ELEMENT data (#PCDATA)>
]>
<rokam>
<data>Jak se máš</data>
<data>Zde doma je dobrý ocet</data>
</rokam>

<?php
function characterData($parser, $data)
{
    $getdata = $data;
    echo $getdata . "<br />";
}

Expected result:
----------------
Two calls to characterData, printing:

Jak se máš
Zde doma je dobrý ocet

Actual result:
--------------
With the 5.0.2 parser, four calls to characterData are needed, and the output is:

Jak se m
áš
Zde doma je dobr
ý ocet

-------------------------------------------------
The problem can be worked around with the following code, which joins the two parts delivered by the parser into one variable, $data. For example, it joins 'Jak se m' with 'áš'.

function characterData($parser, $data)
{
    global $currentTag;

    // <code for repair start>
    global $lastdata, $lastTag;
    if (strcmp($lastTag, $currentTag) == 0) {
        // Second part of the same element: join it to the stored first part.
        $data = $lastdata . trim($data);
        $lastdata = $lastTag = '';
    } else {
        // First part: store it and wait for the next call.
        $lastdata = $data;
        $lastTag = $currentTag;
        return;
    }
    // <code for repair end>

    // the normal code for function characterData can continue here
}

Note that trim($data) must be there: the parser adds CR (0x0D) LF (0x0A) (I think) to the end of the string, and it must be trimmed for the code to work properly.

------------------------------------------------------------------------
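
A more general alternative to the workaround above, in line with the comment that a handler may receive several events for one element: append every chunk to a buffer and only use the text when the end-element handler fires. The following is a minimal sketch, not code from the original report. The function name collectDataText is hypothetical, the input is assumed to be UTF-8, and the closures and type declarations assume a PHP version newer than the 5.0.2 discussed here.

```php
<?php
// Sketch: collect the full text of each <data> element, no matter how many
// character-data events the parser splits it into.
function collectDataText(string $xml): array
{
    $results = [];    // completed text of each <data> element
    $buffer  = '';    // chunks received so far for the current element
    $inData  = false; // true while we are inside a <data> element

    $parser = xml_parser_create('UTF-8');
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

    xml_set_element_handler(
        $parser,
        // Start tag: reset the buffer when a <data> element opens.
        function ($parser, $name, $attrs) use (&$buffer, &$inData) {
            if ($name === 'data') {
                $inData = true;
                $buffer = '';
            }
        },
        // End tag: the buffer now holds the complete text.
        function ($parser, $name) use (&$results, &$buffer, &$inData) {
            if ($name === 'data') {
                $results[] = $buffer;
                $inData = false;
            }
        }
    );

    xml_set_character_data_handler(
        $parser,
        // May be called several times per element: only append here.
        function ($parser, $data) use (&$buffer, &$inData) {
            if ($inData) {
                $buffer .= $data;
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }

    return $results;
}

$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<rokam><data>Jak se máš</data><data>Zde doma je dobrý ocet</data></rokam>';

foreach (collectDataText($xml) as $text) {
    echo $text, "\n";
}
```

Because the text is only used in the end-element handler, this does not depend on the data arriving in exactly two parts, and no trim() is needed to glue the parts together.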