From:             maddam at volny dot cz
Operating system: Win XP
PHP version:      5.0.2
PHP Bug Type:     *XML functions
Bug description:  XML parser stops in character data at the first accented character (áéířčř and so on)

Description:
------------
xml file element:
<data>Jak se máš holoubátko ?</data>
This is Czech text with special (accented) characters.

<?php
function characterData($parser, $data)
{
    $getdata = $data;
    echo $getdata;
}

Here echo $getdata should show 'Jak se máš holoubátko ?'

But the parser stops, and $getdata contains only 'Jak se m'.
On the next call for the same element the parser delivers the rest of the
text, and $getdata contains 'áš holoubátko ?'

This is a bug in 5.0.2. In 4.3.9 and earlier everything is OK. I have not
tested 5.0.0 and 5.0.1.

Description: When the parser gets data with national characters such as
(ěščřžýáíé), it cuts the data into two parts. The first part consists of
the characters up to the first occurrence of one of these characters
(ěščřžýáíé), and the second part consists of the rest of the element.

THIS BUG DOES NOT SHOW UP FOR ENGLISH TEXT, WHICH DOES NOT USE CHARACTERS
SUCH AS ěščřžýáíé

Sorry for my English; I hope you understand. Contact me at [EMAIL PROTECTED]
or ICQ 25684007

Reproduce code:
---------------
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE rokam [
        <!ELEMENT data (#PCDATA)>
]>
<rokam>
 <data>Jak se máš</data>
 <data>Zde doma je dobrý ocet</data>
</rokam>

<?php
function characterData($parser, $data)
{
    $getdata = $data;
    echo $getdata . '<br />';
}

Expected result:
----------------
Echoed on screen, needing two calls to the characterData function:

Jak se máš
Zde doma je dobrý ocet

Actual result:
--------------
With 5.0.2 the parser needs four calls to the characterData function
and outputs:


Jak se m
áš
Zde doma je dobr
ý ocet
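
The difference between the expected and the actual result is the number of
handler calls, so it can help to log each call separately. The following is
only a sketch under some assumptions: collectChunks() is a hypothetical helper
name, the XML is inlined as UTF-8 instead of being read from the ISO-8859-1
file above, the DOCTYPE is left out, and closures require PHP 5.3 or later.
How many chunks come back depends on the PHP version and the underlying XML
library.

```php
<?php
// Hypothetical helper: returns every chunk of text that the parser
// passes to the character data handler, one array entry per call.
function collectChunks($xml)
{
    $chunks = array();
    $parser = xml_parser_create('UTF-8');

    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$chunks) {
            $chunks[] = $data; // record this call's data unchanged
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(
            xml_error_string(xml_get_error_code($parser))
        );
    }

    return $chunks;
}

// Assumption: this source file is saved as UTF-8.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<rokam><data>Jak se máš</data>'
     . '<data>Zde doma je dobrý ocet</data></rokam>';

// Print one numbered line per handler call.
foreach (collectChunks($xml) as $i => $chunk) {
    echo ($i + 1) . ': [' . $chunk . "]\n";
}
```

If the text of one element shows up on more than one numbered line, the
handler was called more than once for that element.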

-------------------------------------------------
This bug can be worked around with the following code, which joins the two
parts from the parser into one variable, $data. This code
joins 'Jak se m' with 'áš'.

function characterData($parser, $data) {
        global $currentTag;

// <code for repair start>
        global $lastdata, $lastTag;
        if (strcmp($lastTag, $currentTag) == 0) {
            $data = $lastdata . trim($data);
            $lastdata = $lastTag = '';
        }else{
            $lastdata = $data;
            $lastTag = $currentTag;
            return;
        }
// <code for repair end>

        // The normal code for function characterData continues here.
}

Note that trim($data) must be there: the parser adds CR (0x0D) LF (0x0A)
(I think) to the end of the first part of the string $data, and it must be
trimmed for the code to work properly.
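
A different way to join the parts, shown here only as a sketch and not as the
fix that went into PHP, is to append every chunk to a buffer and read the
buffer when the end tag arrives. It then does not matter how many times the
handler is called for one element. The name readDataElements() is
hypothetical, the XML is assumed to be UTF-8, and closures require PHP 5.3 or
later.

```php
<?php
// Hypothetical helper: returns the full text of every <data> element,
// no matter how many chunks the parser delivers for each one.
function readDataElements($xml)
{
    $buffer = '';
    $values = array();
    $parser = xml_parser_create('UTF-8');

    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$buffer) {
            $buffer = ''; // a new element starts: drop earlier text
        },
        function ($parser, $name) use (&$buffer, &$values) {
            if ($name === 'data') {
                $values[] = trim($buffer); // element complete
            }
            $buffer = '';
        }
    );

    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$buffer) {
            $buffer .= $data; // append each chunk as it arrives
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(
            xml_error_string(xml_get_error_code($parser))
        );
    }

    return $values;
}

// Assumption: this source file is saved as UTF-8.
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
     . "<rokam>\n <data>Jak se máš</data>\n"
     . " <data>Zde doma je dobrý ocet</data>\n</rokam>";

print_r(readDataElements($xml));
```

Compared with comparing $lastTag and $currentTag, this version does not hold
back text from an element that arrives in a single chunk, and it also copes
with three or more chunks.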


-- 
Edit bug report at http://bugs.php.net/?id=30887&edit=1
