Edit report at https://bugs.php.net/bug.php?id=55459&edit=1

 ID:                 55459
 Updated by:         bj...@php.net
 Reported by:        scope at planetavent dot de
 Summary:            Unable to differentiate between end of file or read
                     error
-Status:             Open
+Status:             Wont fix
 Type:               Bug
 Package:            XML Reader
 Operating System:   Ubuntu 10.04 LTS
 PHP Version:        5.3.7
 Block user comment: N
 Private report:     N

 New Comment:

If an error occurred, libxml_get_errors() will return the details of the 
error.
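
As a rough illustration of that suggestion, the sketch below reads a document 
to the end and then inspects the libxml error buffer to decide why read() 
returned false. The helper name xr_read_status() and the shape of its return 
array are hypothetical, and treating any buffered entry at LIBXML_ERR_ERROR or 
above as "read error" is an assumption, not documented XMLReader behaviour.

```php
<?php
// Hypothetical helper (not part of XMLReader): reads $xml to the end and
// reports whether the loop stopped at end of input or on a parse error.
function xr_read_status($xml)
{
    // Collect libxml errors in a buffer instead of emitting them as warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $xr = new XMLReader();
    $xr->XML($xml);

    $nodes = 0;
    // read() returns false for both end of input and errors; "@" hides the
    // extra E_WARNING that read() raises itself on a libxml error.
    while (@$xr->read()) {
        $nodes++;
    }

    // Assumption: any buffered entry at LIBXML_ERR_ERROR or above means the
    // loop ended because of an error rather than end of input.
    $messages = array();
    foreach (libxml_get_errors() as $error) {
        if ($error->level >= LIBXML_ERR_ERROR) {
            $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
        }
    }

    $xr->close();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return array(
        'status'   => $messages ? 'error' : 'eof',
        'nodes'    => $nodes,
        'messages' => $messages,
    );
}
```

Calling xr_read_status() on a well-formed string and on one with a mismatched 
tag should give different 'status' values, which is the distinction the report 
asks for. For files, the same loop would presumably work with open() in place 
of XML().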


Previous Comments:
------------------------------------------------------------------------
[2011-08-19 09:56:22] scope at planetavent dot de

Description:
------------
We were forced into using xmlreader the other day due to xml file sizes.

Usually we use DOM to check for well-formedness and schema validity.

It seems that it is currently not possible to differentiate between a read 
error (e.g. document not well-formed) and "end of file" using xmlreader.

Unfortunately, it is not possible to call XMLReader::isValid() after an 
unsuccessful read when using a schema. Of course, if the document is not 
well-formed it can't be valid. But isValid() just calls 
xmlTextReaderIsValid(), which seems to work only if the node pointer was able 
to advance correctly (which is not the case for that kind of error). So this is 
not an option.

From ext/xmlreader/php_xmlreader.c:806
retval = xmlTextReaderRead(intern->ptr);
if (retval == -1) {
    php_error_docref(NULL TSRMLS_CC, E_WARNING, "An Error Occured while reading");
    RETURN_FALSE;
} else {
    RETURN_BOOL(retval);
}

and

From ext/xmlreader/php_xmlreader.c:849
if (retval == -1) {
    php_error_docref(NULL TSRMLS_CC, E_WARNING, "An Error Occured while reading");
    RETURN_FALSE;
} else {
    RETURN_BOOL(retval);
}

According to libxml, the result of xmlTextReaderRead() is
"1 if the node was read successfully, 0 if there is no more nodes to read, or 
-1 in case of error".

Therefore PHP should return true, int(0) or false so that callers can check 
for read errors.

I'm not quite sure if this can be considered a bug. To my mind it is, because 
it prevents me from using xmlreader properly and, as a result, makes it 
very difficult to work on huge xml files.

libxml is able to distinguish between the two conditions, so PHP should be too.

Test script:
---------------
<?php

libxml_use_internal_errors( true );

$file = "input.xml";

$xr = new XMLReader();
$xr->open( $file );

while ( true )
{
    $success = $xr->read();
    
    if ( !$success )
    {
        # end of file or error?
        var_dump( $success );
        echo "no success\n";
        break;
    }
    else
    {
        echo "read success\n";
    }
}


Expected result:
----------------
Ability to check for int(0) = no elements to read and false = error during read.

Actual result:
--------------
[...]
read success
read success
read success

Warning: XMLReader::read(): An Error Occured while reading in X:\xmlreader.php on line 12
bool(false)
no success


------------------------------------------------------------------------


