ID:               46129
 Updated by:       [EMAIL PROTECTED]
 Reported By:      brett at brettbrewer dot com
-Status:           Open
+Status:           Bogus
 Bug Type:         SimpleXML related
 Operating System: Linux xq41.cyberlnc.com 2.6.18-5
 PHP Version:      5.2.6
 New Comment:

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

It's UTF-8, so either convert the data to ISO-8859-1 or fix your HTML.
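
A minimal sketch of those two options, not a definitive fix. The inline
feed, the variable names, and the use of the mbstring extension are
assumptions made for illustration. Because U+2019 has no code point in
ISO-8859-1, the second option emits numeric entities instead of doing a
straight charset conversion.

```php
<?php
// Hypothetical stand-in for the real feed: the title carries the numeric
// character reference &#8217; (U+2019, right single quotation mark).
$rawFeed = '<?xml version="1.0" encoding="UTF-8"?>'
         . '<rss><channel><item><title>It&#8217;s here</title></item></channel></rss>';

$xml = new SimpleXMLElement($rawFeed);

// SimpleXML hands strings back as UTF-8, whatever the page charset is.
$title = (string) $xml->channel->item->title;

// Option 1: keep UTF-8 and declare it, so the browser decodes it correctly.
// header('Content-Type: text/html; charset=UTF-8');
$asUtf8 = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

// Option 2: the page has to stay in a single-byte charset, so write every
// non-ASCII character as a numeric entity (requires mbstring).
$asEntities = mb_encode_numericentity(
    $asUtf8,
    array(0x80, 0x10FFFF, 0, 0x1FFFFF),
    'UTF-8'
);

echo $asEntities, "\n"; // It&#8217;s here
```

Either way the parsed value is left alone; only the output side changes.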




Previous Comments:
------------------------------------------------------------------------

[2008-09-19 19:26:24] brett at brettbrewer dot com

Description:
------------
When parsing an XML feed (WordPress) containing the character codes for
a right single curly quote (&#8217;), the character is converted into
â€™. Unfortunately I'm not able to get complete access to the server to
deactivate Zend Optimizer, ionCube, etc., and I'm pulling the OS info from
phpinfo(). I've included the URL of the actual feed that is causing the
problems. I found a really old similar bug report for PHP 4.3.2, but
nothing for PHP 5. Here's the old bug report URL:

http://bugs.php.net/bug.php?id=24863&edit=2
I also found:
http://bugs.php.net/bug.php?id=26964&edit=2

which suggests a similar problem with htmlentities and
html_entity_decode, but I don't know if it's related. I'm sure my feed is
UTF-8, and if I convert it to ISO-8859-1 before passing it to my SimpleXML
object then SimpleXML complains that it's not in UTF-8 format and
aborts, so I'm pretty sure it's not a UTF-8 encoding issue with the feed.
I've included the feed URL in the code sample below. It assumes it is
inside a class, but you can probably run the code below to reproduce the
symptoms just by removing the "$this->" in two places.

Reproduce code:
---------------
$this->blog_url = "http://75.126.106.225/blog/feed/";
$rawFeed = file_get_contents($this->blog_url);
$xml = new SimpleXMLElement($rawFeed);

// You can see the results of the incorrect parsing of the feed in the
// left sidebar at http://75.126.106.225

Expected result:
----------------
Code should keep the &#8217; entity code intact or possibly convert it
to '

Actual result:
--------------
SimpleXML constructor seems to convert all instances of &#8217; into
â€™

If you use SimpleXML to parse the feed at
http://75.126.106.225/blog/feed/ you should see the problem in the
<title> of the second item in the feed. 
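
For illustration only, a short sketch of how that three-character
sequence can arise without the parsed value being wrong: the UTF-8 bytes
for U+2019 are read back as Windows-1252, which is what browsers commonly
use for pages labelled ISO-8859-1. The variable names are hypothetical
and the mbstring extension is assumed to be available.

```php
<?php
// U+2019 encoded as UTF-8 is the three bytes E2 80 99.
$quote = "\xE2\x80\x99";
echo bin2hex($quote), "\n"; // e28099

// Treat each of those bytes as a separate Windows-1252 character and
// re-encode the result as UTF-8 so it can be displayed.
$misread = mb_convert_encoding($quote, 'UTF-8', 'Windows-1252');
echo bin2hex($misread), "\n"; // c3a2e282ace284a2 (U+00E2, U+20AC, U+2122)
```

The second value is the mojibake seen in the sidebar, which points at
the charset of the page rather than at the parsed string.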


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=46129&edit=1
