tag 500015 wontfix
severity 500015 minor
retitle 500015 Cannot parse feeds containing control characters

On Wed, Sep 24, 2008 at 04:00:37AM -0700, Matt Kraai wrote:
> Howdy,
> 
> The feed at
> 
>  http://jc.ngo.org.uk/~nik/use.perl.journals.rss
> 
> currently contains a SOH character (i.e., the 0x01 character).  When I
> click on it in Liferea, it displays the following error message:
> 
>  XML Parsing Error: reference to invalid character number
>  Location: file:///
>  Line Number 20, Column 45:
> 
>  <pre>Aha. On the line 580 of that we have a &#x1; character. Which leads me 
> to
>  --------------------------------------------^
> 
> The feed has a UTF-8 encoding declaration and the SOH character is a
> valid Unicode character, so I think this error is in error.

Actually, the XML 1.0 specification
(http://www.w3.org/TR/REC-xml/#dt-character) defines a legal character
as:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

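so &#x1; is not a valid character in an XML document, even though
U+0001 is a valid Unicode character. The encoding declaration does not
change that.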
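
For anyone who wants to check a feed before handing it to a parser, the
Char production above can be sketched in a few lines of Python. This is
only an illustration, not what Liferea or libxml2 do internally; the
names is_xml_char and invalid_char_refs are made up for this example.

```python
import re

def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

# Numeric character references: decimal (&#1;) or hexadecimal (&#x1;).
_CHAR_REF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")

def invalid_char_refs(text: str) -> list:
    """Return the numeric character references in text that name a
    code point outside the Char production (hypothetical helper)."""
    bad = []
    for m in _CHAR_REF.finditer(text):
        # group(1) is set for hexadecimal references, group(2) for decimal
        cp = int(m.group(1), 16) if m.group(1) else int(m.group(2))
        if not is_xml_char(cp):
            bad.append(m.group(0))
    return bad
```

Running invalid_char_refs over the raw text of the feed would list
references such as the &#x1; above, which the feed's author would need
to remove or escape differently for the document to be well-formed.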

I'll leave this open to serve as a FAQ, but will set it to minor and wontfix.


