Re: XML parsing error

Marc Worrell - Mediamatic Thu, 26 Jul 2007 08:09:04 -0700

It looks to me as if your document is not valid UTF-8 and is missing one byte at the end.

The '<' of '</str>' then gets absorbed into the preceding, incomplete character.

Did you create the text snippet yourself? Maybe check whether the string functions you are using are multi-byte aware.
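
To illustrate the suspected failure mode, here is a minimal Python sketch. It assumes the snippet was cut by byte count rather than by character, and that some layer then decoded the bytes leniently. The function naive_utf8_decode and the sample text are hypothetical, written only for this illustration; they are not taken from Solr or from the original indexing code.

```python
def naive_utf8_decode(data: bytes) -> str:
    """Hypothetical lenient decoder: trusts each lead byte's declared
    length and never checks that the following bytes are continuations."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            n, cp = 1, lead
        elif lead >> 5 == 0b110:
            n, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:
            n, cp = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:
            n, cp = 4, lead & 0x07
        else:
            n, cp = 1, 0xFFFD  # stray continuation byte
        for b in data[i + 1 : i + n]:
            cp = (cp << 6) | (b & 0x3F)  # no check that b looks like 10xxxxxx
        out.append(chr(cp if cp <= 0x10FFFF else 0xFFFD))
        i += n
    return "".join(out)


text = "日本語"                      # three characters, nine bytes in UTF-8
encoded = text.encode("utf-8")

# Byte-oriented truncation drops the last byte of the final character.
broken_doc = b'<str name="content">' + encoded[:8] + b"</str>"

# Character-oriented truncation keeps every remaining character whole.
safe_doc = b'<str name="content">' + text[:2].encode("utf-8") + b"</str>"

broken_text = naive_utf8_decode(broken_doc)  # '<' is consumed as a third byte
safe_text = naive_utf8_decode(safe_doc)      # closing tag survives

try:
    broken_doc.decode("utf-8")               # a strict decoder refuses instead
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True
```

In this sketch the incomplete three-byte character takes the '<' as its missing third byte, so the decoded text ends in "/str>" with no opening bracket, while a strict decoder rejects the same bytes outright.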


Greetings, Marc


On 26-jul-2007, at 16:55, Brian Whitman wrote:

I ended up with this doc in solr:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><intname="QTime">1</int><lst name="params"><str name="start">7</str><str name="fl">content</str><str name="q">"Pez"~10000</str><strname="rows">1</str></lst></lst><result name="response"numFound="5381" start="7"><doc><str name="content">Akatsuki -PE'Z ҳ | ̳ | պ | ŷ | >>> Akatsuki - PE'Z ר | и&nbsp| Ů &nbsp| ֶ &nbsp| պ &nbsp| ¸ &nbsp|tӺ &nbsp| Ϸ &nbsp| Ӱ &nbsp| ϼ &nbsp| ŷ>&nbsp| ϸ &nbsp| ѵ ŷ> > Various Artists[2005] >Now Jazz 3 - That's What I Call Jazz > Akatsuki - PE'Z Akatsuki- PE'Z ר Now Jazz 3 - That's What I Call Jazz ݳ֣ VariousArtists[2005] Akatsuki - PE'Z ȱ ǻᾡ첹ȱĸʣ ҵ˸øӸø>>> һ񈐼/str></doc></result>
</response>


Note the missing < in </str>

Solrj throws this (on a larger query that includes this doc):
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[3,20624] Message: The element type "str" must be terminated by the matching end-tag "</str>".
Firefox can't render it either; it throws an error too.

So any query that returns this doc will cause an error.
Obviously there's some weird stuff in this doc, but is it a Solr issue that the < got destroyed?
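
One way to narrow down where the damage happens is to run a strict UTF-8 check on the raw bytes before they are posted for indexing. The sketch below is an assumption about how such a check could look; first_invalid_utf8_offset is a hypothetical helper name, and the sample bytes are made up rather than taken from the document above.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence,
    or None if the whole input decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start  # index of the first byte that could not be decoded
    return None


good = "PE'Z 日本語".encode("utf-8")
bad = good[:-1] + b"</str>"   # last character lost one byte, then markup follows

good_offset = first_invalid_utf8_offset(good)
bad_offset = first_invalid_utf8_offset(bad)
```

If the check already fails on the bytes sent to the index, the content was broken upstream; if it passes there but the response is malformed, the problem lies somewhere after that point.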
