On 7/26/07, Brian Whitman <[EMAIL PROTECTED]> wrote:
> I ended up with this doc in solr:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">0</int><int
> name="QTime">1</int><lst name="params"><str name="start">7</str><str
> name="fl">content</str><str name="q">"Pez"~10000</str><str
> name="rows">1</str></lst></lst><result name="response"
> numFound="5381" start="7"><doc><str name="content">Akatsuki - PE'Z
> ҳ | ̳ | պ | ŷ | >>> Akatsuki - PE'Z ר | и &nbsp|
> Ů &nbsp| ֶ &nbsp| պ &nbsp| ¸ &nbsp| tӺ
> &nbsp| Ϸ &nbsp| Ӱ &nbsp| ϼ &nbsp| ŷ>
> &nbsp| ϸ &nbsp| ѵ ŷ> > Various Artists[2005] > Now
> Jazz 3 - That's What I Call Jazz > Akatsuki - PE'Z Akatsuki -
> PE'Z ר Now Jazz 3 - That's What I Call Jazz ݳ֣ Various
> Artists[2005] Akatsuki - PE'Z ȱ ǻᾡ첹ȱĸʣ ҵ˸ø
> Ӹø>>> һ/str></doc></result>
> </response>
>
>
> Note the missing < in </str>
>
> Solrj throws this (on a larger query that includes this doc):
> Caused by: javax.xml.stream.XMLStreamException: ParseError at
> [row,col]:[3,20624]
> Message: The element type "str" must be terminated by the matching
> end-tag "</str>".
>
> And firefox can't render it either, throws an error.
>
> So any query that returns this doc will cause an error.
>
> Obviously there's some weird stuff in this doc, but is it a solr
> issue that the < got destroyed?

If the '<' truly got destroyed, it's a server (Solr or Jetty) bug.
One possibility is that the '<' does exist, but due to a charset
mismatch, it's being slurped into a multi-byte char.

-Yonik
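
To illustrate the charset-mismatch possibility, here is a minimal Python sketch. It is not Solr or Jetty code. The function `naive_utf8_decode` is a hypothetical lenient decoder, written only for this example, that trusts a UTF-8 lead byte's declared length and never validates the continuation bytes. The sample bytes are also an assumption: a short GBK-encoded tail followed by `</str>`, chosen so that a three-byte lead byte sits two bytes before the '<'.

```python
def naive_utf8_decode(data: bytes) -> str:
    """Hypothetical lenient decoder: trusts the lead byte's length and
    never checks that the following bytes are valid continuation bytes."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                # ASCII
            size, cp = 1, b
        elif b >> 5 == 0b110:       # lead byte announcing a 2-byte sequence
            size, cp = 2, b & 0x1F
        elif b >> 4 == 0b1110:      # lead byte announcing a 3-byte sequence
            size, cp = 3, b & 0x0F
        else:                       # anything else: emit U+FFFD
            size, cp = 1, 0xFFFD
        # Blindly fold in the next bytes, whatever they are.
        for nxt in data[i + 1:i + size]:
            cp = (cp << 6) | (nxt & 0x3F)
        out.append(chr(cp))
        i += size
    return "".join(out)


# Assumed sample: two GBK-encoded characters followed by the closing tag.
raw = b"\xd2\xbb\xe5\xa5</str>"

# Decoded with the right charset, the tag is intact.
as_gbk = raw.decode("gbk")

# Decoded as UTF-8 by the naive decoder, 0xE5 claims the next two bytes,
# and the second of those is the '<' (0x3C), so the tag loses its '<'.
broken = naive_utf8_decode(raw)

# A validating decoder rejects 0x3C as a continuation byte and keeps the '<'.
validated = raw.decode("utf-8", errors="replace")

print(repr(as_gbk))
print(repr(broken))
print(repr(validated))
```

If something like this is what happened, the '<' was present in the byte stream and only disappeared at decode time, so comparing the raw response bytes with the decoded text would distinguish it from a real server-side bug.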