Re: solr admin result page error

2011-02-25 Thread Bernd Fehling
Hi Markus, the result of my investigation is that Lucene can currently only handle UTF-8 code points within the BMP [Basic Multilingual Plane] (plane 0) <= 0x. Any code point above the BMP might lead to unpredictable results, which is bad. If you get invalid UTF-8 from the index and use wt=xml it gives the error pa
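To see whether a document contains anything above the BMP before indexing it, a quick scan along these lines might help. This is only a sketch: the function name and the sample string are made up for illustration, and the U+FFFF upper bound is the BMP limit as defined by Unicode, not a value taken from the message above.

```python
# Highest code point in Unicode plane 0 (the BMP), per the Unicode standard.
BMP_MAX = 0xFFFF


def find_non_bmp(text):
    """Return (index, 'U+XXXX') for every code point above the BMP.

    Hypothetical helper for illustration; not part of Solr or Lucene.
    """
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ord(ch) > BMP_MAX
    ]


if __name__ == "__main__":
    # Made-up sample: one supplementary-plane character inside ASCII text.
    sample = "url \U0001F600 end"
    for index, code_point in find_non_bmp(sample):
        print(f"non-BMP code point {code_point} at index {index}")
```

An empty list means every character is inside the BMP; it says nothing about whether the text is otherwise acceptable to the XML writer.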

Re: solr admin result page error

2011-02-11 Thread Markus Jelsma
No, I haven't located the issue. It might be Solr, but it could also be Xerces having trouble with it. You can possibly work around the problem by using the JSONResponseWriter. On Friday 11 February 2011 15:45:23 Bernd Fehling wrote: > Hi Markus, > > yes it looks like the same issue. There is als
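The JSON workaround amounts to asking for wt=json instead of wt=xml on the same query. A minimal sketch of building such a request URL, assuming a local Solr at http://localhost:8983/solr and a made-up query (both are placeholders, not details from this thread):

```python
from urllib.parse import urlencode


def select_url(base_url, query, writer="json"):
    """Build a Solr /select URL that asks for a specific response writer.

    base_url and query are caller-supplied; the values used below are
    assumptions for illustration only.
    """
    params = {"q": query, "wt": writer}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Same query, once with the JSON writer and once with the XML writer.
    print(select_url("http://localhost:8983/solr", "id:12345"))
    print(select_url("http://localhost:8983/solr", "id:12345", writer="xml"))
```

Fetching both URLs for the suspect document shows whether the failure is specific to the XML writer.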

Re: solr admin result page error

2011-02-11 Thread Bernd Fehling
Hi Markus, yes, it looks like the same issue. There is also a \u utf8-code in your dump. So far I have followed it into XMLResponseWriter. A few steps earlier, the result in a buffer looks good and the utf8-code is correct. This freaky problem is really hard to debug. Have you looked deeper into this

Re: solr admin result page error

2011-02-11 Thread Markus Jelsma
It looks like you hit the same issue as I did a while ago: http://www.mail-archive.com/solr-user@lucene.apache.org/msg46510.html On Friday 11 February 2011 08:59:27 Bernd Fehling wrote: > Dear list, > > after loading some documents via DIH which also include urls > I get this yellow XML error pa

Re: solr admin result page error

2011-02-11 Thread Bernd Fehling
Results so far: I was able to locate and isolate the document causing trouble. I've checked the document with xmllint again. It is valid, well-formed utf8. I've loaded the single document and get the XML error when displaying the search result. This is through solr admin search and also JSON interface, p
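A text can decode cleanly as UTF-8 and still contain code points that XML 1.0 does not allow as characters, so a second check on the field values may be worth running alongside xmllint. The sketch below is an assumption about what such a check could look like, not a diagnosis of this particular document; the helper names are hypothetical, and the allowed ranges follow the Char production of the XML 1.0 specification.

```python
def is_xml10_char(cp):
    """True if code point cp is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def check_document(raw):
    """Decode raw bytes as UTF-8 and list code points XML 1.0 forbids.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    Returns a list of (index, 'U+XXXX') tuples, empty if all is fine.
    """
    text = raw.decode("utf-8")
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if not is_xml10_char(ord(ch))
    ]


if __name__ == "__main__":
    # Made-up sample bytes: valid UTF-8 that contains a vertical tab.
    sample = "title\x0bwith control char".encode("utf-8")
    print(check_document(sample))
```

If this reports nothing for the stored field values, the problem is more likely in how the response is written or parsed than in the document itself.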