The text file at: http://brockwine.com/solr.txt
represents one of these truncated responses (this one in XML). It starts
out great, then look at the bottom: boom, game over. :)

I found this document by first running our bigger search, which breaks,
and then zeroing in on a specific broken document using the rows/start
parameters. But there is an unknown number of these "broken" documents
- a lot, I presume.

-Rupert

On Tue, Aug 25, 2009 at 9:40 AM, Avlesh Singh<avl...@gmail.com> wrote:
> Can you copy-paste the source data indexed in this field which causes the
> error?
>
> Cheers
> Avlesh
>
> On Tue, Aug 25, 2009 at 10:01 PM, Rupert Fiasco <rufia...@gmail.com> wrote:
>
>> Using wt=json also yields an invalid document. So after more
>> investigation it appears that I can always "break" the response by
>> pulling back a specific field via the "fl" parameter. If I leave off
>> that field then the response is valid; if I include it then Solr yields
>> an invalid document - a truncated document. This happens in any response
>> format (xml, json, ruby).
>>
>> I am using the SolrJ client to add documents to my index. My field
>> is a normal "text" field type and the text itself is the first 1000
>> characters of an article.
>>
>> > It can very well be an issue with the data itself. For example, if the data
>> > contains un-escaped characters which invalidate the response
>>
>> When I look at the document using wt=xml, all XML entities are
>> escaped. When I look at it under wt=ruby, all single quotes are
>> escaped, same for json, so it appears that all escaping is taking
>> place. The core problem seems to be that the document is just
>> truncated - it just plain hits end of file. Jetty's log says it's sending
>> back an HTTP 200, so all is well.
>>
>> Any ideas on how I can dig deeper?
>>
>> Thanks
>> -Rupert
>>
>>
>> On Mon, Aug 24, 2009 at 4:31 PM, Uri Boness<ubon...@gmail.com> wrote:
>> > It can very well be an issue with the data itself. For example, if the data
>> > contains un-escaped characters which invalidate the response. I don't know
>> > much about ruby, but what do you get with wt=json?
>> >
>> > Rupert Fiasco wrote:
>> >>
>> >> I am seeing our responses getting truncated if and only if I search on
>> >> our main text field.
>> >>
>> >> E.g. I just do something basic like
>> >>
>> >> title_t:arthritis
>> >>
>> >> Then I get a valid document back. But if I add in our larger text field:
>> >>
>> >> title_t:arthritis OR text_t:arthritis
>> >>
>> >> then the resultant document is NOT valid XML (if using wt=xml) or Ruby
>> >> (using wt=ruby). If I run these through curl on the command line it's
>> >> truncated, and if I run the search through the web-based admin panel
>> >> then I get an XML parse error.
>> >>
>> >> This appears to have just started recently and the only thing we have
>> >> done is change our indexer from a PHP one to a Java one, but
>> >> functionally they are identical.
>> >>
>> >> Any thoughts? Thanks in advance.
>> >>
>> >> - Rupert
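
P.S. For anyone who wants to script the rows/start narrowing mentioned at the top of this message, here is a rough Python sketch. It requests one document at a time (rows=1) and records every offset whose XML response fails to parse. The function names, the base URL and the total count in the usage comment are assumptions for illustration only, not part of our actual setup; only the q, start, rows, fl and wt parameters come from the thread.

```python
import http.client
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def build_url(base, query, start, fl=None):
    """Build a select URL asking for exactly one document at offset `start`."""
    params = {"q": query, "start": start, "rows": 1, "wt": "xml"}
    if fl:
        params["fl"] = fl  # restrict to the suspect field, if given
    return base + "?" + urllib.parse.urlencode(params)


def is_well_formed(body):
    """Return True if `body` (bytes) parses as XML, False if truncated or invalid."""
    try:
        ET.fromstring(body)
        return True
    except ET.ParseError:
        return False


def find_broken_offsets(fetch, total):
    """Call fetch(start) for offsets 0..total-1; return those that do not parse."""
    return [start for start in range(total) if not is_well_formed(fetch(start))]


def http_fetch(base, query, fl=None):
    """Return a fetch(start) function that reads the response body over HTTP."""
    def fetch(start):
        try:
            with urllib.request.urlopen(build_url(base, query, start, fl)) as resp:
                return resp.read()
        except http.client.IncompleteRead as err:
            # The server closed the connection early; keep what did arrive.
            return err.partial
    return fetch


# Hypothetical usage (URL and total are assumptions, adjust to your install):
# fetch = http_fetch("http://localhost:8983/solr/select",
#                    "title_t:arthritis OR text_t:arthritis", fl="text_t")
# print(find_broken_offsets(fetch, 100))
```

Passing the fetch function in separately keeps the parsing check testable without a running Solr, and the same loop could be pointed at wt=json with a JSON parser swapped in for the XML check.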