Hi,

This is very strange behavior, and the fact that it is caused by one specific field, again, leads me to believe it's still a data issue. Did you try using SolrJ to query the data as well? If the same thing happens when using the binary protocol, then it's probably not a data issue. On the other hand, if it works fine, then at least you can inspect the data to see where things go wrong. Sorry for insisting on that, but I cannot think of anything else that could cause this problem.

If anyone else has a better idea, I'm actually very curious to hear it.

Uri

Rupert Fiasco wrote:
The text file at:

http://brockwine.com/solr.txt

Represents one of these truncated responses (this one in XML). It
starts out great, then look at the bottom, boom, game over. :)

I found this document by first running our bigger search, which
breaks, and then zeroing in on a specific broken document using the
rows/start parameters. But there is an unknown number of these
"broken" documents - a lot, I presume.
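A small script could automate that zeroing-in. This is just a sketch - the base URL, query, and helper names below are illustrative, not taken from our actual setup - that fetches one document at a time with rows=1 and flags every offset whose response fails to parse:

```python
import urllib.request
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """True if the string parses as one complete XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def find_broken_docs(base_url, query, total):
    """Yield each 'start' offset whose single-row XML response is truncated."""
    for start in range(total):
        url = "%s/select?q=%s&rows=1&start=%d&wt=xml" % (base_url, query, start)
        body = urllib.request.urlopen(url).read().decode("utf-8")
        if not is_well_formed(body):
            yield start
```

Walking the whole result set this way would at least tell us how many documents are affected, not just the one behind solr.txt.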

-Rupert

On Tue, Aug 25, 2009 at 9:40 AM, Avlesh Singh<avl...@gmail.com> wrote:
Can you copy-paste the source data indexed in this field which causes the
error?

Cheers
Avlesh

On Tue, Aug 25, 2009 at 10:01 PM, Rupert Fiasco <rufia...@gmail.com> wrote:

Using wt=json also yields an invalid document. So after more
investigation it appears that I can always "break" the response by
pulling back a specific field via the "fl" parameter. If I leave that
field off, the response is valid; if I include it, Solr yields an
invalid document - a truncated document. This happens in every response
format (xml, json, ruby).

I am using the SolrJ client to add documents to my index. The field
is a normal "text" field type and the text itself is the first 1000
characters of an article.

> It can very well be an issue with the data itself. For example, if the
> data contains un-escaped characters which invalidate the response.
When I look at the document using wt=xml, all XML entities are
escaped. When I look at it under wt=ruby, all single quotes are
escaped, and the same goes for json, so it appears that all escaping is
taking place. The core problem seems to be that the document is just
truncated - it just plain hits end-of-file. Jetty's log says it's
sending back an HTTP 200, so as far as it is concerned all is well.

Any ideas on how I can dig deeper?
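One more thing I can think of checking, sketched below (the helper names are mine, nothing standard): compare the Content-Length the server declares against the bytes actually received. If they agree, the body was already truncated when Solr serialized it, which points back at the data; if they disagree, something downstream is cutting the connection mid-response.

```python
import urllib.request

def lengths_agree(declared, received):
    """True when the declared Content-Length matches the bytes received;
    chunked responses carry no Content-Length, so None always agrees."""
    return declared is None or declared == received

def check_response(url):
    """Fetch one response; report declared size, received size, agreement."""
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
        header = resp.headers.get("Content-Length")
        declared = int(header) if header is not None else None
        return declared, len(body), lengths_agree(declared, len(body))
```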

Thanks
-Rupert


On Mon, Aug 24, 2009 at 4:31 PM, Uri Boness<ubon...@gmail.com> wrote:
It can very well be an issue with the data itself. For example, if the
data contains un-escaped characters which invalidate the response. I
don't know much about ruby, but what do you get with wt=json?
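As a quick way to test the un-escaped-characters theory, here is a minimal sketch (plain Python, nothing Solr-specific; the helper name is made up): a JSON response with a raw control character inside a string value, or with a truncated tail, will fail to decode even though the quoting looks properly escaped:

```python
import json

def json_decodes(text):
    """True if the text is one complete, valid JSON document."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False
```

If the wt=json output fails this check, the offending byte is usually easy to spot near the point where the parser gives up.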

Rupert Fiasco wrote:
I am seeing our responses getting truncated if and only if I search on
our main text field.

E.g. I just do something basic like

title_t:arthritis

Then I get a valid document back. But if I add in our larger text field:

title_t:arthritis OR text_t:arthritis

then the resultant document is NOT valid XML (using wt=xml) or Ruby
(using wt=ruby). If I run these through curl on the command line the
output is truncated, and if I run the search through the web-based
admin panel I get an XML parse error.

This appears to have started only recently, and the only thing we
have changed is our indexer, from a PHP one to a Java one, but
functionally they are identical.

Any thoughts? Thanks in advance.

- Rupert


