On 7/25/2013 11:39 AM, Tom Burton-West wrote:
Hello,
I am running Solr 4.2.1 on 3 shards and have about 365 million documents in
the index in total.
I sent a query asking for 1 million rows at a time, but I keep getting an
error claiming that there is an invalid version or that the data is not in
javabin format (see below).
If I lower the number of rows requested to 100,000, I have no problems.
Does Solr have a limit on the number of rows that can be requested, or is
this a bug?
That particular javabin error (expected 2, but 60) usually means that
the response it got was something other than javabin, typically HTML or XML.
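For what it's worth, a javabin response begins with a single version byte,
and current Solr writes 2 there, while 60 is the ASCII code for '<', the
first character of an XML or HTML error page. A minimal sketch of that
first-byte check (not Solr's actual code, just the idea):

    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public class JavabinFirstByte {
        public static void main(String[] args) throws IOException {
            // A javabin stream starts with a version byte; current Solr writes 2.
            // An XML/HTML error page starts with '<', which is byte 60.
            InputStream response = new ByteArrayInputStream("<html>...</html>".getBytes());
            int version = response.read();
            if (version != 2) {
                // Mirrors the log message: Invalid version (expected 2, but 60)
                System.out.println("Invalid version (expected 2, but " + version + ")");
            }
        }
    }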
I was going to say that you should hopefully get a more meaningful error
message from the server log, but it appears that what you included *IS*
the server log, so I'm really confused. The error message you're
getting is typically something you see on the *client* side.
After some testing on my server, I suspect that what's happening here is
that the initial shard query (the one with fl=uniqueKeyField,score) is
working, but that when Solr makes the HUGE follow-up requests for the
actual documents it is interested in, the list is too big to fit in the
server-side POST buffer, which defaults to 2MB. Those requests get so big
because they must include an "ids" parameter, a comma-separated list of
values from your uniqueKey field. In my case, each of those values can be
up to 32 characters, so the id list could reach roughly 33MB for a million
of them. Most of my ids are significantly shorter than that, so a 32MB
buffer would be big enough.
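The arithmetic behind that estimate, as a quick sketch (the 32-character
key length is specific to my index, so substitute your own):

    public class IdsParamSize {
        public static void main(String[] args) {
            long numIds = 1_000_000L;  // rows requested
            int idLength = 32;         // worst-case uniqueKey length in my index

            // Each id contributes its characters plus a separating comma.
            long bytes = numIds * idLength + (numIds - 1);

            System.out.printf("ids parameter: ~%.1f MB (default buffer: 2MB)%n",
                    bytes / 1_000_000.0);
        }
    }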
Either multipartUploadLimitInKB doesn't work properly, or there may be
some hard limits built into the servlet container, because I set
multipartUploadLimitInKB in the requestDispatcher config to 32768 and it
still didn't work. I wonder whether there is a client-side POST buffer
limit in addition to the servlet container limit, one that comes into play
because the Solr server is acting as a client for the distributed requests.
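For reference, the attribute I changed lives on the <requestParsers>
element inside <requestDispatcher> in solrconfig.xml; a sketch showing
only the attribute I touched, with everything else left at its existing
values:

    <requestDispatcher>
      <requestParsers multipartUploadLimitInKB="32768" />
    </requestDispatcher>

Worth noting: the same element also accepts a formdataUploadLimitInKB
attribute covering URL-encoded POST bodies, and since the internal shard
requests may well be sent as form data rather than multipart, that could
be the limit that actually matters here.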
Thanks,
Shawn