On 7/25/2013 3:09 PM, Tom Burton-West wrote:
> Thanks Shawn,
> I was confused by the error message: "Invalid version (expected 2, but 60)
> or the data in not in 'javabin' format"
> Your explanation makes sense. I didn't think about what the shards have to
> send back to the head shard.
> Now that I look in my logs, I can see the posts that the shards are
> sending to the head shard and actually get a good measure of how many bytes
> are being sent around.
> I'll poke around and look at multipartUploadLimitInKB, and also see if
> there is some servlet container limit config I might need to mess with.
I think I figured it out, after a peek at the source code. I upgraded
to Solr 4.4 first; my 100,000-row query still didn't work. By setting
formdataUploadLimitInKB (in addition to multipartUploadLimitInKB; I'm not
sure whether both are required), I was able to get a 100,000-row query to
work.
A query for one million rows did finally return a response to my browser,
but it took a REALLY long time (82 million docs across several shards,
only 16GB RAM on the dev server), and it crashed Firefox due to the size
of the response. It also seemed to error out on some of the shard
responses. My handler has shards.tolerant=true, so that shouldn't have
killed the whole query ... but because the response crashed Firefox, I
couldn't tell.
I repeated the query using curl so I could save the response. It's been
running for several minutes without any server-side errors, but I still
don't have any results.
Your servers are much more robust than my little dev server, so this
might work for you, as long as you aren't using the start parameter in
addition to the rows parameter. You might need to sort ascending on your
unique key field and use a range query ([* TO *] for the first request),
find the highest value in the response, and then send a targeted range
query ({max_from_last_run TO *], where the curly brace excludes the
previous maximum) asking for the next million records.
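The paging idea above can be sketched as a loop. This is a hypothetical illustration that uses an in-memory list as a stand-in for the actual Solr HTTP request; fetch_batch, fetch_all, and the id values are made up for the sketch, not Solr APIs:

```python
def fetch_batch(all_ids, lower_exclusive, batch_size):
    """Stand-in for a Solr request like
    q=id:{LOWER TO *]&sort=id+asc&rows=batch_size.
    Returns ids strictly greater than lower_exclusive, ascending."""
    if lower_exclusive is None:
        # First request: the open range id:[* TO *]
        matching = sorted(all_ids)
    else:
        # Curly brace on the left of the Solr range: exclusive lower bound
        matching = sorted(i for i in all_ids if i > lower_exclusive)
    return matching[:batch_size]

def fetch_all(all_ids, batch_size):
    """Walk the whole result set batch by batch, carrying the highest
    id from the previous response as the exclusive lower bound."""
    results = []
    max_from_last_run = None
    while True:
        batch = fetch_batch(all_ids, max_from_last_run, batch_size)
        if not batch:
            break
        results.extend(batch)
        max_from_last_run = batch[-1]  # highest value in this response
    return results
```

Against a real Solr install, fetch_batch would be an HTTP request; the key point is that each follow-up request uses a {max TO *] range on the unique key instead of a large start offset, which is what makes deep result sets affordable.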
Thanks,
Shawn