On 7/25/2013 4:45 PM, Tom Burton-West wrote:
Thanks for your help. I found a workaround for this use case, which is to
avoid using a shards query and just ask each shard for a dump of the
unique ids, i.e., run a *:* query and ask for 1 million rows at a time.
This should be a non-scoring query, so I would think it doesn't have to
do any ranking or sorting. What I am now seeing is that QTimes have gone
up from about 5 seconds per request to nearly a minute as the start
parameter gets higher. I don't know whether this is actually because of the
start parameter or whether something is happening with memory use and/or
caching that is just causing things to take longer. I'm at around 35 million
out of 119 million documents for this shard, and queries have gone from
taking 5 seconds to taking almost a minute.
INFO: [core] webapp=/dev-1 path=/select
params={fl=vol_id&indent=on&start=36000000&q=*:*&rows=1000000}
hits=119220943 status=0 QTime=52952
Sounds like your servers are handling deep paging far better than I
would have guessed. I've seen people talk about exponential query time
growth from deep paging after only a few pages. Your times are going
up, but the increase is *relatively* slow, and you've made it 36 pages in.
Getting the information as you're doing it now will be slow, but
probably reliable. Moving to non-distributed requests against the
individual shards was a good idea.
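In case it's useful, here's roughly what I picture that per-shard loop
looking like in SolrJ 4.x -- a quick, untested sketch, not something I'd
call the "right" way. The host/port and core URL are placeholders, the
vol_id field and the million-row page size come straight from your log
line, and distrib=false is just my own addition to keep the request from
fanning back out if the core happens to know about other shards:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

public class DumpShardIds {
  public static void main(String[] args) throws SolrServerException {
    // Point directly at the individual shard core (placeholder URL),
    // not at a core that has a shards parameter configured.
    HttpSolrServer shard =
        new HttpSolrServer("http://localhost:8983/dev-1/core");

    final int pageSize = 1000000;          // rows per request, as in the log line
    long numFound = Long.MAX_VALUE;        // replaced by the real count below

    for (long start = 0; start < numFound; start += pageSize) {
      SolrQuery q = new SolrQuery("*:*");  // match everything, no scoring needed
      q.setFields("vol_id");               // only fetch the unique id field
      q.setStart((int) start);
      q.setRows(pageSize);
      q.set("distrib", "false");           // my assumption: force a local-only query

      QueryResponse rsp = shard.query(q);
      SolrDocumentList page = rsp.getResults();
      numFound = page.getNumFound();       // 119220943 in your case

      for (SolrDocument doc : page) {
        System.out.println(doc.getFieldValue("vol_id"));
      }
    }
  }
}

Whether you script it this way or just hit /select with curl, the shape
is the same: keep rows fixed and walk the start parameter until you pass
numFound.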
From my own testing: by bumping the max heap on my dev server from 7GB
to 9GB, I was able to get a million-row result (distributed) in only
four minutes, whereas it had reached 45 minutes before with no end in
sight. It was having huge GC pauses from extremely frequent full GCs.
That problem persisted after the heap increase, but it wasn't as bad,
and I was also dealing with the fact that the OS disk cache on my dev
server is way too small.
Thanks,
Shawn