Take a look here: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Memory consumption can be a bit tricky to interpret with MMapDirectory.

But you say "I see the CPU working very hard", which implies that your issue is just scoring 90M documents. A way to test: try q=*:*&fq=field:book. My bet is that it will be much faster, in which case scoring is your choke point and you'll need to spread that load across more servers, i.e. shard.

When running the above, make sure of a couple of things:
1> you haven't run the fq query before (or you have filterCache turned completely off).
2> you _have_ run a query or two that warms up your low-level caches. Doesn't matter what, just as long as it doesn't have an fq clause.

Best
Erick

On Sat, Mar 23, 2013 at 3:10 AM, David Parks <davidpark...@yahoo.com> wrote:

> I see the CPU working very hard, and at the same time I see 2 MB/sec disk access for that 15 seconds. I am not running it this instant, but it seems to me that there were more CPU cycles available, so unless it's an issue of not being able to multithread it any further I'd say it's more IO related.
>
> I'm going to set up solr cloud and shard across the 2 servers I have available for now. It's not an optimal setup we have while we're in a private beta period, but maybe it'll improve things (I've got 2 servers with 2x 4TB disks in raid-0 shared with the webservers).
>
> I'll work towards some improved IO performance and maybe more shards and see how things go. I'll also be able to up the RAM in just a couple of weeks.
>
> Are there any settings I should think of in terms of improving cache performance when I can give it say 10GB of RAM?
>
> Thanks, this has been tremendously helpful.
>
> David
>
>
> -----Original Message-----
> From: Tom Burton-West [mailto:tburt...@umich.edu]
> Sent: Saturday, March 23, 2013 1:38 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Slow queries for common terms
>
> Hi David and Jan,
>
> I wrote the blog post, and David, you are right, the problem we had was with phrase queries because our positions lists are so huge. Boolean queries don't need to read the positions lists. I think you need to determine whether you are CPU bound or I/O bound. It is possible that you are I/O bound and reading the term frequency postings for 90 million docs is taking a long time. In that case, more memory in the machine (but not dedicated to Solr) might help because Solr relies on OS disk caching for caching the postings lists. You would still need to do some cache warming with your most common terms.
>
> On the other hand as Jan pointed out, you may be CPU bound because Solr doesn't have early termination and has to rank all 90 million docs in order to show the top 10 or 25.
>
> Did you try the OR search to see if your CPU is at 100%?
>
> Tom
>
> On Fri, Mar 22, 2013 at 10:14 AM, Jan Høydahl <jan....@cominvent.com> wrote:
>
> > Hi
> >
> > There might not be a final cure with more RAM if you are CPU bound. Scoring 90M docs is some work. Can you check what's going on during those 15 seconds? Is your CPU at 100%? Try an (foo OR bar OR baz) search which generates >100mill hits and see if that is slow too, even if you don't use frequent words.
> >
> > I'm sure you can find other frequent terms in your corpus which display similar behaviour, words which are even more frequent than "book". Are you using "AND" as default operator? You will benefit from limiting the number of results as much as possible.
> >
> > The real solution is to shard across N number of servers, until you reach the desired performance for the desired indexing/querying load.
> >
> > --
> > Jan Høydahl, search solution architect
> > Cominvent AS - www.cominvent.com
> > Solr Training - www.solrtraining.com
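
For anyone wanting to try the comparison suggested at the top of this thread (a scored query versus q=*:*&fq=field:book), here is a rough sketch of how the two requests could be timed. The host, port, core name, the rows value and the exact form of the scored query (q=field:book) are assumptions made for illustration, not details given in the thread. The helper names are hypothetical as well.

```python
import time
import urllib.parse
import urllib.request

# Hypothetical endpoint: host, port and core name are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_url(base, params):
    """Return base with the given parameters URL-encoded as a query string."""
    return base + "?" + urllib.parse.urlencode(params)


# Scored query (assumed form): every matching document has to be ranked.
SCORED_URL = build_url(SOLR_SELECT_URL, {"q": "field:book", "rows": 10})

# Filter-only query from the thread: match-all q, with the term moved to fq.
FILTERED_URL = build_url(
    SOLR_SELECT_URL, {"q": "*:*", "fq": "field:book", "rows": 10}
)


def timed_fetch(url):
    """Fetch url over HTTP and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()
    return time.perf_counter() - start


def compare(fetch=timed_fetch):
    """Time the scored query first, then the filter-only query.

    The scored request has no fq clause, so it also serves as the warm-up
    query. The filtered timing is only meaningful the first time this fq
    is run, unless filterCache is turned off.
    """
    scored = fetch(SCORED_URL)
    filtered = fetch(FILTERED_URL)
    return {"scored": scored, "filtered": filtered}
```

Calling compare() against a live instance returns both timings in seconds. If the filtered request comes back much faster than the scored one, that points at scoring rather than I/O as the bottleneck.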