Hi Sandeep,

You are quite likely short on memory (OS disk cache in particular) with this current set-up:
http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache
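
As a rough back-of-envelope check, here is a small sketch using only the numbers from your mail (100 million documents, about 10KB of text each, 16GB RAM with 4GB given to Java). The index-to-text ratio below is purely an assumption of mine, not something measured; the real on-disk index size depends on stored fields, term vectors and so on, so please substitute your actual figure.

```python
# Back-of-envelope sizing with the figures quoted in this thread.
# ASSUMPTION: index_to_text_ratio is a hypothetical placeholder; measure the
# real on-disk index size instead of trusting this number.

GIB = 1024 ** 3


def cache_coverage(num_docs, avg_doc_kb, index_to_text_ratio, ram_gb, heap_gb):
    """Return (estimated index size in GiB, fraction the OS cache could hold)."""
    raw_text_bytes = num_docs * avg_doc_kb * 1024
    index_gib = raw_text_bytes * index_to_text_ratio / GIB
    # Memory left for the OS disk cache is at most total RAM minus the Java heap.
    cache_gib = ram_gb - heap_gb
    return index_gib, min(1.0, cache_gib / index_gib)


index_gib, covered = cache_coverage(
    num_docs=100_000_000,
    avg_doc_kb=10,
    index_to_text_ratio=0.3,  # hypothetical, not measured
    ram_gb=16,
    heap_gb=4,
)
print(f"index ~{index_gib:.0f} GiB, cacheable fraction ~{covered:.0%}")
```

Even with a generous ratio, only a small part of the index would fit in the disk cache, and the gap widens as the index grows by about 10 million documents per month.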

Few things for you to confirm:

1. Which version of Solr are you using?
2. The size of your index.
   - Are fields stored? How much are these stored fields contributing to
     the overall index size? (File types:
     http://lucene.apache.org/core/2_9_4/fileformats.html#file-names)
   - Check that you are not bloating the index further with term vectors,
     norms, ngrams, reverse wildcard, etc.
3. Response time (Solr and client side) with your typical queries. Also
   utilization numbers for memory and CPU.

For your modelling, if possible, you could consider grouping the regions,
and searching via one regions-group-id in place of 250+ region ids (in an
OR query, not in an "IN param").

Regards,
Aloke

On Thu, Oct 24, 2013 at 8:25 PM, Joel Bernstein <joels...@gmail.com> wrote:

> Sandeep,
>
> This type of operation can often be expressed as a PostFilter very
> efficiently. This is particularly true if the region ids are integer keys.
>
> Joel
>
> On Thu, Oct 24, 2013 at 7:46 AM, Sandeep Gupta <sandy....@gmail.com>
> wrote:
>
> > Hi,
> >
> > We have a Solr index of around 100 million documents, with each document
> > being given a region id, growing at a rate of about 10 million documents
> > per month - the average document size being around 10KB of pure text.
> > The total number of region ids is itself in the range of 2.5 million.
> >
> > We want to search for a query with a given list of region ids. The number
> > of region ids in this list is usually around 250-300 (most of the time),
> > but can be up to 500, with a maximum cap of around 2000 ids in one
> > request.
> >
> > What is the best way to model such queries besides using an IN param in
> > the query, or using a Filter FQ in the query? Are there any other faster
> > methods available?
> >
> > If it may help, the index is on a VM with 4 virtual cores and currently
> > has 4GB of Java memory allocated out of 16GB in the machine. The number
> > of queries does not exceed 1 per minute for now. If needed, we can
> > throw more hardware at the index - but the index will still be only on a
> > single machine for at least 6 months.
> >
> > Regards,
> > Sandeep Gupta