Thanks for the reply. I have a very large index, so retrieving older documents that are not needed wastes time, and on top of that I would still have to apply a recency boost or sort by time. I am looking for a way to avoid that, which is why I need to restrict my doc set to the most recently added documents. I would prefer not to use the "rows" parameter for this.

Thanks,
pooja

On Mon, Jul 11, 2011 at 5:49 PM, Bob Sandiford <bob.sandif...@sirsidynix.com> wrote:

> A good answer may also depend on WHY you are wanting to restrict to 500K
> documents.
>
> Are you seeking to reduce the time spent by Solr in determining the doc
> count? Are you just wanting to prevent people from moving too far into the
> result set? Is it case that you can only display 6 digits for your return
> count? :)
>
> If Solr is performing adequately, you could always just artificially
> restrict the result set. Solr doesn't actually 'return' all 5M documents -
> it only returns the number you have specified in your query (as well as
> having some cache for the next results in anticipation of a subsequent
> query). So, if the total count returned exceeds 500K, then just report 500K
> as the number of results, and similarly restrict how far a user can page
> through the results...
>
> (And - you can (and sounds like you should) sort your results by descending
> post date so that you do in fact get the most recent ones coming back
> first...)
>
> Bob Sandiford | Lead Software Engineer | SirsiDynix
> P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
> www.sirsidynix.com
>
> > -----Original Message-----
> > From: Ahmet Arslan [mailto:iori...@yahoo.com]
> > Sent: Monday, July 11, 2011 7:43 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Restricting the Solr Posting List (retrieved set)
> >
> > > We want to search in an index in such a way that even if a
> > > clause has a long
> > > posting list - Solr should stop collecting documents for
> > > the clause
> > > after receiving X documents that match the clause.
> > >
> > > For example, if for query "India",solr can return 5M
> > > documents, we would
> > > like to restrict the set at only 500K documents.
> > >
> > > The assumption is that since we are posting chronologically
> > > - we would like
> > > the X most recent documents to be matched for the clause
> > > only.
> > >
> > > Is it possible anyway?
> >
> > Looks like your use-case is suitable for time based sharding.
> > http://wiki.apache.org/solr/DistributedSearch
> >
> > Lets say you divide your shards according to months. You will have a
> > separate core for each month.
> > http://wiki.apache.org/solr/CoreAdmin
> >
> > When a query comes in, you will hit the most recent core. If you don't
> > obtain enough results add a new value (previous month core) to &shards=
> > parameter.
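
For illustration, the month-by-month sharding idea quoted above could be sketched roughly as follows. This is only a sketch under assumptions that are not from the thread: the core naming scheme (`posts_YYYY_MM`), the host `localhost:8983/solr`, and the `count_hits` callback (standing in for whatever reads the hit count from the Solr response) are all hypothetical. The only Solr feature it relies on is the `shards` request parameter, which takes a comma-separated list of `host:port/path` entries.

```python
from datetime import date
from urllib.parse import urlencode

HOST = "localhost:8983/solr"  # hypothetical host, adjust to your setup


def month_cores(latest, count, prefix="posts_"):
    """Return `count` monthly core names, newest first.

    The `posts_YYYY_MM` naming scheme is an assumption for this sketch.
    """
    names = []
    year, month = latest.year, latest.month
    for _ in range(count):
        names.append(f"{prefix}{year:04d}_{month:02d}")
        month -= 1
        if month == 0:  # roll back from January to December of the previous year
            year, month = year - 1, 12
    return names


def shards_param(cores, host=HOST):
    """Build the value of the shards parameter: comma-separated host/core entries."""
    return ",".join(f"{host}/{core}" for core in cores)


def select_url(query, cores, host=HOST, rows=10):
    """Build a select URL sent to the newest core, fanning out to all listed cores."""
    params = {"q": query, "rows": rows, "shards": shards_param(cores, host)}
    return f"http://{host}/{cores[0]}/select?{urlencode(params)}"


def cores_for_query(query, latest, wanted, count_hits, max_months=12):
    """Add one older month at a time until at least `wanted` hits are found.

    `count_hits(query, cores)` is supplied by the caller (hypothetical helper
    that runs the query and returns the total hit count).
    """
    cores = []
    for n in range(1, max_months + 1):
        cores = month_cores(latest, n)
        if count_hits(query, cores) >= wanted:
            break
    return cores
```

With this layout a query normally touches only the newest core, and older months are searched only when the recent ones do not yield enough matches, so the long tail of old postings is never collected for common terms.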