Thanks for the reply.

I have a very large index, so retrieving older documents that are not
needed wastes time, and on top of that I would still need to apply recency
boosts or a time sort. I am looking for a way to avoid that.
That is why I need to restrict my docset to the most recently added
documents. I would prefer not to use the "rows" parameter for this.

Thanks,
pooja

On Mon, Jul 11, 2011 at 5:49 PM, Bob Sandiford <bob.sandif...@sirsidynix.com
> wrote:

> A good answer may also depend on WHY you are wanting to restrict to 500K
> documents.
>
> Are you seeking to reduce the time spent by Solr in determining the doc
> count?  Are you just wanting to prevent people from moving too far into the
> result set?  Is it the case that you can only display 6 digits for your return
> count? :)
>
> If Solr is performing adequately, you could always just artificially
> restrict the result set.  Solr doesn't actually 'return' all 5M documents -
> it only returns the number you have specified in your query (as well as
> having some cache for the next results in anticipation of a subsequent
> query).  So, if the total count returned exceeds 500K, then just report 500K
> as the number of results, and similarly restrict how far a user can page
> through the results...
>
> (And - you can (and sounds like you should) sort your results by descending
> post date so that you do in fact get the most recent ones coming back
> first...)
>
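A minimal sketch of the suggestion above (cap the reported count, limit
paging depth, sort newest first). The field name "post_date" and the helper
names are hypothetical; only the standard Solr q, start, rows and sort
parameters are assumed.

```python
from urllib.parse import urlencode

MAX_REPORTED = 500_000  # the 500K cap discussed in this thread


def build_query(q, start=0, rows=10, date_field="post_date"):
    """Build a Solr query string sorted newest first.

    date_field is a hypothetical name for the field holding the post date.
    Paging is clamped so a user cannot move past the capped result set.
    """
    start = min(start, max(MAX_REPORTED - rows, 0))
    params = {
        "q": q,
        "start": start,
        "rows": rows,
        "sort": f"{date_field} desc",  # most recent documents first
    }
    return urlencode(params)


def reported_count(num_found):
    """Report at most MAX_REPORTED hits, whatever Solr's numFound says."""
    return min(num_found, MAX_REPORTED)
```

Note that this only changes what is shown to the user. Solr still matches
and sorts the full set, so it does not reduce the work done per query.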
> Bob Sandiford | Lead Software Engineer | SirsiDynix
> P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
> www.sirsidynix.com
>
>
> > -----Original Message-----
> > From: Ahmet Arslan [mailto:iori...@yahoo.com]
> > Sent: Monday, July 11, 2011 7:43 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Restricting the Solr Posting List (retrieved set)
> >
> >
> > > We want to search in an index in such a way that even if a
> > > clause has a long
> > > posting list - Solr should stop collecting documents for
> > > the clause
> > > after receiving X documents that match the clause.
> > >
> > > For example, if for the query "India" Solr can return 5M
> > > documents, we would
> > > like to restrict the set to only 500K documents.
> > >
> > > The assumption is that since we are posting chronologically
> > > - we would like
> > > the X most recent documents to be matched for the clause
> > > only.
> > >
> > > Is this possible in any way?
> >
> > Looks like your use-case is suitable for time based sharding.
> > http://wiki.apache.org/solr/DistributedSearch
> >
> > Let's say you divide your shards according to months. You will have a
> > separate core for each month.
> > http://wiki.apache.org/solr/CoreAdmin
> >
> > When a query comes in, you will hit the most recent core. If you don't
> > obtain enough results, add a new value (the previous month's core) to the
> > &shards= parameter.
> >
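A rough sketch of the time-based sharding idea above: query the newest
monthly core first and widen the shards parameter one month at a time until
enough hits come back. The core names, host and the fetch_num_found callback
are hypothetical placeholders; only the comma-separated shards parameter
format (host:port/path, no scheme) is taken from Solr distributed search.

```python
from urllib.parse import urlencode

# Hypothetical monthly cores, newest first.
SHARDS = [
    "localhost:8983/solr/2011_07",
    "localhost:8983/solr/2011_06",
    "localhost:8983/solr/2011_05",
]


def shard_params(q, n_shards, rows=10):
    """Query string that searches only the n_shards most recent cores."""
    return urlencode({
        "q": q,
        "rows": rows,
        "shards": ",".join(SHARDS[:n_shards]),
    })


def search_recent_first(q, wanted, fetch_num_found):
    """Return how many cores were needed to find at least `wanted` hits.

    fetch_num_found(query_string) -> int is supplied by the caller; it
    would send the request to Solr and return numFound from the response.
    If even all cores together fall short, every core ends up being used.
    """
    used = 1
    for used in range(1, len(SHARDS) + 1):
        if fetch_num_found(shard_params(q, used)) >= wanted:
            break
    return used
```

With this layout, a common term usually stops at the first core, so the
older months are never touched for it.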
>
>
>
