Re: how to get all the docIds in the search result?

Otis Gospodnetic Thu, 23 Jul 2009 09:07:05 -0700

And if I may add another thing - if you are using Solr in this fashion, have a 
look at your caches, esp. the document cache. If your "queries" of this type are 
repeated, you may benefit from a large cache.  Or, if they are not, you may 
want to disable some caches completely.


 Otis
--
Sematext is hiring: http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Erik Hatcher <[email protected]>
> To: [email protected]
> Sent: Thursday, July 23, 2009 11:15:45 AM
> Subject: Re: how to get all the docIds in the search result?
> 
> Rather than trying to get all document IDs in one call to Solr, consider
> paging through the results.  Set rows=1000 or probably larger, then check
> numFound and continue making requests to Solr, incrementing the start
> parameter accordingly, until done.
> 
>     Erik
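
The paging loop described above could be sketched roughly as follows. This is
only an illustration, not code from the thread: IdPager, Page and PageFetcher
are hypothetical names, and the fetcher stands in for whatever issues the real
request (with SolrJ, presumably a SolrQuery with setStart/setRows/setFields("id")
whose response supplies getResults().getNumFound()).

```java
import java.util.ArrayList;
import java.util.List;

public class IdPager {

    /** One page of results: the ids on this page plus the total hit count. */
    public static final class Page {
        public final List<String> ids;
        public final long numFound;

        public Page(List<String> ids, long numFound) {
            this.ids = ids;
            this.numFound = numFound;
        }
    }

    /**
     * Hypothetical hook that runs the same query for rows [start, start + rows).
     * With SolrJ this would typically set start, rows and fields("id") on a
     * SolrQuery and copy the ids and numFound out of the response.
     */
    public interface PageFetcher {
        Page fetch(int start, int rows);
    }

    /** Collects every id by requesting fixed-size pages until numFound is reached. */
    public static List<String> collectAllIds(PageFetcher fetcher, int rows) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        List<String> all = new ArrayList<>();
        int start = 0;
        while (true) {
            Page page = fetcher.fetch(start, rows);
            all.addAll(page.ids);
            // Advance by what actually came back, in case a page is short.
            start += page.ids.size();
            // Stop on an empty page (guards against looping forever)
            // or once everything reported by numFound has been read.
            if (page.ids.isEmpty() || start >= page.numFound) {
                break;
            }
        }
        return all;
    }
}
```

Each request then only has to fetch and serialize one page of ids, at the cost
of more round trips. If the index changes between requests, pages may overlap
or skip documents, so this assumes a reasonably stable index.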
> 
> On Jul 23, 2009, at 5:35 AM, shb wrote:
> 
> > I have tried the following code:
> > query.setRows(Integer.MAX_VALUE);
> > query.setFields("id");
> > 
> > When it returns 1,000,000 records, it takes about 22s.
> > This is very slow. Is there any other way?
> > 
> > 
> > 2009/7/23 Toby Cole 
> > 
> >> Have you tried limiting the fields that you're requesting to just the ID?
> >> Something along the lines of:
> >> 
> >> query.setRows(Integer.MAX_VALUE);
> >> query.setFields("id");
> >> 
> >> Might speed the query up a little.
> >> 
> >> 
> >> On 23 Jul 2009, at 09:11, shb wrote:
> >> 
> >>> Here id is indeed the uniqueKey of a document.
> >>> I want to get all the ids for some other usage.
> >>> 
> >>> 
> >>> 2009/7/23 Shalin Shekhar Mangar 
> >>> 
> >>> On Thu, Jul 23, 2009 at 1:09 PM, shb wrote:
> >>>> 
> >>>>> If I use query.setRows(Integer.MAX_VALUE);
> >>>>> the query will become very slow, because the searcher will go
> >>>>> to fetch the field value in the index for all the returned
> >>>>> documents.
> >>>>> 
> >>>>> So if I set query.setRows(10), is there any other way to
> >>>>> get all the ids? Thanks
> >>>>> 
> >>>>> 
> >>>> You should fetch as many rows as you need and not more. Why do you need
> >>>> all the ids? I'm assuming that by id you mean the uniqueKey of a document.
> >>>> 
> >>>> --
> >>>> Regards,
> >>>> Shalin Shekhar Mangar.
> >>>> 
> >>>> 
> >> --
> >> 
> >> Toby Cole
> >> Software Engineer, Semantico Limited
> >> 
> >> Registered in England and Wales no. 03841410, VAT no. GB-744614334.
> >> Registered office Lees House, 21-23 Dyke Road, Brighton BN1 3FE, UK.
> >> 
> >> Check out all our latest news and thinking on the Discovery blog
> >> http://blogs.semantico.com/discovery-blog/
> >> 
> >>
