Yes, this information will need to be collected each time the user runs a
search, as we want to show the number of records that match the search
query in each of the collections.

Currently I only have 6 collections, but that could grow to hundreds of
collections in the future. So I'm worried that it could slow down the
system a lot if we have to send hundreds of queries for each search request.
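
For reference, here is a minimal sketch of the thread-per-collection idea
Erick suggested, in plain Java. The class name ParallelCounts, the countAll
helper and the counter callback are all hypothetical names made up for
illustration. A real counter would presumably send the search to one
collection through SolrJ with rows=0 and return numFound; here it is only a
stand-in. The pool size is capped so that hundreds of collections do not
mean hundreds of threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

public class ParallelCounts {

    // Runs one count lookup per collection on a bounded thread pool and
    // returns the counts keyed by collection name, in input order.
    // "counter" is a hypothetical callback: given a collection name, it
    // returns the number of matching records for the current search.
    public static Map<String, Long> countAll(List<String> collections,
                                             ToLongFunction<String> counter,
                                             int maxThreads)
            throws InterruptedException, ExecutionException {
        // Never create more threads than there are collections, and at least one.
        int threads = Math.max(1, Math.min(maxThreads, collections.size()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every lookup first so they can run concurrently.
            List<Future<Long>> futures = new ArrayList<>();
            for (String name : collections) {
                Callable<Long> task = () -> counter.applyAsLong(name);
                futures.add(pool.submit(task));
            }
            // Then collect the results; get() blocks until each one is done.
            Map<String, Long> counts = new LinkedHashMap<>();
            for (int i = 0; i < collections.size(); i++) {
                counts.put(collections.get(i), futures.get(i).get());
            }
            return counts;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in counter (assumption): replace with a real per-collection query.
        ToLongFunction<String> fakeCounter = name -> (long) name.length();
        List<String> collections = Arrays.asList("news", "blogs");
        System.out.println(countAll(collections, fakeCounter, 8));
    }
}
```

With this shape the total wait is roughly that of the slowest collection
(or the slowest batch, once there are more collections than threads)
rather than the sum of all of them, although Solr still has to answer the
same number of queries.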

Regards,
Edwin


On 5 June 2015 at 21:00, Upayavira <u...@odoko.co.uk> wrote:

> I'm not so sure this is as bad as it sounds. When your collection is
> sharded, no single node knows about the documents in other shards/nodes,
> so to find the total number, a query will need to go to every node.
>
> Trying to work out something to do a single request to every node,
> combine their collection statistics and aggregate them into a single
> result sounds very complicated, and likely overkill.
>
> Are you needing to collect this information often? Do you have a lot of
> collections?
>
> Upayavira
>
>
> On Fri, Jun 5, 2015, at 06:29 AM, Zheng Lin Edwin Yeo wrote:
> > I'm trying to write a SolrJ program in Java to read and consolidate all
> > the information into a JSON file. The client will just need to call this
> > SolrJ program and read this JSON file to get the details. But the problem
> > is that we are still querying Solr once for each collection; it is just
> > that this time it is done in the SolrJ program in a for-loop, while
> > previously it was done on the client side. Will this lead to any
> > performance improvement?
> >
> > As for your suggestion of spawning a bunch of threads, is that the same
> > thing as what I did?
> >
> > Regards,
> > Edwin
> >
> >
> > On 5 June 2015 at 12:03, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > > Have you considered spawning a bunch of threads, one per collection
> > > and having them all run in parallel?
> > >
> > > Best,
> > > Erick
> > >
> > > On Thu, Jun 4, 2015 at 4:52 PM, Zheng Lin Edwin Yeo
> > > <edwinye...@gmail.com> wrote:
> > > > The reason we wanted to do a single call is to improve performance,
> > > > as our application needs to list the total number of records in each
> > > > of the collections, and the number of records that match the query in
> > > > each of the collections.
> > > >
> > > > Currently we are querying each collection one by one to retrieve the
> > > > numFound value and display them, but this can slow down the system
> > > > significantly when the number of collections grows. So we are thinking
> > > > of ways to improve the speed in this area.
> > > >
> > > > Are there any other methods you can suggest to overcome this speed
> > > > problem?
> > > >
> > > > Regards,
> > > > Edwin
> > > > On 5 Jun 2015 00:16, "Erick Erickson" <erickerick...@gmail.com> wrote:
> > > >
> > > >> Not in a single call that I know of. These are really orthogonal
> > > >> concepts. Getting the cluster status merely involves reading the
> > > >> Zookeeper clusterstate whereas getting the total number of docs for
> > > >> each would involve querying each collection, i.e. going to the Solr
> > > >> nodes themselves. I'd guess it's unlikely to be combined.
> > > >>
> > > >> Best,
> > > >> Erick
> > > >>
> > > >> On Thu, Jun 4, 2015 at 7:47 AM, Zheng Lin Edwin Yeo
> > > >> <edwinye...@gmail.com> wrote:
> > > >> > Hi,
> > > >> >
> > > > >> > I would like to check: are we able to use the Collections API or any
> > > > >> > other method to list all the collections in the cluster together with
> > > > >> > the number of records in each of the collections in one output?
> > > >> >
> > > > >> > Currently, I only know of the List Collections call,
> > > > >> > /admin/collections?action=LIST. However, this only lists the names of
> > > > >> > the collections that are in the cluster, but not the number of records.
> > > >> >
> > > > >> > Is there a way to show the number of records in each of the
> > > > >> > collections as well?
> > > >> >
> > > >> > Regards,
> > > >> > Edwin
> > > >>
> > >
>
