Yup, this information will need to be collected each time the user runs a search, as we want to show the number of records that match the search query in each of the collections.
Currently I only have 6 collections, but it could increase to hundreds of collections in the future. So I'm worried that it could slow down the system a lot if we have to pass hundreds of queries for each search request.

Regards,
Edwin

On 5 June 2015 at 21:00, Upayavira <u...@odoko.co.uk> wrote:

> I'm not so sure this is as bad as it sounds. When your collection is
> sharded, no single node knows about the documents in other shards/nodes,
> so to find the total number, a query will need to go to every node.
>
> Trying to work out something to do a single request to every node,
> combine their collection statistics and aggregate them into a single
> result sounds very complicated, and likely overkill.
>
> Are you needing to collect this information often? Do you have a lot of
> collections?
>
> Upayavira
>
>
> On Fri, Jun 5, 2015, at 06:29 AM, Zheng Lin Edwin Yeo wrote:
> > I'm trying to write a SolrJ program in Java to read and consolidate all
> > the information into a JSON file, The client will just need to call this
> > SolrJ program and read this JSON file to get the details. But the problem
> > is we are still querying the Solr once for each collection, just that
> > this time it is done in the SolrJ program in a for-loop, while previously
> > it's done on the client side. Not sure will this lead to performance
> > improvement?
> >
> > For your suggestion on spawning a bunch of threads, does it mean the same
> > thing as I did?
> >
> > Regards,
> > Edwin
> >
> >
> > On 5 June 2015 at 12:03, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > > Have you considered spawning a bunch of threads, one per collection
> > > and having them all run in parallel?
> > >
> > > Best,
> > > Erick
> > >
> > > On Thu, Jun 4, 2015 at 4:52 PM, Zheng Lin Edwin Yeo
> > > <edwinye...@gmail.com> wrote:
> > > > The reason we wanted to do a single call is to improve on the
> > > > performance, as our application requires to list the total number of
> > > > records in each of the collections, and the number of records that
> > > > matches the query each of the collections.
> > > >
> > > > Currently we are querying each collection one by one to retrieve the
> > > > numFound value and display them, but this can slow down the system
> > > > significantly when the number of collection grows. So we are thinking
> > > > of ways to improve the speed in this area.
> > > >
> > > > Any other methods which you can suggest that we can do to overcome
> > > > this speed problem?
> > > >
> > > > Regards,
> > > > Edwin
> > > > On 5 Jun 2015 00:16, "Erick Erickson" <erickerick...@gmail.com> wrote:
> > > >
> > > >> Not in a single call that I know of. These are really orthogonal
> > > >> concepts. Getting the cluster status merely involves reading the
> > > >> Zookeeper clusterstate whereas getting the total number of docs for
> > > >> each would involve querying each collection, i.e. going to the Solr
> > > >> nodes themselves. I'd guess it's unlikely to be combined.
> > > >>
> > > >> Best,
> > > >> Erick
> > > >>
> > > >> On Thu, Jun 4, 2015 at 7:47 AM, Zheng Lin Edwin Yeo
> > > >> <edwinye...@gmail.com> wrote:
> > > >> > Hi,
> > > >> >
> > > >> > Would like to check, are we able to use the Collection API or any
> > > >> > other method to list all the collections in the cluster together
> > > >> > with the number of records in each of the collections in one
> > > >> > output?
> > > >> >
> > > >> > Currently, I only know of the List Collections
> > > >> > /admin/collections?action=LIST. However, this only list the names
> > > >> > of the collections that are in the cluster, but not the number of
> > > >> > records.
> > > >> >
> > > >> > Is there a way to show the number of records in each of the
> > > >> > collections as well?
> > > >> >
> > > >> > Regards,
> > > >> > Edwin
> > > >> >
> > >
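
For reference, here is a rough sketch of the "one thread per collection, all running in parallel" idea suggested in the thread, written in Java since the thread mentions a SolrJ program. It is only an illustration under assumptions and is not code from anyone in the thread: the class name CollectionCounts, the method countAll, and the counter function are all hypothetical. In a real program, counter would issue the per-collection query (for example through SolrJ) and return its numFound; here it is left as a plain function so the fan-out logic stays self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

// Hypothetical helper: fans out one count lookup per collection.
class CollectionCounts {

    // Submits one task per collection to a fixed-size pool, then gathers the
    // results into a map that keeps the order of the input list.
    // "counter" is an assumption: it stands in for whatever call returns the
    // number of matching records (numFound) for a single collection.
    static Map<String, Long> countAll(List<String> collections,
                                      ToLongFunction<String> counter,
                                      int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (String name : collections) {
                Callable<Long> task = () -> counter.applyAsLong(name);
                futures.add(pool.submit(task));
            }
            Map<String, Long> counts = new LinkedHashMap<>();
            for (int i = 0; i < collections.size(); i++) {
                // get() blocks until that collection's lookup has finished.
                counts.put(collections.get(i), futures.get(i).get());
            }
            return counts;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in counter for demonstration only: it returns the length of
        // the collection name instead of querying a real Solr node.
        List<String> names = Arrays.asList("collection1", "collection2");
        System.out.println(countAll(names, n -> (long) n.length(), 4));
    }
}
```

With this shape the total wait is closer to the slowest single lookup than to the sum of all of them, although every collection is still queried once per search. The pool size caps how many requests hit the cluster at the same time, which would matter if the number of collections grows into the hundreds.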