Hi Erick, we have used document routing to balance the shard load and for
expand/collapse. It is mainly used for main_collection, which holds
one-to-many relationship records. In file_collection it is only used for load
distribution.
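As an illustration, this is roughly what the compositeId routing looks like
for main_collection (the host, collection URL, routing key, and field names
below are placeholders, not our actual schema): giving related documents the
same routing prefix keeps a parent's one-to-many records on one shard, which
is what lets expand/collapse work within a shard.

  # Both child records share the "parent42!" prefix, so Solr's compositeId
  # router hashes the same key and places them on the same shard.
  curl 'http://localhost:8983/solr/main_collection_2016/update?commit=true' \
       -H 'Content-Type: application/json' -d '
  [
    {"id": "parent42!cmd-001", "parentId": "parent42"},
    {"id": "parent42!cmd-002", "parentId": "parent42"}
  ]'

  # Optionally, a query can be routed to only the shard(s) owning that key:
  curl 'http://localhost:8983/solr/main_collection_2016/select?q=parentId:parent42&_route_=parent42!'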
25 GB for the entire Solr service. Each machine acts as a shard for some
collections. We have not stress tested our servers, at least not for the Solr
service. I have read the link you shared and will act on it; thanks for
sharing. I have checked the other collections, where the maximum index size
is 90 GB and the maximum document count is 5M. But for this particular
file_collection_2014, I see the total index size across replicas is 147 GB.

Can we get any hints if we run the query with debugQuery=true? What is an
effective way to distribute the load? Please advise. (A sketch of these
checks follows below, after the quoted thread.)

Regards,
Anil

On 14 March 2016 at 20:32, Erick Erickson <erickerick...@gmail.com> wrote:

> bq: The slowness is happening for file_collection. though it has 3 shards,
> documents are available in 2 shards. shard1 - 150M docs and shard2 has 330M
> docs , shard3 is empty.
>
> Well, this collection is terribly balanced. Putting 330M docs on a single
> shard is pushing the limits; the only time I've seen that many docs on a
> shard, particularly with 25G of RAM, they were very small records. My guess
> is that you will find the queries you send to that shard substantially
> slower than the 150M shard, although 150M could also be pushing your
> limits. You can measure this by sending the query to the specific core,
> something like:
>
> solr/files_shard1_replica1/query?(your query here)&distrib=false
>
> My bet is that your QTime will be significantly different for the two
> shards.
>
> It also sounds like you're using implicit routing, where you control where
> the files go; it's easy to have unbalanced shards in that case. Why did you
> decide to do it this way? There are valid reasons, but...
>
> In short, my guess is that you've simply overloaded your shard with 330M
> docs. It's not at all clear that even 150M will give you satisfactory
> performance. Have you stress tested your servers? Here's the long form of
> sizing:
>
> https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>
> Best,
> Erick
>
> On Mon, Mar 14, 2016 at 7:05 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
> > For each of the Solr machines/shards you have. Thanks.
> >
> > On Mon, Mar 14, 2016 at 10:04 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
> >
> >> Hello Anil,
> >>
> >> Can you go to the Solr Admin Panel -> Dashboard and share all 4 memory
> >> parameters under System, or share a snapshot?
> >>
> >> Thanks,
> >> Susheel
> >>
> >> On Mon, Mar 14, 2016 at 5:36 AM, Anil <anilk...@gmail.com> wrote:
> >>
> >>> Hi Toke and Jack,
> >>>
> >>> Please find the details below.
> >>>
> >>> * How large are your 3 shards in bytes? (total index across replicas)
> >>> -- *146G. I am using CDH (Cloudera) and am not sure how to check the
> >>> index size of each collection on each shard.*
> >>> * What storage system do you use (local SSD, local spinning drives,
> >>> remote storage...)? *Local (HDFS) spinning drives.*
> >>> * How much physical memory does your system have? *We have 15 data
> >>> nodes, with multiple services installed on each data node (252 GB RAM
> >>> per data node). 25 GB RAM is allocated to the Solr service.*
> >>> * How much memory is free for disk cache? *I could not find out.*
> >>> * How many concurrent queries do you issue? *Very few. I don't see any
> >>> concurrent queries to this file_collection for now.*
> >>> * Do you update while you search? *Yes, but very rarely.*
> >>> * What does a full query (rows, faceting, grouping, highlighting,
> >>> everything) look like?
> >>> *For the file_collection: rows = 100, highlights = false, no facets,
> >>> expand = false.*
> >>> * How many documents does a typical query match (hitcount)? *It varies
> >>> with each file. I have a sort on an int field to order the commands in
> >>> the query.*
> >>>
> >>> We have two sets of collections on the Solr cluster (17 data nodes):
> >>>
> >>> 1. main_collection - a collection created per year. Each collection
> >>> uses 8 shards and 2 replicas, e.g. main_collection_2016,
> >>> main_collection_2015, etc.
> >>>
> >>> 2. file_collection (where files holding commands are indexed) - a
> >>> collection created per 2 years. It uses 3 shards and 2 replicas, e.g.
> >>> file_collection_2014, file_collection_2016.
> >>>
> >>> The slowness is happening for file_collection. Though it has 3 shards,
> >>> documents are available in only 2 shards: shard1 has 150M docs, shard2
> >>> has 330M docs, and shard3 is empty.
> >>>
> >>> main_collection looks good.
> >>>
> >>> Please let me know if you need any additional details.
> >>>
> >>> Regards,
> >>> Anil
> >>>
> >>>
> >>> On 13 March 2016 at 21:48, Anil <anilk...@gmail.com> wrote:
> >>>
> >>> > Thanks Toke and Jack.
> >>> >
> >>> > Jack,
> >>> >
> >>> > Yes, it is 480 million :)
> >>> >
> >>> > I will share the additional details soon. Thanks.
> >>> >
> >>> > Regards,
> >>> > Anil
> >>> >
> >>> >
> >>> > On 13 March 2016 at 21:06, Jack Krupansky <jack.krupan...@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> (We should have a wiki/doc page for the "usual list of suspects"
> >>> >> when queries are/appear slow, rather than needing to repeat the
> >>> >> same mantra(s) for every inquiry on this topic.)
> >>> >>
> >>> >> -- Jack Krupansky
> >>> >>
> >>> >> On Sun, Mar 13, 2016 at 11:29 AM, Toke Eskildsen
> >>> >> <t...@statsbiblioteket.dk> wrote:
> >>> >>
> >>> >> > Anil <anilk...@gmail.com> wrote:
> >>> >> > > i have indexed a data (commands from files) with 10 fields and
> >>> >> > > 3 of them is text fields. collection is created with 3 shards
> >>> >> > > and 2 replicas. I have used document routing as well.
> >>> >> >
> >>> >> > > Currently collection holds 47,80,01,405 records.
> >>> >> >
> >>> >> > ...480 million, right? Funny digit grouping in India.
> >>> >> >
> >>> >> > > text search against text field taking around 5 sec. solr is
> >>> >> > > query just and of two terms with fl as 7 fields
> >>> >> >
> >>> >> > > fileId:"file unique id" AND command_text:(system login)
> >>> >> >
> >>> >> > While not an impressive response time, it might just be that your
> >>> >> > hardware is not enough to handle that amount of documents. The
> >>> >> > usual culprit is IO speed, so chances are you have a system with
> >>> >> > spinning drives and not enough RAM: switch to SSD and/or add more
> >>> >> > RAM.
> >>> >> >
> >>> >> > To give better advice, we need more information.
> >>> >> >
> >>> >> > * How large are your 3 shards in bytes?
> >>> >> > * What storage system do you use (local SSD, local spinning
> >>> >> > drives, remote storage...)?
> >>> >> > * How much physical memory does your system have?
> >>> >> > * How much memory is free for disk cache?
> >>> >> > * How many concurrent queries do you issue?
> >>> >> > * Do you update while you search?
> >>> >> > * What does a full query (rows, faceting, grouping, highlighting,
> >>> >> > everything) look like?
> >>> >> > * How many documents does a typical query match (hitcount)?
> >>> >> >
> >>> >> > - Toke Eskildsen
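For reference, here is a minimal sketch of the checks discussed above (the
host, core names, and example query values are assumptions, since the real
deployment details are not in this thread):

  # Time the same query against each core directly (distrib=false skips the
  # distributed fan-out) to compare QTime on the 150M-doc and 330M-doc shards.
  curl 'http://localhost:8983/solr/file_collection_2014_shard1_replica1/select?q=fileId:"some-file-id"+AND+command_text:(system+login)&rows=100&distrib=false'
  curl 'http://localhost:8983/solr/file_collection_2014_shard2_replica1/select?q=fileId:"some-file-id"+AND+command_text:(system+login)&rows=100&distrib=false'

  # debugQuery=true adds a "debug" section with per-component timing, and
  # shards.info=true reports per-shard response time and hit counts for a
  # distributed query -- together they point at the slow shard.
  curl 'http://localhost:8983/solr/file_collection_2014/select?q=fileId:"some-file-id"+AND+command_text:(system+login)&rows=100&debugQuery=true&shards.info=true'

  # The CoreAdmin STATUS call on each node reports the index size of every
  # core hosted there, which answers "how large is each shard in bytes".
  curl 'http://localhost:8983/solr/admin/cores?action=STATUS&wt=json'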