For each of the Solr machines/shards you have. Thanks.
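The memory figures the Solr Admin dashboard shows can also be collected per node over HTTP. Below is a minimal sketch using Solr's system-info endpoint (`/solr/admin/info/system`); the host names are placeholders, and the exact JSON field names may differ slightly between Solr versions, so treat this as an assumption to verify against your install:

```python
import json
from urllib.request import urlopen

def memory_summary(info: dict) -> dict:
    """Extract dashboard-style memory figures from /admin/info/system JSON."""
    system = info.get("system", {})
    jvm = info.get("jvm", {}).get("memory", {}).get("raw", {})
    return {
        "physical_total": system.get("totalPhysicalMemorySize"),
        "physical_free": system.get("freePhysicalMemorySize"),
        "jvm_heap_max": jvm.get("max"),
        "jvm_heap_used": jvm.get("used"),
    }

def poll_nodes(hosts):
    # Placeholder host names -- replace with your actual data nodes.
    for host in hosts:
        url = f"http://{host}:8983/solr/admin/info/system?wt=json"
        with urlopen(url) as resp:
            print(host, memory_summary(json.load(resp)))

# Example (not run here): poll_nodes(["datanode1", "datanode2", "datanode3"])
```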

On Mon, Mar 14, 2016 at 10:04 AM, Susheel Kumar <susheel2...@gmail.com>
wrote:

> Hello Anil,
>
> Can you go to the Solr Admin Panel -> Dashboard and share all four memory
> parameters under System, or share a snapshot?
>
> Thanks,
> Susheel
>
> On Mon, Mar 14, 2016 at 5:36 AM, Anil <anilk...@gmail.com> wrote:
>
>> Hi Toke and Jack,
>>
>> Please find the details below.
>>
>> * How large are your 3 shards in bytes? (total index across replicas)
>>           -- *146 GB. I am using CDH (Cloudera) and am not sure how to
>> check the index size of each collection on each shard.*
>> * What storage system do you use (local SSD, local spinning drives, remote
>> storage...)? *Local (HDFS) spinning drives.*
>> * How much physical memory does your system have? *We have 15 data nodes,
>> with multiple services installed on each data node (252 GB RAM per data
>> node). 25 GB RAM is allocated to the Solr service.*
>> * How much memory is free for disk cache? *I could not find this.*
>> * How many concurrent queries do you issue? *Very few. I don't see any
>> concurrent queries to this file_collection for now.*
>> * Do you update while you search? *Yes, but very little.*
>> * What does a full query (rows, faceting, grouping, highlighting,
>> everything) look like? *For the file_collection: rows = 100, highlights =
>> false, no facets, expand = false.*
>> * How many documents does a typical query match (hitcount)? *It varies
>> with each file. I sort on an int field to order commands in the query.*
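The "memory free for disk cache" question above can be roughly estimated from the numbers in this thread. A back-of-the-envelope sketch; the memory claimed by the co-located CDH services is a pure assumption here, and in practice the 146 GB index is split across shards on different nodes, so the per-node need is smaller:

```python
# Back-of-the-envelope disk-cache arithmetic using the numbers from this
# thread. The "other services" figure is an assumption -- on a CDH data
# node, HDFS/YARN/HBase daemons also claim memory.
index_gb = 146           # total file_collection index size from the thread
node_ram_gb = 252        # RAM per data node
solr_heap_gb = 25        # heap allocated to the Solr service
other_services_gb = 150  # assumption: memory used by co-located CDH services

free_for_cache_gb = node_ram_gb - solr_heap_gb - other_services_gb
coverage = free_for_cache_gb / index_gb
print(f"~{free_for_cache_gb} GB free for disk cache, "
      f"covering about {coverage:.0%} of the index")
# -> ~77 GB free for disk cache, covering about 53% of the index
```

On a real node, `free -g` (the buff/cache column on modern Linux) gives the actual figure instead of an estimate.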
>>
>> We have two sets of collections on the Solr cluster (17 data nodes):
>>
>> 1. main_collection - one collection created per year. Each collection uses
>> 8 shards and 2 replicas, e.g. main_collection_2016, main_collection_2015,
>> etc.
>>
>> 2. file_collection (where files containing commands are indexed) - one
>> collection created per 2 years. It uses 3 shards and 2 replicas, e.g.
>> file_collection_2014, file_collection_2016.
>>
>> The slowness is happening for file_collection. Though it has 3 shards,
>> documents are present in only 2 of them: shard1 has 150M docs, shard2 has
>> 330M docs, and shard3 is empty.
>>
>> main_collection looks fine.
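An empty shard alongside two heavily unbalanced ones is the classic symptom of compositeId document routing when most documents share a few route keys (here, presumably fileId): every document with the same key hashes to the same shard. A toy illustration, not Solr's real code (Python's `hash()` stands in for Solr's MurmurHash3 hash-range lookup, and the shard math is simplified):

```python
from collections import Counter

NUM_SHARDS = 3

def shard_for(route_key: str) -> int:
    # Stand-in for Solr's hash-range lookup: same key -> same shard.
    return hash(route_key) % NUM_SHARDS

# A corpus dominated by a handful of large files: all docs from one file
# land on one shard, so shards can end up wildly unbalanced or empty.
doc_route_keys = ["file_A"] * 500 + ["file_B"] * 300 + ["file_C"] * 200
per_shard = Counter(shard_for(k) for k in doc_route_keys)
print(dict(per_shard))  # at most 3 non-empty shards; often fewer
```

If that matches this setup, routing on a higher-cardinality key, or using the compositeId `key/bits!id` form to spread a hot key over multiple shards, would even out the distribution.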
>>
>> Please let me know if you need any additional details.
>>
>> Regards,
>> Anil
>>
>>
>> On 13 March 2016 at 21:48, Anil <anilk...@gmail.com> wrote:
>>
>> > Thanks Toke and Jack.
>> >
>> > Jack,
>> >
>> > Yes, it is 480 million :)
>> >
>> > I will share the additional details soon. Thanks.
>> >
>> >
>> > Regards,
>> > Anil
>> >
>> >
>> > On 13 March 2016 at 21:06, Jack Krupansky <jack.krupan...@gmail.com>
>> > wrote:
>> >
>> >> (We should have a wiki/doc page with the "usual list of suspects" for
>> >> when queries are, or appear, slow, rather than needing to repeat the
>> >> same mantras for every inquiry on this topic.)
>> >>
>> >>
>> >> -- Jack Krupansky
>> >>
>> >> On Sun, Mar 13, 2016 at 11:29 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
>> >> wrote:
>> >>
>> >> > Anil <anilk...@gmail.com> wrote:
>> >> > > I have indexed data (commands from files) with 10 fields, 3 of
>> >> > > which are text fields. The collection is created with 3 shards and
>> >> > > 2 replicas. I have used document routing as well.
>> >> >
>> >> > > Currently collection holds 47,80,01,405 records.
>> >> >
>> >> > ...480 million, right? Funny digit grouping in India.
>> >> >
>> >> > > A text search against a text field takes around 5 sec. The Solr
>> >> > > query is just an AND of two terms, with fl listing 7 fields:
>> >> >
>> >> > > fileId:"file unique id" AND command_text:(system login)
>> >> >
>> >> > While not an impressive response time, it might just be that your
>> >> > hardware is not enough to handle that amount of documents. The usual
>> >> > culprit is IO speed, so chances are you have a system with spinning
>> >> > drives and not enough RAM: switch to SSD and/or add more RAM.
>> >> >
>> >> > To give better advice, we need more information.
>> >> >
>> >> > * How large are your 3 shards in bytes?
>> >> > * What storage system do you use (local SSD, local spinning drives,
>> >> > remote storage...)?
>> >> > * How much physical memory does your system have?
>> >> > * How much memory is free for disk cache?
>> >> > * How many concurrent queries do you issue?
>> >> > * Do you update while you search?
>> >> > * What does a full query (rows, faceting, grouping, highlighting,
>> >> > everything) look like?
>> >> > * How many documents does a typical query match (hitcount)?
>> >> >
>> >> > - Toke Eskildsen
>> >> >
>> >>
>> >
>> >
>>
>
>
