Re: Huge Performance: Solr distributed search

Artem Lokotosh Thu, 24 Nov 2011 09:09:43 -0800

>How big are the documents you return (how many fields, avg KB per doc, etc.)?
I have the following schema in my Solr configuration:

<fields>
  <field name="field1" type="text" indexed="true" stored="false"/>
  <field name="field2" type="text" indexed="true" stored="true"/>
  <field name="field3" type="text" indexed="true" stored="true"/>
  <field name="field4" type="tlong" indexed="true" stored="true"/>
  <field name="field5" type="tdate" indexed="true" stored="true"/>
  <field name="field6" type="text" indexed="true" stored="true"/>
  <field name="field7" type="text" indexed="true" stored="true"/>
  <field name="field8" type="tlong" indexed="true" stored="true"/>
  <field name="field9" type="text" indexed="true" stored="true"/>
  <field name="field10" type="tdate" indexed="true" stored="true"/>
  <field name="field11" type="text" indexed="true" stored="true"/>
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
</fields>

27M–30M docs and 12-15 GB for each shard, 0.5KB per doc
>Does performance get much better if you only request top 100, or top
>10 documents instead of top 1000?
ROWS         |    10 |    100 |   1000 |    2000
-------------|-------|--------|--------|--------
MIN          |   124 |    146 |    237 |     747
AVG          |   832 |   4666 |  16130 |   72542
MAX          |  3602 |  30197 |  57339 |  159482
QUERIES/5MIN |    75 |     73 |     49 |      51
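
One way to read the table is to divide the average time by the number of rows requested, and to compare the growth in time with the growth in rows at each step. A minimal sketch in Python, using only the AVG row above; the variable and function names are made up for illustration, and it assumes all figures are in the same time unit:

```python
# AVG row from the table above, keyed by the number of rows requested.
avg_time = {10: 832, 100: 4666, 1000: 16130, 2000: 72542}


def per_row_cost(times):
    """Average time divided by rows requested, for each column."""
    return {rows: t / rows for rows, t in times.items()}


def growth_factors(times):
    """For consecutive columns: (rows_from, rows_to, rows_ratio, time_ratio)."""
    cols = sorted(times)
    return [
        (a, b, b / a, times[b] / times[a])
        for a, b in zip(cols, cols[1:])
    ]


if __name__ == "__main__":
    for rows, cost in per_row_cost(avg_time).items():
        print(f"rows={rows:>5}  avg/row={cost:8.2f}")
    for a, b, rows_ratio, time_ratio in growth_factors(avg_time):
        print(f"{a:>5} -> {b:<5} rows x{rows_ratio:.0f}, avg time x{time_ratio:.2f}")
```

A time ratio larger than the rows ratio means worse than linear growth for that step. On these numbers that only happens between 1000 and 2000 rows, where the rows double and the average time grows about 4.5 times.
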
>What if you only request a couple fields, instead of fl=*?
>What if you only search 10 shards instead of 30?
The results are similar to the table above. By the way, I need to receive all fields from the shards.
Another problem: I use SolrMeter or a simple bash script to check the
search speed. I get QTime from 16K to 24K for the first ~20 queries, then
from 50K to 100K for the next ~20 queries, and it continues like this
until the servlet goes down.
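
A timing check of this kind could be sketched in Python roughly as follows. This is not the actual script from this thread: the host names, the shard list and all function names are placeholders, and it assumes the aggregator answers on the standard /select handler with wt=json, where QTime is reported in the responseHeader.

```python
import json
import urllib.parse
import urllib.request

# Placeholder values: replace with the real aggregator and shard addresses.
AGGREGATOR = "http://localhost:8983/solr"
SHARDS = ["shard1:8983/solr", "shard2:8983/solr"]


def build_url(base, query, rows, shards):
    """Build a distributed /select URL; `shards` is a list of host:port/path strings."""
    params = {
        "q": query,
        "rows": rows,
        "fl": "*",
        "wt": "json",
        "shards": ",".join(shards),
    }
    return base + "/select?" + urllib.parse.urlencode(params)


def extract_qtime(body):
    """Return QTime (milliseconds) from a Solr JSON response body."""
    return json.loads(body)["responseHeader"]["QTime"]


def summarize(qtimes):
    """Return (min, avg, max) of a non-empty list of QTime values."""
    return min(qtimes), sum(qtimes) / len(qtimes), max(qtimes)


def run(queries, rows):
    """Send each query once to the aggregator and summarize the QTime values."""
    qtimes = []
    for q in queries:
        url = build_url(AGGREGATOR, q, rows, SHARDS)
        with urllib.request.urlopen(url, timeout=300) as resp:
            qtimes.append(extract_qtime(resp.read().decode("utf-8")))
    return summarize(qtimes)
```

Calling run(["some query", "another query"], rows=1000) would return (min, avg, max) for that batch. Note that QTime only covers the time Solr spent processing the request, so the wall-clock time seen by the client can be higher once response writing and network transfer are included.
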


On Wed, Nov 23, 2011 at 5:55 PM, Robert Stewart <bstewart...@gmail.com> wrote:
> If you request 1000 docs from each shard, then aggregator is really
> fetching 30,000 total documents, which then it must merge (re-sort
> results, and take top 1000 to return to client).  It's possible that
> SOLR merging implementation needs to be optimized, but it does not seem like
> it could be that slow.  How big are the documents you return (how many
> fields, avg KB per doc, etc.)?  I would take a look at network to make
> sure that is not some bottleneck, and also to make sure there is not
> some underlying issue making 30 concurrent HTTP requests from the
> aggregator.  I am not an expert in Java, but under .NET there is a
> setting that limits concurrent out-going HTTP requests from a process
> that must be over-ridden via configuration, otherwise by default is
> very limiting.
>
> Does performance get much better if you only request top 100, or top
> 10 documents instead of top 1000?
>
> What if you only request a couple fields, instead of fl=*?
>
> What if you only search 10 shards instead of 30?
>
> I would collect those numbers and try to determine if time increases
> linearly or not as you increase shards and/or # of docs.
>
>
>
>
>
> On Wed, Nov 23, 2011 at 9:55 AM, Artem Lokotosh <arco...@gmail.com> wrote:
>>> If the response time from each shard shows decent figures, then aggregator
>>> seems to be a bottleneck. Do you btw have a lot of concurrent users?
>> For now this is not a problem, but we expect from 1K to 10K concurrent
>> users, and maybe more.
>> On Wed, Nov 23, 2011 at 4:43 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
>>> If the response time from each shard shows decent figures, then aggregator
>>> seems to be a bottleneck. Do you btw have a lot of concurrent users?
>>>
>>> On Wed, Nov 23, 2011 at 4:38 PM, Artem Lokotosh <arco...@gmail.com> wrote:
>>>
>>>> > Is this log from the frontend SOLR (aggregator) or from a shard?
>>>> from aggregator
>>>>
>>>> > Can you merge, e.g. 3 shards together or is it much effort for your team?
>>>> Yes, we can merge. We'll try to do this and review how it works.
>>>> Thanks, Dmitry
>>>>
>>>> Any other ideas?
>>>>
>>
>> --
>> Best regards,
>> Artem Lokotosh        mailto:arco...@gmail.com
>>
>



-- 
Best regards,
Artem Lokotosh        mailto:arco...@gmail.com
