To add to what Shawn said, this filterCache is enormous. The key statistics are the hit ratio and evictions. Evictions aren't bad if the hit ratio is high; only if the hit ratio is low and evictions are high should you consider making the cache larger. So I'd drop it back to 512.
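Something along these lines, say (just a sketch to start from; I've kept your FastLRUCache class and autowarmCount=0, and the initialSize of 512 is only an arbitrary starting point to tune):

    <!-- much smaller cache; watch the hit ratio and evictions after the change -->
    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="0"/>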
Hit ratios around 75% are my personal "too low" number, but YMMV... BUT, it's an LRU cache. So assuming you're forming a filter query for the two "sides" and that you append an fq clause to every query, you'll only need two entries <G>. Plus, of course, other fqs.

The first thing I'd do is only return 10 rows, turn off highlighting, and anything else that comes to mind. Then add them back and see which ones are causing you grief. Or add &debug=timing. That'll return a list of how much time each component takes and may give you a clue as well (there's a sketch of such a query below Shawn's quoted message).

Best,
Erick

On Fri, Aug 9, 2013 at 1:55 PM, Shawn Heisey <s...@elyograg.org> wrote:

> On 8/9/2013 9:36 AM, Neal Ensor wrote:
>
>> I have an 8 million document solr index, roughly divided down the middle
>> by an identifying "product" value, one of two distinct values. The
>> documents in both "sides" are very similar, with stored text fields,
>> etc. I have two nearly identical request handlers, one for each "side".
>>
>> When I perform very similar queries on either "side" for random phrases,
>> requesting 500 rows with highlighting on titles and summaries, I get
>> very different results. One "side" consistently returns results in
>> around 1-2 seconds, whereas the other one consistently returns in 6-10
>> seconds. I don't see any reason why it's worse; each run of queries is
>> deliberately randomized to avoid caches getting in the way. Each test
>> query returns the full first 500 in most cases.
>>
>> My filter query cache configuration looks like:
>>
>>   <filterCache class="solr.FastLRUCache"
>>                size="750000"
>>                initialSize="10000"
>>                autowarmCount="0"/>
>
> This filterCache is *enormous* ... even the initialSize is larger than I
> would normally expect to see for the total size. With 8 million
> documents, each entry in the cache can be 1 megabyte, and in practice,
> the entry will be either very small or it will be the full 1 megabyte ...
> depending on how many documents get matched by a filter. This has the
> potential to chew up a lot of RAM without really doing much for you.
>
> If the same problem happens when you drastically reduce the size of
> filterCache, I suspect basic performance problems. Even 1-2 seconds seems
> very slow to me.
>
> The first questions I have are some statistics about your index and the
> server you're running it on. How big is that index in terms of disk
> space? How much RAM are you allocating to the JVM? How much RAM is in
> the entire machine? Is the machine running software other than Solr,
> such as a web server, database server, etc? What operating system are
> you running on, is it 64 bit, and is Java 64 bit?
>
> Next, I'd like to know more about your queries. Can you include typical
> examples of all query parameters for both "sides"? What does the indexed
> and stored data look like for a typical document? Depending on what I
> learn here, I might need to see all or part of your config and schema.
>
> How often do you send updates/deletes to your index? How often and
> exactly how are you doing commits, and do you have any auto commit in
> your config?
>
> Thanks,
> Shawn
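As promised above, here's a rough sketch of the kind of stripped-down timing query I mean (the host, core name, and "product" field name are placeholders for whatever your setup actually uses):

    http://localhost:8983/solr/yourcore/select?q=some+random+phrase&fq=product:sideA&rows=10&hl=false&debug=timing

Run the same thing against both handlers and compare the "timing" section of the two responses; it breaks down how long each search component (query, facet, highlight, debug) spends in its prepare and process phases, which should point at whatever is eating the time.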