Re: How to measure search performance

Erick Erickson Thu, 23 Jul 2020 10:53:17 -0700

This isn’t usually a cause for concern. Clearing Solr’s caches doesn’t clear 
the OS caches, for instance. I think you’re already aware that Lucene uses 
MMapDirectory, meaning the index pages are mapped into OS memory space. Whether 
those pages are actually _in_ physical memory or not is anyone’s guess, so 
depending on when they’re needed they might have to be read from disk. This 
is entirely independent of Solr’s caches, and could come into play even if you 
restarted Solr.


Then there are your function queries for the pseudo fields. Their values are 
read from the docValues sections of the index. Once again, the relevant parts 
of the index may or may not be in OS memory.

So comparing individual queries is “fraught” with uncertainties. I suppose you 
could reboot the machines each time ;) I’ve only ever had luck averaging a 
bunch of unique queries when trying to measure perf differences.
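
As a rough illustration of the averaging approach, here is a minimal Python 
sketch. It assumes a /select endpoint reachable over HTTP and reads QTime from 
the responseHeader of the JSON response. The URL, the query list and the 
function names are all hypothetical, and the percentile is a simple 
nearest-rank calculation, not anything Solr provides.

```python
import json
import math
import statistics
import urllib.parse
import urllib.request


def fetch_qtime(select_url, params):
    """Run one query and return QTime (ms) from Solr's responseHeader."""
    query = urllib.parse.urlencode({**params, "wt": "json"})
    with urllib.request.urlopen(f"{select_url}?{query}") as resp:
        body = json.load(resp)
    return body["responseHeader"]["QTime"]


def summarize(qtimes):
    """Summarize QTime samples; the median is less sensitive to one-off outliers."""
    ordered = sorted(qtimes)
    # Nearest-rank 95th percentile.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "n": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }


def compare(select_url, queries, extra_params):
    """Run each unique query with and without extra_params; summarize both runs."""
    baseline = [fetch_qtime(select_url, {"q": q}) for q in queries]
    variant = [fetch_qtime(select_url, {"q": q, **extra_params}) for q in queries]
    return summarize(baseline), summarize(variant)
```

Something like compare("http://localhost:8983/solr/mycollection/select", 
queries, {"fl": "..."}) would then give two summaries to set side by side. A 
single 10X outlier moves the mean a lot but barely moves the median, which is 
why looking at both helps.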

Do note that function queries for pseudo fields are not something I’d expect 
to add much overhead at all. The reason is that they’re only called for the 
top N docs that you’re returning; they are not part of the search at all. 
Compare that with a function query involved in scoring, which must be called 
for every document that matches. A function query for a pseudo field is only 
called for the docs returned in the packet, i.e. the “rows” parameter.
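
To make the distinction concrete, here is a small Python sketch that builds 
the request parameters for the two cases. The field names (title, popularity), 
the title_hit alias and the title_q parameter are made up for illustration, 
and the user query is interpolated without escaping, so treat the exact 
function expressions as an assumption to adapt to your own schema.

```python
from urllib.parse import urlencode


def pseudo_field_params(user_query, rows=10):
    """Function query used as a pseudo field in fl.

    It is evaluated only for the documents actually returned (at most `rows`).
    """
    return {
        "q": user_query,
        "rows": rows,
        # Hypothetical alias reporting whether the title field matched.
        "fl": "id,score,title_hit:exists(query($title_q))",
        "title_q": f"title:({user_query})",
    }


def scoring_function_params(user_query, rows=10):
    """Function query used in scoring via the edismax bf parameter.

    It contributes to the score, so it is evaluated for every matching document.
    """
    return {
        "q": user_query,
        "rows": rows,
        "defType": "edismax",
        # Hypothetical numeric field used as an additive boost.
        "bf": "log(sum(popularity,1))",
        "fl": "id,score",
    }


if __name__ == "__main__":
    print(urlencode(pseudo_field_params("widget")))
    print(urlencode(scoring_function_params("widget")))
```

With rows=10 the first form evaluates the function for at most ten documents 
per shard response, no matter how many documents match.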

Best,
Erick

> On Jul 23, 2020, at 11:49 AM, Webster Homer 
> <webster.ho...@milliporesigma.com> wrote:
> 
> I'm trying to determine the overhead of adding some pseudo fields to one of 
> our standard searches. The pseudo fields are simply function queries to 
> report if certain fields matched the query or not. I had thought that I could 
> run the search without the change and then re-run the searches with the 
> fields added.
> I had assumed that the QTime in the query response would be a good metric to 
> use when comparing the performance of the two search queries. However, I see 
> that the QTime for a query can vary by more than 10%. When testing, I cleared 
> the query cache between tests. Usually the QTimes would be within a few 
> milliseconds of each other; however, in some cases there was a 10X or more 
> difference between them.
> Even cached queries vary in their QTime, though much less.
> 
> I am running Solr 7.7.2 in a SolrCloud configuration with 2 shards and 2 
> replicas/shard. Our nodes have 32GB memory and 16GB of heap allocated to Solr.
> 
> I am concerned that these discrepancies indicate that our system is not tuned 
> well enough.
> Should I expect that a query's QTime really is a measure of the query's 
> inherent performance? Is there a better way to measure query performance?
