Hello Erik,

thanks for the reply. Indeed the CPUs are kind of idling during the load test. They are not below 20%, but they clearly don't get far beyond 40%.

Changing the number of threads in JMeter has only a minor effect on the qps, but it increases the average latency as soon as the threads outnumber the CPUs --- expected behavior, I would say.
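
In a closed-loop test the extra threads mostly just queue up once the server's throughput is capped. A rough back-of-the-envelope check in Python (the ~4 qps ceiling and the thread counts are illustrative numbers, not my exact settings):

# Little's law for a closed-loop test: concurrency = throughput * latency,
# so at a fixed qps ceiling the average latency grows linearly with threads.
qps_ceiling = 4.0                  # assumed flat throughput cap
for threads in (4, 8, 16, 32):     # hypothetical JMeter thread counts
    print("%2d threads -> ~%.1f s average latency" % (threads, threads / qps_ceiling))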

I varied the number of results returned between 10 and 20 with no noticeable change in performance.

I restricted the field list to fl=id and even this increased the throughput only minimally (in the meantime the index has grown to 16 million documents; the increase was from 2.x qps to 3). JMeter reported a reduction in average transferred size from 10 kBytes to 2.5 kBytes. This is not really the issue here, and in the end we need more than the IDs in production anyway.
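
For the record, the fl=id runs boil down to requests like the one below (host, collection name and query field are placeholders from my test setup):

import json
import urllib.parse
import urllib.request

# Placeholder host/collection; "content" stands in for the real search field.
base = "http://localhost:8983/solr/collection1/select"
params = {"q": "content:report", "fl": "id", "rows": 10, "wt": "json"}

with urllib.request.urlopen(base + "?" + urllib.parse.urlencode(params)) as resp:
    data = json.loads(resp.read().decode("utf-8"))

print(data["response"]["numFound"], "hits,",
      len(data["response"]["docs"]), "docs returned")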

What really bugs me currently is that htop reports an IORR (supposedly the rate of read(2) calls) of between 100 and 200 MByte/s during the load test.

This somehow runs contrary to my understanding of why Solr uses mmapped files. There should be hardly any read(2) calls, and certainly not 200 MB/s :-/ And the rate did not drop when I restricted to fl=id.
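
To cross-check htop's number I want to watch /proc/<pid>/io directly; a minimal sketch of what I have in mind (pid hard-coded to the Solr JVM from the top output in my first mail):

import time

SOLR_PID = 961  # Solr JVM pid

def read_io(pid):
    # rchar counts bytes requested through read(2)-style syscalls;
    # read_bytes counts what was actually fetched from the storage layer
    # (which also includes page-ins of mmapped files).
    with open("/proc/%d/io" % pid) as f:
        return dict(line.strip().split(": ") for line in f)

before = read_io(SOLR_PID)
time.sleep(10)
after = read_io(SOLR_PID)
for key in ("rchar", "read_bytes"):
    rate = (int(after[key]) - int(before[key])) / 10.0 / 1e6
    print("%s: ~%.1f MB/s" % (key, rate))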

I will try to check this with strace to see where it is reading from.
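
To map the file descriptors in the strace output to actual files I will probably just resolve /proc/<pid>/fd, roughly like this (pid hard-coded again, and the "/index/" filter is only a guess at the interesting paths):

import os

SOLR_PID = 961  # Solr JVM pid

# Resolve every open file descriptor so that the fd numbers in strace's
# read(fd, ...) lines become index file names.
fd_dir = "/proc/%d/fd" % SOLR_PID
for fd in sorted(os.listdir(fd_dir), key=int):
    try:
        target = os.readlink(os.path.join(fd_dir, fd))
    except OSError:
        continue
    if "/index/" in target:
        print(fd, "->", target)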

Hints appreciated. With a bit of luck, I'll get more RAM and can then compare.

Thanks,
Harald.


On 12.07.2014 17:58, Erick Erickson wrote:
If the stats you're reporting are during the load test, your CPU is
kind of idling along at < 20%, which supports your theory.

Just to cover all bases, when you bump the number of threads JMeter is
firing, does it make any difference? And how many rows are you
returning? The latter is important because to return documents, Solr
needs to go out to disk, possibly generating your page faults
(guessing here).

One note about your index size.... it's largely useless to measure
index size on disk, if for no other reason than that the _stored_ data
doesn't really count towards memory requirements for search. The *.fdt
and *.fdx segment files contain the stored data, so subtract them out....
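
Something like the quick sketch below gives you the search-relevant size;
the index path is just a guess at your directory layout, adjust as needed:

import os

# Guess at the core's index directory; adjust to your layout.
INDEX_DIR = "/var/solr/collection1/data/index"

total = stored = 0
for name in os.listdir(INDEX_DIR):
    size = os.path.getsize(os.path.join(INDEX_DIR, name))
    total += size
    if name.endswith((".fdt", ".fdx")):
        stored += size  # stored-field data, not needed in RAM for searching

gb = 1024.0 ** 3
print("total %.1fG, stored fields %.1fG, search-relevant ~%.1fG"
      % (total / gb, stored / gb, (total - stored) / gb))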

Speaking of which, try just returning the id (&fl=id). That should
reduce the disk seeks due to assembling the docs.

But 4 qps for simple term queries seems very slow at first blush.

FWIW,
Erick

On Thu, Jul 10, 2014 at 7:30 AM, Harald Kirsch
<harald.kir...@raytion.com> wrote:
Hi everyone,

currently I am taking some performance measurements on a Solr installation
and I am trying to figure out if what I see mostly fits expectations:

The data is as follows:

- solr 4.8.1
- 8 million documents
- mostly office documents with real text content, stored
- index size on disk 90G
- full index memory mapped into virtual memory:
- this is on a VMware server, 4 cores, 16 GB RAM

PID PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  nFLT
961 20   0 93.9g  10g 6.0g S     19 64.5 718:39.81 757k

When I start running a JMeter query test, sending requests as fast as
possible with a few threads, it peaks at about 4 qps with a replay of
real-world queries of mostly 1, 2, sometimes more terms.

What I see are around 150 to 200 major page faults per second, meaning that
Solr is not really happy with what happens to be in memory at any instant
in time.
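
(In case anyone wants to reproduce the measurement, the rate can be read
straight from /proc roughly like this --- the pid is the Solr JVM from the
top line above:)

import time

SOLR_PID = 961  # Solr JVM pid from the top line above

def majflt(pid):
    # In /proc/<pid>/stat, majflt is the 10th field after the "(comm)" entry.
    with open("/proc/%d/stat" % pid) as f:
        fields = f.read().rsplit(")", 1)[1].split()
    return int(fields[9])

before = majflt(SOLR_PID)
time.sleep(10)
print("~%.0f major faults/s" % ((majflt(SOLR_PID) - before) / 10.0))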

My hunch is that this hints at too small a RAM footprint: much more RAM is
needed to get the number of major page faults down.

Would anyone agree or disagree with this analysis? Or is someone out there
saying "200 major page faults/second are normal, there must be another
problem"?

Thanks,
Harald.

