One quick question: are you seeing any evictions from your
filterCache? If so, it isn't set large enough to handle the faceting
you're doing.
Erik
On Nov 4, 2008, at 8:01 PM, wojtekpia wrote:
I've been running load tests over the past week or two, and I can't
figure out which bottleneck in my system prevents me from increasing
throughput. First I'll describe my Solr setup, then what I've tried in
order to optimize the system.
I have 10 million records and 59 fields (all are indexed, 37 are
stored, 17 have termVectors, 33 are multi-valued), which take up about
15GB of disk space. Most field values are very short (a single word or
number), and usually only about half the fields have any data at all.
I'm running on an 8-core, 64-bit, 32GB RAM Red Hat box. I allocate
about 24GB of memory to the Java process, and my filterCache size is
700,000. I'm using a version of Solr between 1.3 and the current trunk
(including the latest SOLR-667 (FastLRUCache) patch), and Tomcat 6.0.
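
As a rough aid for reasoning about the cache size mentioned above, here
is a minimal sketch of the worst-case memory a filter cache could need
if every entry were stored as a full bitset (one bit per document). The
function names are hypothetical and this is not Solr code; real usage
is normally lower, since small result sets can be stored more
compactly.

```python
def bitset_bytes(max_doc: int) -> int:
    """Bytes needed for one bitset-backed filter: one bit per document."""
    return (max_doc + 7) // 8


def worst_case_cache_bytes(max_doc: int, cache_entries: int) -> int:
    """Upper bound on cache memory if every entry were a full bitset."""
    return bitset_bytes(max_doc) * cache_entries


def entries_that_fit(max_doc: int, heap_bytes: int) -> int:
    """Number of full-bitset entries that fit in a given heap budget."""
    return heap_bytes // bitset_bytes(max_doc)


if __name__ == "__main__":
    max_doc = 10_000_000      # records, from the post
    entries = 700_000         # configured filterCache size, from the post
    heap = 24 * 1024 ** 3     # 24GB heap, from the post

    print(bitset_bytes(max_doc), "bytes per full-bitset entry")
    print(worst_case_cache_bytes(max_doc, entries), "bytes worst case")
    print(entries_that_fit(max_doc, heap), "full-bitset entries fit in heap")
```

Under these assumptions one full entry is about 1.25MB, so only a
fraction of the 700,000 configured entries could ever be full bitsets
within a 24GB heap.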
I'm running a ramp test, increasing the number of users every few
minutes. I measure the maximum number of requests per second that Solr
can handle at a fixed response time, and call that my throughput. I'd
like to see a single physical resource maxed out at some point during
my test so I know it is my bottleneck. I generated random queries for
my dataset representing a more or less realistic scenario. The queries
include faceting on up to 6 fields and querying on up to 8 fields.
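
The throughput definition above could be computed from ramp-test
results roughly as follows. This is only a sketch: the `RampStep`
record and the sample numbers are hypothetical and not taken from the
actual test.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RampStep:
    users: int             # concurrent simulated users in this step
    completed: int         # requests completed during the step
    duration_s: float      # length of the step in seconds
    avg_response_s: float  # mean response time observed in the step

    @property
    def requests_per_s(self) -> float:
        return self.completed / self.duration_s


def max_throughput(steps: List[RampStep],
                   response_limit_s: float) -> Optional[RampStep]:
    """Return the step with the highest request rate whose mean response
    time stays within the limit, or None if no step qualifies."""
    within_limit = [s for s in steps if s.avg_response_s <= response_limit_s]
    return max(within_limit, key=lambda s: s.requests_per_s, default=None)


if __name__ == "__main__":
    # Hypothetical ramp: 2, 8, then 16 users, one minute per step.
    steps = [
        RampStep(users=2, completed=600, duration_s=60.0, avg_response_s=0.20),
        RampStep(users=8, completed=1800, duration_s=60.0, avg_response_s=0.25),
        RampStep(users=16, completed=2100, duration_s=60.0, avg_response_s=0.90),
    ]
    best = max_throughput(steps, response_limit_s=0.5)
    if best is not None:
        print(best.users, "users:", best.requests_per_s, "requests/s")
```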
I ran a baseline on the un-optimized setup and saw peak CPU usage of
about 50%, IO usage around 5%, and negligible network traffic.
Interestingly, the CPU peaked when I had 8 concurrent users, and
actually dropped to about 40% when I increased the users beyond 8. Is
that because I have 8 cores?
I changed a few settings and observed the effect on throughput:
1. Increased the filterCache size: throughput increased by about 50%,
but it seems to peak.
2. Put the entire index on a RAM disk: this significantly reduced the
average response time, but my throughput didn't change (i.e. even
though my response time was 10X faster, the maximum number of requests
I could make per second didn't increase). This makes no sense to me,
unless there is another bottleneck somewhere.
3. Reduced the number of records in my index: throughput increased,
but the shape of all my graphs stayed the same, and my CPU usage was
identical.
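
One way to sanity-check the RAM disk result in item 2 is the
closed-loop relation between users, response time, and throughput
(throughput = users / (response time + think time)). The sketch below
assumes a closed-loop load generator; whether the test inserts any
think time between requests is not stated in the post, so the numbers
are purely illustrative.

```python
def closed_loop_throughput(users: int, response_s: float,
                           think_s: float = 0.0) -> float:
    """Requests per second that `users` simulated clients can generate
    when each waits for a response and then pauses for `think_s`."""
    cycle_s = response_s + think_s
    if users <= 0 or cycle_s <= 0:
        raise ValueError("users and total cycle time must be positive")
    return users / cycle_s


if __name__ == "__main__":
    # Hypothetical: 8 users, response time cut from 0.5s to 0.05s.
    for think_s in (0.0, 1.0):
        slow = closed_loop_throughput(8, 0.5, think_s)
        fast = closed_loop_throughput(8, 0.05, think_s)
        print(f"think={think_s}s: {slow:.1f} -> {fast:.1f} requests/s")
```

With no think time, a 10X faster response should allow roughly 10X the
request rate from the same users. If the measured rate stays flat, the
limit may lie outside Solr's response time, for example in the load
generator itself or in a cap on concurrent requests.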
I have a few questions:
1. Can I get more than 50% CPU utilization?
2. Why does CPU utilization fall when I make more than 8 concurrent
requests?
3. Is there an obvious bottleneck that I'm missing?
4. Does Tomcat have any settings that affect Solr performance?
Any input is greatly appreciated.
--
View this message in context:
http://www.nabble.com/Throughput-Optimization-tp20335132p20335132.html
Sent from the Solr - User mailing list archive at Nabble.com.