One quick question.... are you seeing any evictions from your filterCache? If so, it isn't set large enough to handle the faceting you're doing.
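For a 1.3-era install, the eviction count is visible on the admin stats page (host and port here are just placeholders):

```
http://localhost:8983/solr/admin/stats.jsp
(look under the filterCache entry: lookups, hits, evictions, cumulative_evictions)
```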

        Erik


On Nov 4, 2008, at 8:01 PM, wojtekpia wrote:


I've been running load tests over the past week or two, and I can't figure out which bottleneck in my system prevents me from increasing throughput. First I'll describe my Solr setup, then what I've tried to optimize the system.

I have 10 million records and 59 fields (all are indexed, 37 are stored, 17 have termVectors, 33 are multi-valued), which take about 15GB of disk space. Most field values are very short (a single word or number), and usually only about half the fields have any data at all. I'm running on an 8-core, 64-bit, 32GB RAM Red Hat box. I allocate about 24GB of memory to the Java process, and my filterCache size is 700,000. I'm using a version of Solr between 1.3 and the current trunk (including the latest SOLR-667 (FastLRUCache) patch), and Tomcat 6.0.
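For reference, here's roughly what the cache entry in my solrconfig.xml looks like (the autowarmCount value is illustrative; the class name comes from the SOLR-667 patch, stock 1.3 would use solr.LRUCache):

```xml
<!-- solrconfig.xml: filterCache sized as described above.
     FastLRUCache is from the SOLR-667 patch; autowarmCount is illustrative. -->
<filterCache class="solr.FastLRUCache"
             size="700000"
             initialSize="700000"
             autowarmCount="4096"/>
```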

I'm running a ramp test, increasing the number of users every few minutes. I measure the maximum number of requests per second that Solr can handle at a fixed response time, and call that my throughput. I'd like to see a single physical resource maxed out at some point during my test so I know it is my bottleneck. I generated random queries for my dataset representing a more or less realistic scenario. The queries include faceting by up to 6 fields, and querying by up to 8 fields.
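To give a feel for it, a generated request looks roughly like this (the field names here are made up; the actual fields and values vary per request):

```
http://localhost:8080/solr/select?q=fieldA:foo+AND+fieldB:42
    &facet=true
    &facet.field=fieldC&facet.field=fieldD&facet.field=fieldE
    &rows=10
```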

I ran a baseline on the un-optimized setup, and saw peak CPU usage of about 50%, IO usage around 5%, and negligible network traffic. Interestingly, the CPU peaked when I had 8 concurrent users, and actually dropped down to about 40% when I increased the users beyond 8. Is that because I have 8 cores?

I changed a few settings and observed the effect on throughput:

1. Increased the filterCache size; throughput increased by about 50%, but it
seems to have peaked.
2. Put the entire index on a RAM disk, which significantly reduced the average response time, but my throughput didn't change (i.e. even though my response time was 10x faster, the maximum number of requests I could make per second didn't increase). This makes no sense to me unless there is another bottleneck somewhere.
3. Reduced the number of records in my index. The throughput increased, but the shape of all my graphs stayed the same, and my CPU usage was identical.

I have a few questions:
1. Can I get more than 50% CPU utilization?
2. Why does CPU utilization fall when I make more than 8 concurrent
requests?
3. Is there an obvious bottleneck that I'm missing?
4. Does Tomcat have any settings that affect Solr performance?
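On question 4, the only Tomcat knob I've looked at so far is the HTTP connector's thread pool in server.xml; a sketch with Tomcat 6 defaults-style values (illustrative, not recommendations):

```xml
<!-- server.xml: the HTTP connector's thread pool caps how many requests
     Tomcat services concurrently. Values below are illustrative. -->
<Connector port="8080" protocol="HTTP/1.1"
           maxThreads="200"
           acceptCount="100"
           connectionTimeout="20000"/>
```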

Any input is greatly appreciated.

--
View this message in context: 
http://www.nabble.com/Throughput-Optimization-tp20335132p20335132.html
Sent from the Solr - User mailing list archive at Nabble.com.
