Good questions.
Otis Gospodnetic wrote:

>Perhaps the container logs explain what happened

1) I can't find anything interesting in the container logs. To the best of my knowledge, neither of the containers notices the drop. Jetty did show "out of threads" type errors before I tweaked the thread parameters. Once it was tuned a bit, I stopped seeing these entries in the log, but did not stop getting the errors.

>How about just throttling to the point where the failure rate is 0%?  Too slow?


2) Throttling to 0 errors really slows things down. The last time I ran stats, performance scaled almost linearly with additional threads until we reached the approximate number of CPUs in the system. Anything above two threads shows progressively more errors if I don't apply any throttling. The churn I need to keep up with makes that undesirable.

I'll put together some stats on insert rates, number of threads, and error rates and post them here. It's a classic trade-off: tolerating poor results that require additional processing in exchange for higher performance. A set of heuristics for this situation might be useful, since I'm likely not the only one with an indexing bottleneck.
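
As a starting point for such heuristics, here is a minimal sketch of one possible rule: add a thread while the measured error rate stays under a target, and halve the thread count when it goes over (additive increase, multiplicative decrease). The function name, the 1% target, and the 16-thread cap are assumptions for illustration, not measured values.

```python
def next_thread_count(current, error_rate, target=0.01, max_threads=16):
    """Pick the thread count for the next batch of inserts.

    current     -- number of posting threads used for the last batch
    error_rate  -- fraction of inserts in that batch with no response code
    target      -- highest error rate we are willing to tolerate (assumed)
    max_threads -- hard upper bound on posting threads (assumed)
    """
    if error_rate > target:
        # Too many dropped inserts: back off sharply, but keep one thread.
        return max(1, current // 2)
    # Error rate is acceptable: probe for more throughput, one thread at a time.
    return min(max_threads, current + 1)
```

Calling this once per batch with the observed error rate should settle near the highest thread count the container tolerates, at the cost of occasional batches that overshoot.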

                     -Jim

Otis Gospodnetic wrote:
Perhaps the container logs explain what happened?
How about just throttling to the point where the failure rate is 0%?  Too slow?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
From: Paleo Tek <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Friday, September 12, 2008 11:19:52 AM
Subject: No server response code on insert:  how do I avoid this at high speed?

I have a largish index with a lot of churn, and inserts that come in large bursts. My server is a multiprocessor with plenty of memory, so I can multi-thread and stuff in about 1.6 million records per hour, going full speed. I use a dozen or so threads to post curl inserts, and monitor the responses.

Using Jetty, there is a ~10% failure rate with no server response code received. Switching to Tomcat reduces the error rate to around 2% (which makes me like Tomcat a lot, even though I'm a dog person...). I suspect I'm overrunning the capacity of the servlet container. Tweaking parameters in Jetty improved performance, and I can tune Tomcat. But then I'll just be overrunning a tuned system, at a slightly faster rate.

My workaround is to keep track of which inserts fail, but I suspect there's a better approach. Any suggestions on how I can balance maximum insert speed with a low error rate? Thanks!
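
For reference, the workaround amounts to roughly the following. This is only a sketch in Python rather than the actual curl scripts: `post` stands in for whatever sends one record and returns True when a response code came back, and the worker count, attempt limit, and backoff are made-up defaults.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def insert_with_retry(docs, post, workers=4, max_attempts=3, backoff=0.5):
    """Post every doc, re-posting the ones that got no response.

    post(doc) is a caller-supplied (hypothetical) function that returns
    True on success and False when no server response code was received.
    Returns the docs that still failed after max_attempts rounds.
    """
    pending = list(docs)
    for attempt in range(max_attempts):
        if not pending:
            break
        if attempt:
            # Pause a little longer before each retry round so the
            # servlet container has a chance to drain its queue.
            time.sleep(backoff * attempt)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(post, pending))
        # Keep only the docs whose post reported failure.
        pending = [doc for doc, ok in zip(pending, results) if not ok]
    return pending
```

Anything left in the returned list would need to be logged or queued for a later pass.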

          -Jim


