Good questions.
Otis Gospodnetic wrote:
> Perhaps the container logs explain what happened?
1) I can't find anything interesting in the container logs. To the
best of my knowledge, neither of the containers notices the drop. Jetty
did show "out of threads" type errors before I tweaked the thread
parameters. Once it was tuned a bit, I stopped seeing these entries in
the log, but did not stop getting the errors.
> How about just throttling to the point where the failure rate is 0%? Too slow?
2) Throttling to 0 errors really slows things down. The last time I ran
stats, performance scaled almost linearly with additional threads until
we reached the approximate number of CPUs in the system. Anything above
two threads shows progressively more errors if I don't apply any
throttling. The churn I need to keep up with makes that undesirable.
I'll put together some stats on insert rates, number of threads, and
error rates and post them here. It's a classic trade-off: tolerating
poor results that require additional processing in exchange for higher
performance. A set of heuristics for this situation might be useful,
since I'm likely not the only one with an indexing bottleneck.
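
One possible heuristic, sketched here purely as an illustration: adjust the
number of posting threads after each batch based on the observed error rate,
adding one thread when a batch is clean and halving the count when it is not.
This borrows the additive-increase/multiplicative-decrease idea from TCP
congestion control; it is not something measured in this thread. The function
name, its parameters, and the default target rate are all assumptions.

```python
def next_thread_count(current, errors, total, max_threads, target_error_rate=0.0):
    """Return the thread count to use for the next batch of inserts.

    Hypothetical helper: 'errors' is the number of inserts in the last batch
    that got no server response code, 'total' is the batch size.
    """
    if total == 0:
        # Nothing was posted, so there is no evidence to act on.
        return current
    error_rate = errors / total
    if error_rate > target_error_rate:
        # Multiplicative decrease: halve concurrency, but keep at least one thread.
        return max(1, current // 2)
    # Additive increase: probe for more throughput, capped at max_threads.
    return min(max_threads, current + 1)
```

With max_threads set near the number of CPUs, this would settle around the
highest thread count that stays at or under the tolerated error rate, at the
cost of periodically dipping below it.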
-Jim
Otis Gospodnetic wrote:
Perhaps the container logs explain what happened?
How about just throttling to the point where the failure rate is 0%? Too slow?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message ----
From: Paleo Tek <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Friday, September 12, 2008 11:19:52 AM
Subject: No server response code on insert: how do I avoid this at high speed?
I have a largish index with a lot of churn, and inserts that come in
large bursts. My server is a multiprocessor with plenty of memory, so I
can multi-thread and stuff in about 1.6 million records per hour, going
full speed. I use a dozen or so threads to post curl inserts, and
monitor the responses.
Using Jetty, there is a ~10% failure rate with no server response code
received. Switching to Tomcat reduces the error rate to around 2%
(which makes me like Tomcat a lot, even though I'm a dog person...). I
suspect I'm overrunning the capacity of the servlet container. Tweaking
parameters in Jetty improved performance, and I can tune Tomcat. But
then I'll just be overrunning a tuned system, at a slightly faster rate.
My workaround is to keep track of which inserts fail, but I suspect
there's a better approach. Any suggestions how I can balance maximum
insert speed with a low error rate? Thanks!
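
The workaround of tracking failed inserts could look roughly like the sketch
below. It assumes a hypothetical post_fn(doc) callable that returns True when
the server answered with a success code and False (or raises) otherwise; the
actual curl invocation is not shown. It also assumes that re-posting a document
is harmless, for example because the schema defines a unique key so a repeated
add overwrites the earlier copy.

```python
from concurrent.futures import ThreadPoolExecutor


def _safe_post(post_fn, doc):
    """Treat any exception from post_fn as a failed insert."""
    try:
        return bool(post_fn(doc))
    except Exception:
        return False


def post_with_retries(docs, post_fn, threads=4, max_rounds=3):
    """Post every doc, re-posting failures for up to max_rounds rounds.

    post_fn is a hypothetical callable supplied by the caller.
    Returns the list of docs that still failed after the last round.
    """
    pending = list(docs)
    for _ in range(max_rounds):
        if not pending:
            break
        with ThreadPoolExecutor(max_workers=threads) as pool:
            # pool.map preserves input order, so results line up with pending.
            results = list(pool.map(lambda d: _safe_post(post_fn, d), pending))
        pending = [doc for doc, ok in zip(pending, results) if not ok]
    return pending
```

Anything left in the returned list after the final round would still need
separate handling, such as logging it for a later slow-path pass.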
-Jim