On 10/27/2011 5:56 AM, Michael Sokolov wrote:
> From everything you've said, it certainly sounds like a low-level I/O problem in the client, not a server slowdown of any sort. Maybe Perl is using the same connection over and over (keep-alive) and Java is not. I really don't know. One thing I've heard is that StreamingUpdateSolrServer (I think that's what it's called) can give better throughput for large request batches. If you're not using that, you may be having problems w/closing and re-opening connections?
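For anyone curious about the throughput idea mentioned above: the gain from something like StreamingUpdateSolrServer comes from queueing documents and sending them in batches over a small pool of worker threads, rather than paying connection and request overhead per document. Below is a minimal, self-contained sketch of that producer/queue/worker pattern using only the JDK; the class name, queue size, batch size, and the `flush` stand-in for an HTTP POST are all illustrative assumptions, not SolrJ API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the batching/streaming idea: producers enqueue documents,
// a fixed pool of workers drains them in batches, so many documents share
// one (conceptual) request instead of one round trip per document.
public class StreamingSketch {
    static final int QUEUE_SIZE = 100;  // analogous to a bounded update queue
    static final int WORKERS = 4;       // analogous to a worker-thread count
    static final int BATCH = 10;        // documents sent per "request"
    static final AtomicInteger sent = new AtomicInteger();

    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(QUEUE_SIZE);
        ExecutorService pool = Executors.newFixedThreadPool(WORKERS);
        for (int w = 0; w < WORKERS; w++) {
            pool.submit(() -> {
                List<String> batch = new ArrayList<>();
                try {
                    while (true) {
                        // Poll with a timeout so workers drain and exit
                        // once the producer is done.
                        String doc = queue.poll(200, TimeUnit.MILLISECONDS);
                        if (doc == null) { flush(batch); return; }
                        batch.add(doc);
                        if (batch.size() >= BATCH) flush(batch);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Producer side: enqueue 1000 documents.
        for (int i = 0; i < 1000; i++) queue.put("doc-" + i);
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(sent.get());
    }

    // Stand-in for one HTTP POST carrying a whole batch of documents.
    static void flush(List<String> batch) {
        sent.addAndGet(batch.size());
        batch.clear();
    }
}
```

The point of the sketch is only the shape of the pattern: a bounded queue decouples the indexing code from the network, and batching amortizes per-request cost, which is the same reason connection reuse (keep-alive) matters.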

I turned off the Perl build system and had the Java program take over full build duties for both index chains. It was designed so that one copy of the program can keep any number of index chains up to date simultaneously.

On the most recent hourly run, the servers without virtualization took 50 seconds, while the servers with virtualization and more memory took only 16. So it looks like this problem has nothing to do with SolrJ; the 1000-clause queries actually do take a long time to execute. The 16-second runtime is still longer than the Perl program's last run (12 seconds), but I am also running an index rebuild in the build cores on those servers, so I'm not overly concerned by that.

At this point there isn't any way for me to know whether the speedup on the old server builds is due to the extra memory (OS disk cache) or to some quirk of virtualization. I'm really hoping it's the extra memory, because I really don't want to go back to a virtualized environment. I'll be able to figure it out once I eliminate my current bug and complete the migration.

Thank you very much to everyone who offered assistance. It helped me make sure my testing was as unbiased as I could achieve.

Shawn