From a high-level view, there is a certain amount of garbage collection
that must occur. That garbage is generated per request, through a
variety of means (buffers, request and response objects, cache
evictions). The only thing that JVM parameters can address is *when*
that collection occurs.

It can occur often in small chunks, or rarely in large chunks (or
anywhere in between). If you are CPU bound (which it sounds like you may
be), then you really have a decision to make. Do you want an overall
drop in performance, as more time is spent garbage collecting, OR do you
want spikes in garbage collection that are rarer but have a stronger
impact? Realistically it becomes a question of one or the other. You
*must* pay the cost of garbage collection at some point in time.
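
To see which side of that trade-off a given JVM setting lands on, it can
help to measure how much time actually goes to collection. Below is a
minimal sketch using the standard java.lang.management beans. The class
name GcOverhead and its helper methods are made up for illustration, and
the reported time is only the approximate accumulated figure the JVM
exposes, so treat the result as a rough indicator rather than an exact
pause measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of Solr): reports cumulative GC activity
// for the JVM it runs in.
public class GcOverhead {

    // Sum of approximate accumulated collection time, in milliseconds,
    // across all collectors. A collector may report -1 (undefined); skip it.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Total number of collections across all collectors (-1 values skipped).
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    // Fraction of elapsed time attributed to GC; 0 if uptime is not positive.
    static double gcFraction(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcMillis = totalGcMillis();
        System.out.println("collections:  " + totalGcCount());
        System.out.println("gc time (ms): " + gcMillis);
        System.out.println("gc fraction:  " + gcFraction(gcMillis, uptime));
    }
}
```

Many small collections show up as a high count with a steady fraction;
rare large ones show up as a low count where the time arrives in bursts.
Sampling these numbers periodically under load, before and after a
parameter change, gives a rough picture of which cost you are paying.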

It is possible that increasing cache size will decrease overall garbage
collection, as the churn caused by cache misses creates additional
garbage. Decreasing the churn could decrease garbage. BUT,
this really depends on your cache hit rates. If they are pretty high
(>90%) then it's probably not much of a factor. However, if you are in
the 50%-60% range, larger caches may help you in a number of ways.
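
As a rough illustration of why the hit rate matters, the arithmetic
above can be sketched as follows. This assumes, purely for illustration,
that every cache miss creates about the same amount of garbage; the
class and method names are hypothetical and the lookup count is
arbitrary.

```java
// Hypothetical back-of-the-envelope model: counts how many lookups miss
// the cache, as a stand-in for miss-driven garbage.
public class CacheChurn {

    // Expected number of misses for a given number of lookups and hit rate
    // (hitRate is a fraction between 0.0 and 1.0).
    static long misses(long lookups, double hitRate) {
        return Math.round(lookups * (1.0 - hitRate));
    }

    public static void main(String[] args) {
        long lookups = 1000;
        System.out.println("misses at 90% hit rate: " + misses(lookups, 0.90));
        System.out.println("misses at 55% hit rate: " + misses(lookups, 0.55));
    }
}
```

Under this assumption, a cache at a 55% hit rate sees several times as
many misses per 1000 lookups as one at 90%, so there is far more
miss-driven garbage for a larger cache to remove.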

-Todd Feak

-----Original Message-----
From: wojtekpia [mailto:wojte...@hotmail.com] 
Sent: Wednesday, January 21, 2009 11:14 AM
To: solr-user@lucene.apache.org
Subject: Re: Performance "dead-zone" due to garbage collection


I'm using a recent version of Sun's JVM (6 update 7) and am using the
concurrent generational collector. I've tried several other collectors,
none seemed to help the situation.

I've tried reducing my heap allocation. The search performance got worse
as I reduced the heap. I didn't monitor the garbage collector in those
tests, but I imagine that it would've gotten better. (As a side note, I
do lots of faceting and sorting, I have 10M records in this index, with
an approximate index file size of 10GB).

This index is on a single machine, in a single Solr core. Would
splitting it across multiple Solr cores on a single machine help? I'd
like to find the limit of this machine before spreading the data to more
machines.

Thanks,

Wojtek
-- 
View this message in context:
http://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21590150.html
Sent from the Solr - User mailing list archive at Nabble.com.

