Are you using 3rd-party plugins?
> We have two slaves replicating off one master every 2 minutes.
>
> Both are using the CMS + ParNew garbage collector. Specifically
>
> -server -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
> -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing
>
> but periodically they both get into a GC storm and just keel over.
>
> Looking through the GC logs, the amount of memory reclaimed in each GC
> run gets less and less until we get a concurrent mode failure, and then
> Solr effectively dies.
>
> Is it possible there's a memory leak? I note that later versions of
> Lucene have fixed a few leaks. Our current versions are relatively old:
>
> Solr Implementation Version: 1.4.1 955763M - mark - 2010-06-17 18:06:42
>
> Lucene Implementation Version: 2.9.3 951790 - 2010-06-06 01:30:55
>
> so I'm wondering if upgrading to a later version of Lucene might help (of
> course it might not, but I'm trying to investigate all options at this
> point). If so, what's the best way to go about this? Can I just grab the
> Lucene jars and drop them somewhere (or unpack and then repack the Solr
> war file)? Or should I use a nightly Solr 1.4?
>
> Or am I barking up completely the wrong tree? I'm trawling through heap
> logs and GC logs at the moment, trying to see what other tuning I can
> do, but any other hints, tips, tricks or cluebats gratefully received.
> Even if it's just "Yeah, we had that problem and we added more slaves
> and periodically restarted them"
>
> thanks,
>
> Simon
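
The symptom described above (each collection reclaiming less until a concurrent mode failure) can also be watched from inside the JVM, not only in the GC logs. What follows is a minimal sketch, not a diagnosis: it uses the standard `java.lang.management` API to print how much of each heap pool was still occupied after its most recent collection. The class name `HeapAfterGc` and the helper `percentUsed` are made up for illustration. Run as-is, it reports on its own JVM only; to observe Solr, the same calls would have to run inside the Solr JVM or be made over a JMX connection, which is assumed here and not shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class HeapAfterGc {
    // Hypothetical helper: percentage of a pool still occupied.
    // Returns -1 when the JVM reports no defined maximum (max <= 0).
    public static double percentUsed(long used, long max) {
        return max > 0 ? 100.0 * used / max : -1.0;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            // Usage measured right after the last collection of this pool;
            // null if the pool does not support this statistic.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc == null) {
                continue;
            }
            System.out.printf("%s: %d bytes used after last GC (%.1f%% of max)%n",
                    pool.getName(), afterGc.getUsed(),
                    percentUsed(afterGc.getUsed(), afterGc.getMax()));
        }
    }
}
```

If the old-generation figure keeps climbing across successive samples even after full collections, that would be consistent with a leak or an undersized heap, though on its own it does not tell the two apart.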