I'm still struggling to find the root cause of this behavior. One thing is
consistent: whenever a node restarts and sends a recovery command, the
recipient shard/replica goes down due to a sudden surge in old gen heap
usage. Within minutes, it hits the ceiling and stalls the server, and the
cycle keeps repeating. After moving to 7.5, we decided to switch from CMS to
G1. We are using the recommended settings from Shawn's blog:

GC_TUNE="-XX:+UseG1GC \
-XX:+PerfDisableSharedMem \
-XX:+ParallelRefProcEnabled \
-XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=250 \
-XX:InitiatingHeapOccupancyPercent=75 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
-XX:OnOutOfMemoryError=/mnt/ebs2/solrhome/bin/oom_solr.sh"

Can these settings be tuned further to avoid this?

Also, I'm curious to know whether any 7.5 users have experienced a similar
scenario. Could there be some major recovery-related change that I'm missing
after upgrading from 6.6?
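
In case it helps anyone reproduce this, the old gen surge on the recipient
can be watched with the JDK's jstat tool. This is only a rough sketch: the
function names, the 5-second interval, and the pgrep pattern in the usage
comment are assumptions for illustration, not part of our setup.

```shell
# Sketch only: helper names and defaults here are hypothetical.

# Print GC utilization for a JVM at a fixed interval (milliseconds).
# jstat ships with the JDK; -gcutil reports each space as a percentage.
watch_old_gen() {
  pid="$1"
  interval_ms="${2:-5000}"
  jstat -gcutil "$pid" "$interval_ms"
}

# Extract the "O" (old gen utilization, %) column from jstat -gcutil output.
# It is the fourth column; the first line is the header, so skip it.
old_gen_pct() {
  awk 'NR > 1 { print $4 }'
}

# Usage (assumes a single Solr JVM started via start.jar):
# watch_old_gen "$(pgrep -f start.jar | head -n 1)" | old_gen_pct
```

Running this on the recipient while the restarted node asks for recovery
should show how quickly old gen fills up.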



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html