Hi,

We are using SolrCloud 6.1. We have around 20 collections on 4 nodes (2 shards, each with 2 replicas). We have allocated 40 GB RAM to each shard.

Intermittently we see long GC pauses (60 to 200 seconds), during which Solr stops responding and the collections go into recovery mode. Recovering all collections takes at least 5-10 minutes; sometimes it takes longer and we have to restart the Solr node. We are using the default GC settings (CMS) from solr.cmd. We also tried G1 GC to see if it helps, but we still see long GC pauses (60 to 200 seconds), and memory usage is higher with G1 GC.

What could be the reason for the long GC pauses, and how can we fix them? Is it insufficient memory, a problem with the GC settings, or something else? Any suggestions would be greatly appreciated.

In our analysis, we also found some inefficient queries (which use * many times in the query) in the Solr logs. Could this be the reason for the high memory usage?

Slow Query
--------------
INFO (qtp1239731077-498778) [c:documents s:shard1 r:core_node1 x:documents] o.a.s.c.S.Request [documents] webapp=/solr path=/select params={df=summary&distrib=false&fl=id&shards.purpose=4&start=0&fsv=true&sort=description+asc,id+desc&fq=&shard.url=s1.asite.com:8983/solr/documents|s1r1.asite.com:8983/solr/documents&rows=250&version=2&q=((id:(REV78364_24705418+REV78364_24471492+REV78364_24471429+REV78364_24470771+REV78364_24470271+))+OR+summary:((HPC*+AND+*+AND+*+AND+OH1150*+AND+*+AND+*+AND+U0*+AND+*+AND+*+AND+HGS*+AND+*+AND+*+AND+MDL*+AND+*+AND+*+AND+100067*+AND+*+AND+-*+AND+Reinforcement*+AND+*+AND+Mode*)+))++AND++(title:((*HPC\+\-\+OH1150\+\-\+U0\+\-\+HGS\+\-\+MDL\+\-\+100067\+-\+Reinforcement\+Mode*)+))+AND+project_id:(-2+78243+78365+78364)+AND+is_active:true+AND+((isLatest:(true)+AND+isFolderActive:true+AND+isXref:false+AND+-document_type_id:(3+7)+AND+((is_public:true+OR+distribution_list:4858120+OR+folderadmin_list:4858120+OR+author_user_id:4858120)+AND+((defaultAccess:(true)+OR+allowedUsers:(4858120)+OR+allowedRoles:(6342201+172408+6336860)+OR+combinationUsers:(4858120))+AND+-blockedUsers:(4858120))))+OR+(isLatestRevPrivate:(true)+AND+allowedUsersForPvtRev:(4858120)+AND+-folderadmin_list:(4858120)))&shards.tolerant=true&NOW=1516786982952&isShard=true&wt=javabin} hits=0 status=0 QTime=83309

Regards,
Maulin