On 6/8/2018 8:59 AM, Markus Jelsma wrote:
> 2018-06-08 14:02:47.382 ERROR (qtp1458849419-1263) [ ] o.a.s.s.HttpSolrCall
> null:org.apache.solr.common.SolrException: Error trying to proxy request for
> url: http://idx2:8983/solr/search/admin/ping
<snip>
> Caused by: org.eclipse.jetty.io.EofException

If you haven't tweaked the shard handler config to drastically reduce
the socket timeout, that is weird. The only thing that comes to mind is
extreme GC pauses that cause the socket timeout to be exceeded.

> We operate three distinct types of Solr collections, they only share the same
> Zookeeper quorum. The other two collections do not seem to have this problem,
> but i don't restart those as often as i restart this collection, as i am
> STILL trying to REPRODUCE the dreaded memory leak i reported having on 7.3
> about two weeks ago. Sorry, but it drives me nuts!

I've reviewed the list messages about the leak. As you might imagine,
my immediate thought is that the custom plugins you're running are
probably the cause, because we are not getting OOME reports like I would
expect if there were a leak in Solr itself. It would not be unheard of
for a custom plugin to experience no leaks with one Solr version but
leak when Solr is upgraded, requiring a change in the plugin to properly
close resources. I do not know if that's what's happening.

A leak could lead to GC pause problems, but it does seem really odd for
that to happen on a Solr node that's just been started. You could try
bumping the heap size by 25 to 50 percent and see if the behavior
changes at all. Honestly I don't expect it to change, and if it
doesn't, then I do not know what the next troubleshooting step should be.

I could review your solr.log, though I can't be sure I would see
something you didn't.

Thanks,
Shawn
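
P.S. One way to check the GC pause theory is to scan the GC log for long
stop-the-world pauses and compare their timestamps with the proxy errors.
Below is a rough Python sketch, not a tested tool. It assumes a Java 8
style GC log written with -XX:+PrintGCApplicationStoppedTime, where each
pause shows up as a "Total time for which application threads were
stopped" line. The function name find_long_pauses and the 5 second
default threshold are made up for illustration. Set the threshold to
whatever your socket timeout really is.

```python
import re

# Assumed line format (Java 8 with -XX:+PrintGCApplicationStoppedTime):
#   <timestamp>: <uptime>: Total time for which application threads
#   were stopped: 1.2345678 seconds, Stopping threads took: ...
# Some locales print a decimal comma, so both "." and "," are accepted.
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: "
    r"(\d+(?:[.,]\d+)?) seconds"
)


def find_long_pauses(lines, threshold_seconds=5.0):
    """Return (line_number, pause_seconds) for pauses >= threshold.

    `lines` is any iterable of strings, e.g. an open GC log file.
    The default threshold is an arbitrary placeholder.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        match = STOPPED_RE.search(line)
        if match is None:
            continue  # not a safepoint pause line
        pause = float(match.group(1).replace(",", "."))
        if pause >= threshold_seconds:
            hits.append((lineno, pause))
    return hits
```

To use it, open the GC log and pass the file object in, for example
find_long_pauses(open("solr_gc.log"), 5.0), where the file name is a
placeholder for wherever your GC log actually lives. If nothing comes
back near the time of the EofException, GC pauses are probably not the
explanation.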